Fwd: [LLVMdev] 3.4.1 Release Plans

Tue Apr 1 01:25:49 PDT 2014

CC'ing the mailing list.

---------- Forwarded message ----------
From: Jiangning Liu <liujiangning1 at gmail.com>
Date: 2014-04-01 15:40 GMT+08:00
Subject: Re: [LLVMdev] 3.4.1 Release Plans
To: Tom Stellard <tom at stellard.net>
Cc: llvmdev at cs.uiuc.edu, Ben Pope <benpope81 at gmail.com>,
Erik Verbruggen <erik.verbruggen at me.com>

Hi Tom,

We would like the following patches to go into release 3.4.1.

Clang:

196750 [AArch64]Add missing pair intrinsics such as: int32_t
vminv_s32(int32x2_t a)
196834 [AArch64] Remove q and non-q intrinsic definitions from the NEON
scalar reduce pairwise implementation, using an overloaded definition
instead.
196835 [AArch64] Refactor the NEON scalar reduce pairwise front-end codegen
to remove unnecessary patterns in tablegen.
196836 [AArch64] Refactor the NEON scalar reduce pairwise intrinsics so
that they use float/double rather than the vector equivalents when
appropriate.
196888 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
196927 [AArch64] Refactor the Neon vector/scalar floating-point convert
implementation.  Specifically, reuse the ARM intrinsics when possible.
196931 [AArch64] Refactor the Neon vector/scalar floating-point convert
intrinsics so that they use float/double rather than the vector equivalents
when appropriate.
196936 [AArch64] Refactor the redundant code in the
EmitAArch64ScalarBuiltinExpr() function.  No functional change intended.
196966 [AArch64] Overload NEON signed/unsigned integer convert to
floating-point LLVM AArch64 intrinsics.
196967 [AArch64] Overload NEON signed/unsigned floating-point convert to
fixed-point and fixed-point convert to floating-point LLVM AArch64
intrinsics.
196968 [AArch64] Refactor the NEON signed/unsigned floating-point convert
to fixed-point LLVM AArch64 intrinsics to use f32/f64, rather than their
vector equivalents.
196969 [AArch64] Refactor the NEON floating-point absolute difference LLVM
AArch64 intrinsic to use f32/f64 types, rather than their vector
equivalents.
197069 [AArch64] Refactor the NEON scalar floating-point reciprocal
estimate, floating-point reciprocal exponent, and floating-point reciprocal
square root estimate
197070 [AArch64] Refactor the NEON scalar floating-point reciprocal step
and floating-point reciprocal square root step LLVM AArch64 intrinsics to
197071 [AArch64] Add NEON scalar floating-point compare LLVM AArch64
intrinsics that use f32/f64 types, rather than their vector equivalents.
197091 [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across
vector AArch64 intrinsics to use f32 types, rather than their vector
equivalents.
197112 [AArch64] Fix Incorrect CHECK message [0-31]+ in test case.
197403 [AArch64] Fix v1fx patterns for Floating-point Multiply Extend and
Floating-point Compare to Zero.
197898 [AArch64] The compare to zero intrinsics should be implemented by
'icmp/fcmp' and 'sext' not 'zext'. Modify the implementation by replacing
zext with sext.
197994 [AArch64] Add some missing test cases for ACLE intrinsics of AArch64
NEON.
198195 [AArch64] For AArch64 Neon, simplify scalar dup by lane0 for fp.
198741 [AArch64] For AArch64, support builtin neon vector type with 'long'
as base element type.
199866 [AArch64 NEON] Fix a bug about vcles_f32 and vcled_f64.
200114 [AArch64] For AArch64 Neon, fix intrinsics implementation using
nested macros.
200470 ARM & AArch64: share the BI__builtin_neon enum defs.
200471 ARM & AArch64: fully share NEON implementation of permutation
intrinsics
200472 ARM & AArch64: extend shared NEON implementation to first block.
200524 ARM & AArch64: merge another NEON block completely.
200525 ARM & AArch64: more instructions into common block
200526 ARM & AArch64: move shared vld/vst intrinsics to common
implementation.
200527 ARM & AArch64: another block of miscellaneous NEON sharing.
200528 ARM & AArch64: unify the rest of the completely shared NEON
implementations
200707 [AArch64] AArch64: use new non-polymorphic crypto intrinsics This
was caused by r200708 which enabled the crypto feature for these cores.
200708 ARM: implement support for crypto intrinsics in arm_neon.h
200769 ARM & AArch64: combine implementation of vcaXYZ intrinsics
201112 [AArch64] Fixed vget/vset_lane_f16 implementation
201384 [AArch64] Enable AArch64 NEON by default.
202004 [AArch64] Change int64_t from 'long long int' to 'long int' for
AArch64 target.
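
On r197898 above (sext rather than zext for the compare-to-zero intrinsics):
NEON comparisons yield a lane with every bit set when the condition holds,
so that the result can be used directly as a bit mask. The sketch below
models a single 32-bit lane in plain C++ to show why the choice of extension
matters. It is only an illustration; the helper names are hypothetical and
are not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helpers modelling one 32-bit lane; not code from r197898.

// Sign-extending a 1-bit "true" replicates the bit across the whole lane.
constexpr std::uint32_t lane_mask_sext(bool cond) {
    return static_cast<std::uint32_t>(-static_cast<std::int32_t>(cond));
}

// Zero-extending a 1-bit "true" sets only the lowest bit of the lane.
constexpr std::uint32_t lane_mask_zext(bool cond) {
    return static_cast<std::uint32_t>(cond);
}

// Bitwise select, the typical consumer of a comparison mask:
// take bits of a where the mask is set, bits of b elsewhere.
constexpr std::uint32_t bit_select(std::uint32_t mask,
                                   std::uint32_t a, std::uint32_t b) {
    return (mask & a) | (~mask & b);
}
```

With the sign-extended mask, bit_select(lane_mask_sext(true), a, b) returns
a unchanged. With the zero-extended mask only the lowest bit of a survives,
which is not what code consuming a comparison result expects.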

LLVM:

196748 [AArch64]Pattern match failures for truncate store and extend load
196749 [AArch64]Add missing pair intrinsics such as: int32_t
vminv_s32(int32x2_t a)
196831 [AArch64] Remove q and non-q intrinsic definitions in the NEON
scalar reduce pairwise implementation, using an overloaded definition
instead.
196832 [AArch64] Refactor NEON scalar reduce pairwise front-end codegen to
remove unnecessary patterns in tablegen.
196833 [AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so
that they use float/double rather than the vector equivalents when
appropriate.
196887 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
196889 [AArch64 NEON] Replace fpimm with fpz32 for floating compare with
zero.  This is a small change to be strict. Just want get pattern safer.
196926 [AArch64] Refactor the Neon vector/scalar floating-point convert
implementation.  Specifically, reuse the ARM intrinsics when possible.
196930 [AArch64] Refactor the Neon vector/scalar floating-point convert
intrinsics so that they use float/double rather than the vector equivalents
when appropriate.
196962 [AArch64] Overload NEON signed/unsigned integer convert to
floating-point LLVM AArch64 intrinsics.
196963 [AArch64] Overload NEON signed/unsigned floating-point convert to
fixed-point and fixed-point convert to floating-point LLVM AArch64
intrinsics.
196964 [AArch64] Refactor the NEON signed/unsigned floating-point convert
to fixed-point LLVM AArch64 intrinsics to use f32/f64, rather than their
vector equivalents.
196965 [AArch64] Refactor the NEON floating-point absolute difference LLVM
AArch64 intrinsic to use f32/f64 types, rather than their vector
equivalents.
196998 [AArch64 NEON] Get instruction BSL matched to VSELECT.
197066 [AArch64] Refactor the NEON scalar floating-point reciprocal
estimate, floating-point reciprocal exponent, and floating-point
reciprocal square root estimate
197067 [AArch64] Refactor the NEON scalar floating-point reciprocal step
and floating-point reciprocal square root step LLVM AArch64 intrinsics to
197068 [AArch64] Add NEON scalar floating-point compare LLVM AArch64
intrinsics that use f32/f64 types, rather than their vector equivalents.
197090 [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across
vector AArch64 intrinsics to use f32 types, rather than their vector
equivalents.
197113 Fix Incorrect CHECK message [0-31]+ in test case.  In regular
expression, [0-31]+ equals to [0-3]+, not the number from
197135 [AArch64]Fix the problem that AArch64 backend fails to select
scalar_to_vector of vector types having more than one element.
197159 [AArch64] Removed unnecessary copy patterns with v1fx types.
197250 [AArch64] Simplify the Neon Scalar3Same patterns for floating-point
reciprocal step, floating-point reciprocal square root step, floating-point
absolute
197361 [AArch64]Fix the pattern match failure for v1i8/v1i16/v1i32 types.
 Currently we have such types as legal vector types. The DAG combiner may
generate some DAG nodes having such types but we don't have patterns to
match them.
197402 [AArch64] Fix v1fx patterns for Floating-point Multiply Extend and
Floating-point Compare to Zero.
197551 [AArch64 NEON]Implement loading vector constant from constant pool.
197897 [AArch64]The compare to zero intrinsics should be implemented by
'icmp/fcmp' and 'sext' not 'zext'. Modify the test cases.
197928 [AArch64 NEON] Fixed fused multiply negate add/sub patterns
197929 [AArch64] Check fmul node single use in fused multiply patterns
197966 [AArch64 NEON] Fix a pattern match failure with NEON_VDUP.
197967 [AArch64 NEON] Fix a bug when lowering BUILD_VECTOR.
197969 [AArch64]Add patterns to match normal shift nodes: shl, sra and srl.
197993 Add missing pattern matches to support ACLE intrinsics of AArch64
NEON.
198001 [AArch64]Fix a problem that the register order of fmls/fmla by
element is incorrect.
198084 Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR
of ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR.
198188 [AArch64]Fix the problem that can't select mul of v1i64/v2i64 types.
198190 Fix a bug in DAGcombiner about zero-extend after setcc.
198192 [AArch64]Can't select shift left 0 of type v1i64
198193 [AArch64]Add code to spill/fill Q register tuples such as
QPair/QTriple/QQuad.
198194 For AArch64 Neon, simplify scalar dup by lane0 for fp.
198437 [AArch64][NEON] Added SXTL and SXTL2 instruction aliases
198675 [AArch64 NEON] Fixed incorrect immediate used in BIC instruction.
198682 [AArch64]Add support to copy D tuples such as DPair/DTriple/DQuad
and Q tuples such as QPair/QTriple/QQuad. There is no test case for D tuple
as the original test cases are too large. As the copy of the D tuple is
similar to the Q tuple, the correctness can be guaranteed.
198684 [AArch64]Add support to spill/fill D tuples such as
DPair/DTriple/DQuad. There is no test cases for D tuple as the original
test cases are too large. As the spill/fill of the D tuple is similar to
the Q tuple, the correctness can be guaranteed.
198730 Fix a bug about generating undef operand when optimising shuffle
vector and insert element in instruction combine.
198743 [AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE
when lower build_vector if result value type mismatch with operand
198791 [AArch64][NEON] Added UXTL and UXTL2 instruction aliases
198937 Make sure -use-init-array has intended effect on all AArch64 ELF
targets, not just linux.
198941 Silence unused variable warning for non-asserting builds that was
introduced in r198937.
199069 [AArch64 NEON] Add more scenarios to use perm instructions when
lowering shuffle_vector
199070 [AArch64 NEON] Add missing patterns for bitcast from or to v1f64
199242 [AArch64] Added vselect patterns with float and double types
199296 For AArch64, lowering sext_inreg and generate optimized code by
using SXTL.
199369 For ARM, fix assertion failures for some ld/st 3/4 instruction with
writeback.
199461 [AArch64]Fix the problem can't select concat_vectors of two v1i32
types.  Also fix the problem can't select scalar_to_vector from f32 to
v2f32/v4f32.
199462 [AArch64 NEON] Custom lower conversion between vector integer and
vector floating point if element bit-width doesn't match.
199463 [AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16.
 Also add copy support for FPR16.
199485 [AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon
doesn't support these operations.
199621 [AArch64 NEON] Accept both #0.0 and #0 for comparing with floating
point zero in asm parser.
199628 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
199631 Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when
generating VEXT."
199791 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering
BUILD_VECTOR or SHUFFLE_VECTOR.
199858 fix some spell mistakes around 'ConcatVector' and 'ShuffleVector' in
AArch64 backend.
199861 [AArch64]Add CHECK for two test cases testing scalar_to_vector
committed in r199461.
199978 [AArch64 NEON] Fix a bug in implementing register copy between FPR16.
200109 [AArch64 NEON] Fix pattern match failed on FP_ROUND from v1f128 to
v1f64.
200110 [AArch64 NEON] Add test case for vector FP_ROUND.
200111 [AArch64 NEON] Add patterns for concat_vector on v2i32.
200113 Implement pattern match from v1xx to v1xx for AArch64 Neon.
200119 Improve pattern match from v1i8 to v1i32 for AArch64 Neon.
200179 Revert r199791.
200180 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering
BUILD_VECTOR or SHUFFLE_VECTOR.
200365 [AArch64 NEON] Lower SELECT_CC with vector operand.
200491 [AArch64] Custom lower concat_vector patterns with v4i16, v4i32,
v8i8, v8i16, v16i8 types.
200706 AArch64 & ARM: refactor crypto intrinsics to take scalars
200768 ARM & AArch64: merge NEON absolute compare intrinsics
201061 [AArch64]Implement the copy of two FPR8 registers by using FMOVss of
two FPR32 registers in copyPhysReg.
201091 [AArch64] Handle aliases of conditional branches without b.pred form.
201287 [AArch64]Add support for spilling FPR8/FPR16.
201298 [AArch64]Fix the problems that can't select mul/add/sub of
v1i8/v1i16/v1i32 types.  As this problems are similar to shl/sra/srl, also
add patterns for shift nodes.
201381 [AArch64]Fix the assertion failure caused by "v1i1 SETCC" DAG node.
 As v1i1 is illegal, the type legalizer tries to scalarize such node. But
if the type operands of SETCC is legal, the scalarization algorithm will
cause an assertion failure.
201385 Enable AArch64 NEON by default.
201395 [AArch64 NEON] Fix a bug to avoid using floating type as condition
type in lowering SELECT_CC.
201541 Fix a typo about lowering AArch64 va_copy.
201793 [AArch64] Add support for TargetTransformInfo Analysis.
201841 [AArch64] Add register constraints to avoid generating STLXR and
STXR with unpredictable behavior.
202775 [AArch64]Fix improper diagnostics about offset range of load/store
instructions.
204304 [ARM]Fix an assertion failure in A15SDOptimizer about DPair reg
class by treating DPair as QPR.
204424 [AArch64] Remove .data_region directive from AArch64.
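
On r197113 above: in a regular expression, [0-31] is a character class
containing the range 0-3 plus the literal 1, so [0-31]+ accepts the same
strings as [0-3]+ and rejects a register number such as 29. The sketch below
checks this with std::regex. FileCheck uses its own regex engine, but
character classes behave the same way there. The helper name is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches the whole of `text`.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}
```

For example, matches_whole("[0-31]+", "31") is true, but
matches_whole("[0-31]+", "29") is false because 9 is not in the class.
One possible pattern that does accept exactly the decimal numbers 0 to 31
is "([12]?[0-9]|3[01])"; this is an illustration only, not necessarily the
form used in the actual fix.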

I know the patch list is rather long, but we have the following reasons:
1) When branch 3.4 was created last year, we had not had time to complete
all of the AArch64 NEON work, so branch 3.4 captures the AArch64 NEON
implementation at a middle stage. The patches I'm requesting are intended
to deliver a complete AArch64 NEON feature.
2) Several of them are critical bug fixes for compiler crashes. Our end
users really want these fixed in a new release and can't wait until
release 3.5.
3) Many of the patches are interleaved and depend on one another, so
cherry-picking only some of them could easily introduce bugs.

The patches are listed in chronological order, so they should be easy to
apply to branch 3.4. Only the following fail to apply cleanly, and each is
easy to fix:
1) 200708: Manually add the line "Features["crypto"] = true;" after
line 5924 of file lib/Basic/Targets.cpp.
2) 201384: Manually add the two lines below after line 7135 of
file lib/Driver/Tools.cpp:
  else
    Features.push_back("+neon");
3) 202004: Insert at line 3353 of file test/Preprocessor/init.c. Remove
the part around AARCH64-NETBSD, and remove the line below as well:
// AARCH64:#define __ALIGNOF_MAX_ALIGN_T__ 16

Please simply remove the file "CodeGen/aarch64-neon-crypto.c", because it
has been renamed to CodeGen/neon-crypto.c. I have also attached two
monolithic patches for your reference.

To minimize your effort, I have already done some initial testing.
The tests covered the following, and all of them pass:
1) LLVM regression tests ("make check-all")
2) ARM internal emperor random test
3) Spec2000 test

Finally, these patches would bring the following to release 3.4.1:
1) Complete AArch64 NEON feature:
* Support all intrinsics required by ACLE 2.0, and enable AArch64 NEON by
default.
* Fix all pattern match issues in the AArch64 NEON back-end.
2) Bug fixes:
* Change the mapping of the 64-bit integer type int64_t from "long long" to
"long"; this potentially affects binary compatibility.
* Fix the va_copy run-time behavior failure for AArch64.
* Fix a silent codegen fault for atomic operations (e.g. __sync_...
intrinsics).
* Fix an assertion failure in A15SDOptimizer about the DPair reg class by
treating DPair as QPR.
* Fix the ARM back-end ld/st failure for v1i64 vector lists in writeback
mode.
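
On the int64_t change above: "long" and "long long" are distinct types in
C++ even where both are 64 bits wide, so the choice is visible in overload
resolution, and under the Itanium C++ ABI in mangled names as well ("l" for
long, "x" for long long). That is the source of the potential binary
compatibility impact. A minimal sketch follows; the function name is
hypothetical.

```cpp
#include <type_traits>

// Hypothetical overload set that reports which builtin type was passed.
constexpr int type_tag(long) { return 1; }
constexpr int type_tag(long long) { return 2; }

// The two types never compare equal, whatever their width.
static_assert(!std::is_same<long, long long>::value,
              "long and long long are distinct types");
```

Here type_tag(0L) selects the first overload and type_tag(0LL) the second.
A function declared as taking int64_t therefore has a different signature
depending on which of the two types int64_t is defined as.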

Please let me know if you need more information. I appreciate your kind
help in advance!

Thanks,
-Jiangning

2014-03-27 0:10 GMT+08:00 Tom Stellard <tom at stellard.net>:

> Hi,
>
> We are now about halfway between the 3.4 and 3.5 releases, and I would
> like to start preparing for a 3.4.1 release.  Here is my proposed release
> schedule:
>
> Mar 26 - April 9: Identify and backport additional bug fixes to the 3.4
> branch.
> April 9 - April 18: Testing Phase
> April 18: 3.4.1 Release
>
> How you can help:
>
> - If you have any bug fixes you think should be included to 3.4.1, send
>   me an email with the SVN revision in trunk and also cc the code owner
>   and llvm-commits (or cfe-commits if it is a clang patch).
>
> - Start integrating the 3.4 branch into your project or OS distribution
>   and check for any issues.
>
> - Volunteer as a tester for the testing phase.
>
> Thank you,
>
> Tom
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

-- 
Thanks,
-Jiangning

-------------- next part --------------
A non-text attachment was scrubbed...
Name: release_3_4_1_llvm.patch.tgz
Type: application/x-gzip
Size: 94936 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140401/08b7ff88/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: release_3_4_1_clang.patch.tgz
Type: application/x-gzip
Size: 53034 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140401/08b7ff88/attachment-0001.bin>