RFC: Enable vectorization of call instructions in the loop vectorizer

James Molloy james at jamesmolloy.co.uk
Wed Jan 15 11:22:07 PST 2014


Hi Arnold,

After a Christmas hiatus, please find attached the latest version of this
patch. I also attach a larger patch which contains the entire
implementation of function call vectorization (scalarization soon to
follow).

I followed all your advice except about creating a library function to
transform the call. The reason for this is that the logic is absolutely
trivial - most of the difficulty is in converting the arguments and this is
a per-vectorizer thing - a utility function wouldn't be able to help.

Please review!

Cheers,

James


On 25 December 2013 17:03, James Molloy <james at jamesmolloy.co.uk> wrote:

> Hi Arnold,
>
> Thanks for the reply over the Christmas season :)
>
> Yep, your proposed scheme would work for me and I will implement that.
> Given that I'm ditching uniforms (at least for now), I don't think a
> library function is really required as the generation of the vectorized
> function call is trivial.
>
> Merry Christmas,
>
> James
>
>
> On 24 December 2013 22:43, Arnold <aschwaighofer at apple.com> wrote:
>
>>
>>
>> On Dec 22, 2013, at 4:04 AM, James Molloy <james at jamesmolloy.co.uk>
>> wrote:
>>
>> Hi Arnold,
>>
>> No worries, it's Christmas season so I expect long delays between replies
>> (hence the day delay with this reply!)
>>
>> > I don't think that TargetLibraryInfo should do the transformation from
>> scalar to vectorized call instruction itself.
>> > I think, TargetLibraryInfo should just provide a mapping from function
>> name to function name (including the vector factor, etc). There should be
>> a function that lives in lib/transforms that does the transformation. I
>> don’t think IR type transformations should go into lib/Target.
>>
>> > Do we need to reuse the LibFunc enum to identify these functions?
>>
>> OK. I tried to go via LibFuncs purely so that TLI exposed a more coherent
>> interface (call getLibFunc on a function name, get back an identifier for
>> that function which you can then use to query the rest of TLI). I'm very
>> happy with going direct from fn name to fn name, and leaving the creation
>> of the CallInst to something else (probably LoopVectorizer itself).
>>
>>
>> We will want to vectorize function calls in both vectorizers so a library
>> function would be best.
>>
>>
>> Regarding uniforms - I think the best way to handle these is to ignore
>> them for the moment. At least in OpenCL, any function that can take a
>> scalar uniform can also take a vector non-uniform. So mapping from all
>> arguments scalar to all arguments vector is always valid, and will simplify
>> the logic a lot. Then a later pass could, if it wanted, reidentify uniforms
>> and change which function is called.
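
A minimal sketch of the point above, written outside LLVM: if only the all-vector variant of a builtin is ever called, a uniform (loop-invariant) scalar argument can be broadcast into every lane before the call. The names Vec, splat, fmaxVec and vectorizedBody are made up for illustration; they are not part of the patch or of any OpenCL library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical fixed-width vector type standing in for an IR vector value.
template <std::size_t VF> using Vec = std::array<double, VF>;

// Broadcast a uniform scalar into all VF lanes.
template <std::size_t VF> Vec<VF> splat(double X) {
  Vec<VF> V;
  V.fill(X);
  return V;
}

// Hypothetical all-vector variant of a two-argument builtin (lane-wise fmax).
template <std::size_t VF>
Vec<VF> fmaxVec(const Vec<VF> &A, const Vec<VF> &B) {
  Vec<VF> R;
  for (std::size_t Lane = 0; Lane != VF; ++Lane)
    R[Lane] = std::fmax(A[Lane], B[Lane]);
  return R;
}

// Vectorized form of the scalar body "out[i] = fmax(in[i], T)", where T is
// uniform across iterations. T is widened so that only the all-vector
// variant is needed.
template <std::size_t VF>
Vec<VF> vectorizedBody(const Vec<VF> &In, double T) {
  static_assert(VF > 1, "a vector variant needs at least two lanes");
  return fmaxVec<VF>(In, splat<VF>(T));
}
```

A later pass that recognises the splat could swap in a scalar-uniform variant where one exists; nothing in the sketch depends on that.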
>>
>>
>> Okay.
>>
>>
>> > I don't understand why the function calls need to be virtual. The
>> mapping function name to family of functions should capture everything we
>> need?
>> >  (Fun name, VF) -> {#num_entries, vector fun, {vector fun with uniform
>> params, 1}, …}
>>
>>
>> As Nadav mentioned, the entire suite of OpenCL built in functions is
>> massive - hundreds of functions. Pre-seeding a map with all of these
>> functions is something I'd quite like to avoid. There are two options -
>> parameterise TLI with some map, like you said, or make TLI's functionality
>> overridable. I prefer the latter, because it allows an implementation to do
>> a lot of work lazily which is important for compile time performance (see
>> the performance improvements I had to implement in the LLVM bitcode linker
>> as an example of how much lazy link performance matters - small kernel, big
>> library).
>>
>> Making TLI's functionality overridable is, however, a soft requirement
>> and I'm keen to satisfy reviewers. So if this use case doesn't wash with
>> you, let me know and I'll make it take a map and sort out laziness my own
>> way downstream.
>>
>>
>>
>> I don’t think we need to use virtual function calls to lazily setup the
>> table.
>>
>> One of the reasons why I don’t like subclassing as an extension mechanism
>> is that it does not compose. Say we sometimes had both VecLibA and VecLibB
>> (and other times just one of them): that is hard to express with subclassing.
>>
>> How about a scheme similar to this:
>>
>> VecDesc veclibtbl[] = {{"cos", "cos2", 2}, {"cos", "cos4", 4},
>>                        {"sin", "sin4", 4}, ...};
>> SmallVector<VecDesc *, 4> VectorTables;
>> mutable StringMap<const char *> VecToScalar;
>>
>> During initialization we push the VecDesc tables onto the vector.
>>
>> initialize(*tli) {
>>   if (data layout.contains("OpenCL"))
>>     tli->addVectorTable(openclvectbl);
>>   if (data layout.contains("veclib"))
>>     tli->addVectorTable(veclibtbl);
>>   ...
>> }
>>
>> When we query TLI for a vector variant we binary search each VecDesc table
>> for the function (similar to how we do for LibFuncs entries now).
>>
>> The VecToScalar map can be lazily initialized the first time it is
>> queried.
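
A minimal, self-contained sketch of the scheme described above, assuming plain standard-library containers in place of SmallVector and StringMap. VectorFnRegistry and its layout are hypothetical; only the names VecDesc, addVectorTable, getVectorizedFunction and getScalarizedFunction echo this thread, and none of this is the TargetLibraryInfo implementation. Each table is sorted once when added, forward queries binary search every table, and the reverse map is built on the first vector -> scalar query.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for the VecDesc record discussed in this thread.
struct VecDesc {
  std::string ScalarFnName;
  std::string VectorFnName;
  unsigned VectorizationFactor;
};

// Hypothetical registry; not LLVM's TargetLibraryInfo.
class VectorFnRegistry {
  // Every table is kept sorted by (ScalarFnName, VectorizationFactor).
  std::vector<std::vector<VecDesc>> Tables;
  // Vector name -> (scalar name, VF); filled lazily on the first query.
  mutable std::unordered_map<std::string, std::pair<std::string, unsigned>>
      VecToScalar;
  mutable bool ReverseMapBuilt = false;

  static bool lessByScalar(const VecDesc &L, const VecDesc &R) {
    return std::tie(L.ScalarFnName, L.VectorizationFactor) <
           std::tie(R.ScalarFnName, R.VectorizationFactor);
  }

public:
  void addVectorTable(std::vector<VecDesc> Table) {
    for (const VecDesc &D : Table) {
      assert(D.VectorizationFactor > 1 && "vector variant needs VF > 1");
    }
    std::sort(Table.begin(), Table.end(), lessByScalar);
    Tables.push_back(std::move(Table));
    // Any previously built reverse map is now stale.
    VecToScalar.clear();
    ReverseMapBuilt = false;
  }

  // Returns the vector variant of ScalarFn at factor VF, or "" if none.
  std::string getVectorizedFunction(const std::string &ScalarFn,
                                    unsigned VF) const {
    const VecDesc Key{ScalarFn, "", VF};
    for (const auto &Table : Tables) {
      auto It =
          std::lower_bound(Table.begin(), Table.end(), Key, lessByScalar);
      if (It != Table.end() && It->ScalarFnName == ScalarFn &&
          It->VectorizationFactor == VF)
        return It->VectorFnName;
    }
    return "";
  }

  // Returns the scalar equivalent of VectorFn and sets VF, or "" if none.
  std::string getScalarizedFunction(const std::string &VectorFn,
                                    unsigned &VF) const {
    if (!ReverseMapBuilt) {
      for (const auto &Table : Tables)
        for (const VecDesc &D : Table)
          VecToScalar.emplace(
              D.VectorFnName,
              std::make_pair(D.ScalarFnName, D.VectorizationFactor));
      ReverseMapBuilt = true;
    }
    auto It = VecToScalar.find(VectorFn);
    if (It == VecToScalar.end())
      return "";
    VF = It->second.second;
    return It->second.first;
  }
};
```

Because tables are only appended, two libraries can be registered side by side, which is the composition that subclassing makes awkward. Storing the reverse entries by value keeps the map valid even when the outer vector of tables reallocates.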
>>
>>
>> Would this work for you?
>>
>>
>> > Can you infer the vectorized function name from the scalar function
>> name in a predictable way (but why use an enum then)? I don’t understand
>> the use case that requires virtual functions.
>>
>> Alas no, CL (my main use case) functions are overloaded and thus are
>> name-mangled. I don't want to put  logic in *anywhere*.
>>
>>
>> On 20 December 2013 18:45, Arnold <aschwaighofer at apple.com> wrote:
>>
>>> Hi James,
>>>
>>> Again thank you for moving this forward! Sorry for not chiming in
>>> earlier, I am in the midst of a move.
>>>
>>> I don't think that TargetLibraryInfo should do the transformation from
>>> scalar to vectorized call instruction itself.
>>> I think, TargetLibraryInfo should just provide a mapping from function
>>> name to function name (including the vector factor, etc). There should be a
>>> function that lives in lib/transforms that does the transformation. I don’t
>>> think IR type transformations should go into lib/Target.
>>>
>>> I don't understand why the function calls need to be virtual. The
>>> mapping function name to family of functions should capture everything we
>>> need?
>>>  (Fun name, VF) -> {#num_entries, vector fun, {vector fun with uniform
>>> params, 1}, …}
>>> Can you infer the vectorized function name from the scalar function name
>>> in a predictable way (but why use an enum then)? I don’t understand the use
>>> case that requires virtual functions.
>>>
>>> We can then initialize this mapping in a static function like we do now
>>> or at later point have this mapping additionally initialized by Module
>>> metadata.
>>>
>>> Do we need to reuse the LibFunc enum to identify these functions? Do you
>>> want to add (before-patch) TargetLibraryInfo::LibFunc style optimizations
>>> to the optimizer? (This would go beyond just enabling vectorization of call
>>> instruction).
>>> It seems to me you are solving two separate things with this patch: one
>>> is vectorization of call instructions and the second one is adding target
>>> (really "target language") defined “LibFunc” functions?
>>>
>>> For the former, I think a map of function names should be enough if we
>>> just want to know scalar and vector variants of functions? Otherwise, we
>>> would have to go through a map of function name string to LibFunc to
>>> vectorized function name. I think we can omit the intermediate step. If we
>>> had a string mapping, this would also readily support Module metadata.
>>>
>>> The latter I think is out of scope for this patch and something that
>>> needs to be discussed separately.
>>>
>>> Thanks,
>>> Arnold
>>>
>>>
>>> > On Dec 19, 2013, at 5:43 AM, James Molloy <James.Molloy at arm.com>
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Attached is the first patch in a sequence to implement this behaviour
>>> in the way described in this thread.
>>> >
>>> > This patch simply:
>>> > * Changes most methods on TargetLibraryInfo to be virtual, to allow
>>> clients to override them.
>>> > * Adds three new functions for querying the vectorizability (and
>>> scalarizability) of library functions. The default implementation returns
>>> failure for all of these queries.
>>> >
>>> > Please review!
>>> >
>>> > James
>>> >
>>> >> -----Original Message-----
>>> >> From: Hal Finkel [mailto:hfinkel at anl.gov]
>>> >> Sent: 16 December 2013 21:10
>>> >> To: Arnold Schwaighofer
>>> >> Cc: llvm-commits; James Molloy
>>> >> Subject: Re: RFC: Enable vectorization of call instructions in the
>>> loop
>>> >> vectorizer
>>> >>
>>> >> ----- Original Message -----
>>> >>> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
>>> >>> To: "Hal Finkel" <hfinkel at anl.gov>
>>> >>> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "James Molloy"
>>> >> <James.Molloy at arm.com>
>>> >>> Sent: Monday, December 16, 2013 3:08:13 PM
>>> >>> Subject: Re: RFC: Enable vectorization of call instructions in the
>>> loop
>>> >> vectorizer
>>> >>>
>>> >>>
>>> >>>> On Dec 16, 2013, at 2:59 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> >>>>
>>> >>>> ----- Original Message -----
>>> >>>>> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
>>> >>>>> To: "James Molloy" <James.Molloy at arm.com>
>>> >>>>> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
>>> >>>>> Sent: Monday, December 16, 2013 12:03:02 PM
>>> >>>>> Subject: Re: RFC: Enable vectorization of call instructions in the
>>> >>>>> loop     vectorizer
>>> >>>>>
>>> >>>>>
>>> >>>>> On Dec 16, 2013, at 11:08 AM, James Molloy <James.Molloy at arm.com>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>>> Hi Renato, Nadav,
>>> >>>>>>
>>> >>>>>> Attached is a proof of concept[1] patch for adding the ability to
>>> >>>>>> vectorize calls. The intended use case for this is in domain
>>> >>>>>> specific languages such as OpenCL where tuned implementation of
>>> >>>>>> functions for differing vector widths exist and can be guaranteed
>>> >>>>>> to be semantically the same as the scalar version.
>>> >>>>>>
>>> >>>>>> I’ve considered two approaches to this. The first was to create a
>>> >>>>>> set of hooks that allow the LoopVectorizer to interrogate its
>>> >>>>>> client as to whether calls are vectorizable and if so, how.
>>> >>>>>> Renato argued that this was suboptimal as it required a client to
>>> >>>>>> invoke the LoopVectorizer manually and couldn’t be tested through
>>> >>>>>> opt. I agree.
>>> >>>>>
>>> >>>>> I don’t understand this argument.
>>> >>>>>
>>> >>>>> We could extend target library info with additional api calls to
>>> >>>>> query whether a function is vectorizable at a vector factor.
>>> >>>>> This can be tested by providing the target triple string (e.g.
>>> >>>>> "target triple = x86_64-gnu-linux-with_opencl_vector_lib") in the
>>> >>>>> .ll file that informs the optimizer that a set of vector library
>>> >>>>> calls is available.
>>> >>>>>
>>> >>>>> The patch seems to restrict legal vector widths dependent on
>>> >>>>> available vectorizable function calls. I don’t think this should
>>> >>>>> work like this.
>>> >>>>> I believe there should be an API on TargetTransformInfo for library
>>> >>>>> function calls. The vectorizer chooses the cheaper of an intrinsic
>>> >>>>> call and a library function call.
>>> >>>>> The overall cost model determines which VF will be chosen.
>>> >>>>
>>> >>>> We don't have a good model currently for non-intrinsic function
>>> >>>> calls. Once we do, we'll want to know how expensive the vectorized
>>> >>>> versions are compared to the scalar version. Short of that, I
>>> >>>> think that a reasonable approximation is that any function calls
>>> >>>> will be the most expensive things in a loop, and the ability to
>>> >>>> vectorize them will be the most important factor in determining
>>> >>>> the vectorization factor.
>>> >>>
>>> >>> Yes and we can easily model this in the cost model by asking what is
>>> >>> the cost of a (library) function call (vectorized or not) and have
>>> >>> this return a reasonably high value.
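
A rough sketch of how that could look, under loudly stated assumptions: every library call, scalar or vector, is charged the same flat CallCost (a made-up number), a call with no vector variant at a given VF is charged as VF scalar calls, and the extract/insert overhead and all other instructions in the loop are ignored. callCostForVF and selectVF are hypothetical helpers, not TargetTransformInfo API.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical flat, deliberately high cost for any library call.
constexpr unsigned CallCost = 10;

// Cost of one call instruction when the loop is vectorized by VF.
// VectorVariants holds the factors for which a vector function exists.
unsigned callCostForVF(unsigned VF, const std::set<unsigned> &VectorVariants) {
  assert(VF >= 1 && "vectorization factor must be positive");
  if (VF == 1 || VectorVariants.count(VF))
    return CallCost;    // A single scalar call or a single vector call.
  return CallCost * VF; // No variant at this VF: a chain of VF scalar calls.
}

// Pick the candidate with the lowest cost per lane. The scalar loop (VF = 1)
// is the baseline, and ties keep the earlier choice.
unsigned selectVF(const std::vector<unsigned> &Candidates,
                  const std::set<unsigned> &VectorVariants) {
  unsigned Best = 1;
  double BestPerLane = CallCost;
  for (unsigned VF : Candidates) {
    double PerLane =
        static_cast<double>(callCostForVF(VF, VectorVariants)) / VF;
    if (PerLane < BestPerLane) {
      Best = VF;
      BestPerLane = PerLane;
    }
  }
  return Best;
}
```

With a variant available only at VF = 4, the per-lane cost is lowest there, so the call steers the choice of factor without restricting which factors are legal.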
>>> >>
>>> >> Sounds good to me.
>>> >>
>>> >> -Hal
>>> >>
>>> >>>
>>> >>>
>>> >>>>
>>> >>>> -Hal
>>> >>>>
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Arnold
>>> >>>>>
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> llvm-commits mailing list
>>> >>>>> llvm-commits at cs.uiuc.edu
>>> >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> >>>>
>>> >>>> --
>>> >>>> Hal Finkel
>>> >>>> Assistant Computational Scientist
>>> >>>> Leadership Computing Facility
>>> >>>> Argonne National Laboratory
>>> >>
>>> >
>>> >
>>> > <vectorizer-tli.diff>
>>>
>>>
>>
>>
>
-------------- next part --------------
Index: include/llvm/Target/TargetLibraryInfo.h
===================================================================
--- include/llvm/Target/TargetLibraryInfo.h	(revision 197586)
+++ include/llvm/Target/TargetLibraryInfo.h	(working copy)
@@ -11,6 +11,7 @@
 #define LLVM_TARGET_TARGETLIBRARYINFO_H
 
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/Pass.h"
 
 namespace llvm {
@@ -667,10 +668,26 @@
 /// library functions are available for the current target, and allows a
 /// frontend to disable optimizations through -fno-builtin etc.
 class TargetLibraryInfo : public ImmutablePass {
+public:
+  /// VecDesc - Describes a possible vectorization of a function.
+  /// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
+  /// by a factor 'VectorizationFactor'.
+  struct VecDesc {
+    const char *ScalarFnName;
+    const char *VectorFnName;
+    unsigned VectorizationFactor;
+  };
+
+private:
   virtual void anchor();
   unsigned char AvailableArray[(LibFunc::NumLibFuncs+3)/4];
   llvm::DenseMap<unsigned, std::string> CustomNames;
   static const char* StandardNames[LibFunc::NumLibFuncs];
+  /// Vectorization descriptors - sorted by ScalarFnName.
+  std::vector<VecDesc> VectorDescs;
+  /// Scalarization descriptors - same content as VectorDescs but sorted based
+  /// on VectorFnName rather than ScalarFnName.
+  std::vector<VecDesc> ScalarDescs;
 
   enum AvailabilityState {
     StandardName = 3, // (memset to all ones)
@@ -766,6 +783,38 @@
   /// disableAllFunctions - This disables all builtins, which is used for
   /// options like -fno-builtin.
   void disableAllFunctions();
+
+  /// addVectorizableFunctions - Add a set of scalar -> vector mappings,
+  /// queryable via getVectorizedFunction and getScalarizedFunction.
+  void addVectorizableFunctions(ArrayRef<VecDesc> Fns);
+
+  /// isFunctionVectorizable - Return true if the function F has a
+  /// vector equivalent with vectorization factor VF.
+  bool isFunctionVectorizable(StringRef F, unsigned VF) const {
+    return !getVectorizedFunction(F, VF).empty();
+  }
+
+  /// isFunctionVectorizable - Return true if the function F has a
+  /// vector equivalent with any vectorization factor.
+  bool isFunctionVectorizable(StringRef F) const;
+
+  /// getVectorizedFunction - Return the name of the equivalent of 
+  /// F, vectorized with factor VF. If no such mapping exists,
+  /// return the empty string.
+  StringRef getVectorizedFunction(StringRef F, unsigned VF) const;
+
+  /// isFunctionScalarizable - Return true if the function F has a
+  /// scalar equivalent, and set VF to be the vectorization factor.
+  bool isFunctionScalarizable(StringRef F, unsigned &VF) const {
+    return !getScalarizedFunction(F, VF).empty();
+  }
+
+  /// getScalarizedFunction - Return the name of the equivalent of 
+  /// F, scalarized. If no such mapping exists, return the empty string.
+  ///
+  /// Set VF to the vectorization factor.
+  StringRef getScalarizedFunction(StringRef F, unsigned &VF) const;
+
 };
 
 } // end namespace llvm
Index: unittests/Transforms/Makefile
===================================================================
--- unittests/Transforms/Makefile	(revision 197586)
+++ unittests/Transforms/Makefile	(working copy)
@@ -9,7 +9,7 @@
 
 LEVEL = ../..
 
-PARALLEL_DIRS = DebugIR Utils
+PARALLEL_DIRS = DebugIR Utils LoopVectorize
 
 include $(LEVEL)/Makefile.common
 
Index: unittests/Transforms/CMakeLists.txt
===================================================================
--- unittests/Transforms/CMakeLists.txt	(revision 197586)
+++ unittests/Transforms/CMakeLists.txt	(working copy)
@@ -1,2 +1,3 @@
 add_subdirectory(DebugIR)
 add_subdirectory(Utils)
+add_subdirectory(LoopVectorize)
Index: unittests/Transforms/LoopVectorize/LoopVectorize.cpp
===================================================================
--- unittests/Transforms/LoopVectorize/LoopVectorize.cpp	(revision 0)
+++ unittests/Transforms/LoopVectorize/LoopVectorize.cpp	(working copy)
@@ -0,0 +1,171 @@
+//===- LoopVectorize.cpp - Unit tests for vectorizing libcalls ------------===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/TargetTransformInfo.h"
+#include "llvm/Assembly/Parser.h"
+#include "llvm/IR/DataLayout.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalValue.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/PassManager.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Target/TargetLibraryInfo.h"
+#include "llvm/Transforms/Vectorize.h"
+#include "gtest/gtest.h"
+using namespace llvm;
+
+namespace llvm {
+void initializeDummyTTIPass(PassRegistry &);
+}
+
+/// A TargetTransformInfo pass that merely states that the target
+/// has vector registers. This is used to force the LoopVectorizer
+/// to vectorize even without a "real" codegen target linked in.
+class DummyTTI : public ImmutablePass, public TargetTransformInfo {
+public:
+  DummyTTI() : ImmutablePass(ID) {
+    initializeDummyTTIPass(*PassRegistry::getPassRegistry());
+  }
+
+  virtual void initializePass() {
+    pushTTIStack(this);
+  }
+
+  virtual void finalizePass() {
+    popTTIStack();
+  }
+
+  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
+    TargetTransformInfo::getAnalysisUsage(AU);
+  }
+
+  static char ID;
+
+  /// Provide necessary pointer adjustments for the two base classes.
+  virtual void *getAdjustedAnalysisPointer(const void *ID) {
+    if (ID == &TargetTransformInfo::ID)
+      return (TargetTransformInfo*)this;
+    return this;
+  }
+
+  unsigned getNumberOfRegisters(bool Vector) const {
+    return 16;
+  }
+
+  unsigned getRegisterBitWidth(bool Vector) const {
+    return 128;
+  }
+};
+
+INITIALIZE_AG_PASS(DummyTTI, TargetTransformInfo, "dummytti",
+                   "Dummy Target Transform Info", true, true, false)
+char DummyTTI::ID = 0;
+
+const char *SourceStr = 
+  "target datalayout = \"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128\"\n"
+  "target triple = \"armv7-linux-gnueabihf\"\n"
+  "\n"
+  "declare double @test_vectorizable(double %p1, double %p2)\n"
+  "declare <2 x double> @test_vectorizable_vf2(<2 x double> %p1, <2 x double> %p2)\n"
+  "declare <5 x double> @test_vectorizable_vf5(<5 x double> %p1, <5 x double> %p2)\n"
+  "\n"
+  "define void @test(double* %d, double %t) {\n"
+  "entry:\n"
+  "  br label %for.body\n"
+  "\n"
+  "for.body:\n"
+  "  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]\n"
+  "  %arrayidx = getelementptr inbounds double* %d, i64 %indvars.iv\n"
+  "  %0 = load double* %arrayidx, align 8\n"
+  "  %1 = call double @test_vectorizable(double %0, double %t)\n"
+  "  store double %1, double* %arrayidx, align 8\n"
+  "  %indvars.iv.next = add i64 %indvars.iv, 1\n"
+  "  %lftr.wideiv = trunc i64 %indvars.iv.next to i32\n"
+  "  %exitcond = icmp ne i32 %lftr.wideiv, 128\n"
+  "  br i1 %exitcond, label %for.body, label %for.end\n"
+  "\n"
+  "for.end:\n"
+  "  ret void\n"
+  "}\n"
+  ;
+
+// Test that a scalar call can be vectorized (by a factor 2).
+TEST(LoopVectorize, VectorizeCall) {
+  LLVMContext &C(getGlobalContext());
+  SMDiagnostic Err;
+
+  Module M("test tli", C); 
+  ASSERT_TRUE(ParseAssemblyString(SourceStr, &M, Err, C) != NULL);
+
+  // Just check the module was parsed correctly.
+  ASSERT_TRUE(M.getFunction("test"));
+  ASSERT_TRUE(M.getFunction("test_vectorizable"));
+  Function *VF = M.getFunction("test_vectorizable_vf2");
+  ASSERT_TRUE(VF);
+  // Expect that the declaration is not called.
+  EXPECT_TRUE(VF->getNumUses() == 0);
+
+  PassManager PM;
+
+  TargetLibraryInfo *TLI = new TargetLibraryInfo();
+  TargetLibraryInfo::VecDesc VD = {"test_vectorizable", "test_vectorizable_vf2", 2};
+  ArrayRef<TargetLibraryInfo::VecDesc> AVD(VD);
+  TLI->addVectorizableFunctions(AVD);
+
+  PM.add(TLI);
+  PM.add(new DataLayout(M.getDataLayout()));
+  PM.add(new DummyTTI());
+  PM.add(createLoopVectorizePass());
+  PM.run(M);
+
+  // Expect that the declaration is now called!
+  EXPECT_TRUE(VF->getNumUses() == 1);
+}
+
+// Test that a scalar call which does NOT have a vectorized version
+// for the vectorization factor chosen is handled.
+TEST(LoopVectorize, ScalarChain) {
+  LLVMContext &C(getGlobalContext());
+  SMDiagnostic Err;
+
+  Module M("test tli", C); 
+  ASSERT_TRUE(ParseAssemblyString(SourceStr, &M, Err, C) != NULL);
+
+  // Just check the module was parsed correctly.
+  ASSERT_TRUE(M.getFunction("test"));
+  Function *SF = M.getFunction("test_vectorizable");
+  ASSERT_TRUE(SF);
+  Function *VF = M.getFunction("test_vectorizable_vf5");
+  ASSERT_TRUE(VF);
+  // Expect that the vector declaration is not called.
+  EXPECT_TRUE(VF->getNumUses() == 0);
+
+  PassManager PM;
+
+  // In this test, state that test_vectorizable is vectorizable only
+  // by a factor of 5. The loop vectorizer will try to vectorize, but
+  // will pick a factor of 2, so the function call needs to be
+  // emitted as a chain of scalar calls.
+  TargetLibraryInfo *TLI = new TargetLibraryInfo();
+  TargetLibraryInfo::VecDesc VD = {"test_vectorizable", "test_vectorizable_vf5", 5};
+  ArrayRef<TargetLibraryInfo::VecDesc> AVD(VD);
+  TLI->addVectorizableFunctions(AVD);
+
+  PM.add(TLI);
+  PM.add(new DataLayout(M.getDataLayout()));
+  PM.add(new DummyTTI());
+  PM.add(createLoopVectorizePass());
+  PM.run(M);
+
+  // Expect that the vectorized declaration is not called, and that
+  // the scalar declaration is called more than once.
+  EXPECT_TRUE(VF->getNumUses() == 0);
+  EXPECT_TRUE(SF->getNumUses() > 1);
+}
Index: unittests/Transforms/LoopVectorize/Makefile
===================================================================
--- unittests/Transforms/LoopVectorize/Makefile	(revision 0)
+++ unittests/Transforms/LoopVectorize/Makefile	(working copy)
@@ -0,0 +1,15 @@
+##===- unittests/Transforms/LoopVectorize/Makefile ---------*- Makefile -*-===##
+#
+#                     The LLVM Compiler Infrastructure
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+#
+##===----------------------------------------------------------------------===##
+
+LEVEL = ../../..
+TESTNAME = LoopVectorize
+LINK_COMPONENTS := Analysis AsmParser Core Support Vectorize
+
+include $(LEVEL)/Makefile.config
+include $(LLVM_SRC_ROOT)/unittests/Makefile.unittest
Index: unittests/Transforms/LoopVectorize/CMakeLists.txt
===================================================================
--- unittests/Transforms/LoopVectorize/CMakeLists.txt	(revision 0)
+++ unittests/Transforms/LoopVectorize/CMakeLists.txt	(working copy)
@@ -0,0 +1,11 @@
+set(LLVM_LINK_COMPONENTS
+  Analysis
+  AsmParser
+  Core
+  Support
+  Vectorize
+  )
+
+add_llvm_unittest(LoopVectorizeTests
+  LoopVectorize.cpp
+  )
Index: lib/Transforms/Vectorize/LoopVectorize.cpp
===================================================================
--- lib/Transforms/Vectorize/LoopVectorize.cpp	(revision 197586)
+++ lib/Transforms/Vectorize/LoopVectorize.cpp	(working copy)
@@ -2750,28 +2750,94 @@
 
       Module *M = BB->getParent()->getParent();
       CallInst *CI = cast<CallInst>(it);
-      Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI);
-      assert(ID && "Not an intrinsic call!");
-      switch (ID) {
-      case Intrinsic::lifetime_end:
-      case Intrinsic::lifetime_start:
-        scalarizeInstruction(it);
-        break;
-      default:
+      StringRef FnName = CI->getCalledFunction()->getName();
+      if (Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI)) {
+        switch (ID) {
+        case Intrinsic::lifetime_end:
+        case Intrinsic::lifetime_start:
+          scalarizeInstruction(it);
+          break;
+        default:
+          for (unsigned Part = 0; Part < UF; ++Part) {
+            SmallVector<Value *, 4> Args;
+            for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i) {
+              VectorParts &Arg = getVectorValue(CI->getArgOperand(i));
+              Args.push_back(Arg[Part]);
+            }
+            Type *Tys[] = {CI->getType()};
+            if (VF > 1)
+              Tys[0] = VectorType::get(CI->getType()->getScalarType(), VF);
+
+            Function *F = Intrinsic::getDeclaration(M, ID, Tys);
+            Entry[Part] = Builder.CreateCall(F, Args);
+          }
+          break;
+        }
+      } else if (TLI &&
+                 TLI->isFunctionVectorizable(FnName, VF)) {
+        // This is a function with a vector form.
+        StringRef VFnName = TLI->getVectorizedFunction(FnName, VF);
+        assert(!VFnName.empty());
+
+        Function *VectorF = M->getFunction(VFnName);
+        assert(VectorF && "Vectorized function did not exist!");
+
         for (unsigned Part = 0; Part < UF; ++Part) {
           SmallVector<Value *, 4> Args;
           for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i) {
             VectorParts &Arg = getVectorValue(CI->getArgOperand(i));
             Args.push_back(Arg[Part]);
           }
-          Type *Tys[] = {CI->getType()};
-          if (VF > 1)
-            Tys[0] = VectorType::get(CI->getType()->getScalarType(), VF);
 
-          Function *F = Intrinsic::getDeclaration(M, ID, Tys);
-          Entry[Part] = Builder.CreateCall(F, Args);
+          Entry[Part] = Builder.CreateCall(VectorF, Args);
         }
-        break;
+
+      } else {
+        // We have a function call that has no vector form - we must scalarize
+        // it.
+        // FIXME: We could check if it has a vector form for smaller values of
+        // VF, then chain them together instead of bailing and being fully
+        // scalar.
+        bool IsVoidTy = CI->getType()->isVoidTy();
+
+        for (unsigned UPart = 0; UPart < UF; ++UPart) {
+          Value *VRet = NULL;
+          // If we have to return something, start with an undefined vector and
+          // fill it in element by element.
+          if (!IsVoidTy)
+            VRet = UndefValue::get(VectorType::get(CI->getType(), VF));
+
+          for (unsigned VPart = 0; VPart < VF; ++VPart) {
+            
+            SmallVector<Value *, 4> Args;
+            for (unsigned I = 0, IE = CI->getNumArgOperands(); I != IE; ++I) {
+              Value *Operand = CI->getArgOperand(I);
+
+              Instruction *Inst = dyn_cast<Instruction>(Operand);
+              if (!Inst || Legal->isUniformAfterVectorization(Inst)) {
+                // Uniform variable - just use the original scalar argument.
+                Args.push_back(Operand);
+              } else {
+                // Non-uniform.
+                assert(WidenMap.has(Operand) &&
+                       "Non-uniform values must be in WidenMap!");
+                Value *VArg = WidenMap.get(Operand)[UPart];
+                Value *Arg =
+                  Builder.CreateExtractElement(VArg,
+                                               Builder.getInt32(VPart));
+                Args.push_back(Arg);
+              }
+            }
+            
+            Value *NewCI = Builder.CreateCall(CI->getCalledFunction(), Args);
+
+            if (!IsVoidTy)
+              VRet = Builder.CreateInsertElement(VRet, NewCI,
+                                                 Builder.getInt32(VPart));
+          }
+          Entry[UPart] = VRet;
+        }
+
       }
       break;
     }
@@ -3099,11 +3165,16 @@
         return false;
       }// end of PHI handling
 
-      // We still don't handle functions. However, we can ignore dbg intrinsic
-      // calls and we do handle certain intrinsic and libm functions.
+      // We handle calls that:
+      //   * Are debug info intrinsics.
+      //   * Have a mapping to an IR intrinsic.
+      //   * Have a vector version available.
+
       CallInst *CI = dyn_cast<CallInst>(it);
-      if (CI && !getIntrinsicIDForCall(CI, TLI) && !isa<DbgInfoIntrinsic>(CI)) {
-        DEBUG(dbgs() << "LV: Found a call site.\n");
+      if (CI && !getIntrinsicIDForCall(CI, TLI) && !isa<DbgInfoIntrinsic>(CI)
+          && !(CI->getCalledFunction() && TLI &&
+               TLI->isFunctionVectorizable(CI->getCalledFunction()->getName()))) {
+        DEBUG(dbgs() << "LV: Found a non-intrinsic, non-libfunc callsite.\n");
         return false;
       }
 
@@ -3892,6 +3963,12 @@
         if (Call && getIntrinsicIDForCall(Call, TLI))
           continue;
 
+        // If the function has an explicit vectorized counterpart, we can safely
+        // assume that it can be vectorized.
+        if (Call && Call->getCalledFunction() && TLI &&
+            TLI->isFunctionVectorizable(Call->getCalledFunction()->getName()))
+          continue;
+
         LoadInst *Ld = dyn_cast<LoadInst>(it);
         if (!Ld) return false;
         if (!Ld->isSimple() && !IsAnnotatedParallel) {
@@ -5018,13 +5095,29 @@
   }
   case Instruction::Call: {
     CallInst *CI = cast<CallInst>(I);
-    Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI);
-    assert(ID && "Not an intrinsic call!");
-    Type *RetTy = ToVectorTy(CI->getType(), VF);
-    SmallVector<Type*, 4> Tys;
-    for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i)
-      Tys.push_back(ToVectorTy(CI->getArgOperand(i)->getType(), VF));
-    return TTI.getIntrinsicInstrCost(ID, RetTy, Tys);
+    Function *F = CI->getCalledFunction();
+
+    if (Intrinsic::ID ID = getIntrinsicIDForCall(CI, TLI)) {
+
+      Type *RetTy = ToVectorTy(CI->getType(), VF);
+      SmallVector<Type*, 4> Tys;
+      for (unsigned i = 0, ie = CI->getNumArgOperands(); i != ie; ++i)
+        Tys.push_back(ToVectorTy(CI->getArgOperand(i)->getType(), VF));
+      return TTI.getIntrinsicInstrCost(ID, RetTy, Tys);
+
+    } else if (TLI && F && TLI->isFunctionVectorizable(F->getName(), VF)) {
+
+      Module *M = CI->getParent()->getParent()->getParent();
+      F = M->getFunction(TLI->getVectorizedFunction(F->getName(), VF));
+      assert(F && "Vectorized function did not exist!");
+      return TTI.getCallCost(F, CI->getNumArgOperands());
+
+    } else {
+
+      // No vectorized function available - must scalarize to a chain of calls.
+      return TTI.getCallCost(F, CI->getNumArgOperands()) * VF;
+
+    }
   }
   default: {
     // We are scalarizing the instruction. Return the cost of the scalar
Index: lib/Target/TargetLibraryInfo.cpp
===================================================================
--- lib/Target/TargetLibraryInfo.cpp	(revision 197586)
+++ lib/Target/TargetLibraryInfo.cpp	(working copy)
@@ -22,6 +22,50 @@
 
 void TargetLibraryInfo::anchor() { }
 
+namespace {
+  // A comparison functor for VecDesc instances for the scalar -> vector mapping.
+  // This sorts by ScalarFnName.
+  struct ScalarToVectorComparator {
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS,
+                    const TargetLibraryInfo::VecDesc &RHS) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.ScalarFnName,
+                          RHS.ScalarFnName,
+                          std::strlen(RHS.ScalarFnName)) < 0;
+    }
+
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS, StringRef S) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.ScalarFnName,
+                          S.data(),
+                          S.size()) < 0;
+    }
+  };
+
+  // A comparison functor for VecDesc instances for the vector -> scalar mapping.
+  // This sorts by VectorFnName.
+  struct VectorToScalarComparator {
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS,
+                    const TargetLibraryInfo::VecDesc &RHS) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.VectorFnName,
+                          RHS.VectorFnName,
+                          std::strlen(RHS.VectorFnName)) < 0;
+    }
+
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS, StringRef S) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.VectorFnName,
+                          S.data(),
+                          S.size()) < 0;
+    }
+  };
+} // end anonymous namespace
+
 const char* TargetLibraryInfo::StandardNames[LibFunc::NumLibFuncs] =
   {
     "_IO_getc",
@@ -697,20 +741,28 @@
 };
 }
 
-bool TargetLibraryInfo::getLibFunc(StringRef funcName,
-                                   LibFunc::Func &F) const {
-  const char **Start = &StandardNames[0];
-  const char **End = &StandardNames[LibFunc::NumLibFuncs];
-
+static StringRef sanitizeFunctionName(StringRef funcName) {
   // Filter out empty names and names containing null bytes, those can't be in
   // our table.
   if (funcName.empty() || funcName.find('\0') != StringRef::npos)
-    return false;
+    return StringRef();
 
   // Check for \01 prefix that is used to mangle __asm declarations and
   // strip it if present.
   if (funcName.front() == '\01')
     funcName = funcName.substr(1);
+  return funcName;
+}
+
+bool TargetLibraryInfo::getLibFunc(StringRef funcName,
+                                   LibFunc::Func &F) const {
+  const char **Start = &StandardNames[0];
+  const char **End = &StandardNames[LibFunc::NumLibFuncs];
+
+  funcName = sanitizeFunctionName(funcName);
+  if (funcName.empty())
+    return false;
+
   const char **I = std::lower_bound(Start, End, funcName, StringComparator());
   if (I != End && *I == funcName) {
     F = (LibFunc::Func)(I - Start);
@@ -724,3 +776,58 @@
 void TargetLibraryInfo::disableAllFunctions() {
   memset(AvailableArray, 0, sizeof(AvailableArray));
 }
+
+void TargetLibraryInfo::addVectorizableFunctions(ArrayRef<VecDesc> Fns) {
+  VectorDescs.insert(VectorDescs.end(), Fns.begin(), Fns.end());
+  std::sort(VectorDescs.begin(), VectorDescs.end(), ScalarToVectorComparator());
+
+  ScalarDescs.insert(ScalarDescs.end(), Fns.begin(), Fns.end());
+  std::sort(ScalarDescs.begin(), ScalarDescs.end(), VectorToScalarComparator());
+}
+
+bool TargetLibraryInfo::isFunctionVectorizable(StringRef funcName) const {
+  funcName = sanitizeFunctionName(funcName);
+  if (funcName.empty())
+    return false;
+
+  std::vector<VecDesc>::const_iterator I = std::lower_bound(VectorDescs.begin(),
+                                                      VectorDescs.end(),
+                                                      funcName,
+                                                      ScalarToVectorComparator());
+  return I != VectorDescs.end() && StringRef(I->ScalarFnName) == funcName;
+}
+
+StringRef TargetLibraryInfo::getVectorizedFunction(StringRef F,
+                                                   unsigned VF) const {
+  F = sanitizeFunctionName(F);
+  if (F.empty())
+    return F;
+
+  std::vector<VecDesc>::const_iterator I = std::lower_bound(VectorDescs.begin(),
+                                                      VectorDescs.end(),
+                                                      F,
+                                                      ScalarToVectorComparator());
+  while (I != VectorDescs.end() && StringRef(I->ScalarFnName) == F) {
+    if (I->VectorizationFactor == VF)
+      return I->VectorFnName;
+    ++I;
+  }
+  return StringRef();
+}
+
+StringRef TargetLibraryInfo::getScalarizedFunction(StringRef F,
+                                                   unsigned &VF) const {
+  F = sanitizeFunctionName(F);
+  if (F.empty())
+    return F;
+
+  std::vector<VecDesc>::const_iterator
+    I = std::lower_bound(ScalarDescs.begin(),
+                         ScalarDescs.end(),
+                         F,
+                         VectorToScalarComparator());
+  if (I == ScalarDescs.end() || StringRef(I->VectorFnName) != F)
+    return StringRef();
+  VF = I->VectorizationFactor;
+  return I->ScalarFnName;
+}
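
For readers skimming the cost-model hunk above, the three-way decision can be summarised in a small standalone sketch. This is not part of the patch: CallCostQuery and estimateCallCost are hypothetical names, and the plain integer fields stand in for whatever TTI.getIntrinsicInstrCost and TTI.getCallCost would return.

```cpp
// Hypothetical stand-in for the information the vectorizer has about a call.
struct CallCostQuery {
  bool IsIntrinsic;        // getIntrinsicIDForCall returned a nonzero ID
  bool HasVectorVariant;   // TLI has a mapping for this function at this VF
  unsigned IntrinsicCost;  // stand-in for TTI.getIntrinsicInstrCost(...)
  unsigned VectorCallCost; // stand-in for TTI.getCallCost on the vector function
  unsigned ScalarCallCost; // stand-in for TTI.getCallCost on the scalar function
};

// Mirrors the if / else-if / else chain in the Instruction::Call case.
unsigned estimateCallCost(const CallCostQuery &Q, unsigned VF) {
  if (Q.IsIntrinsic)
    return Q.IntrinsicCost;      // widened intrinsic
  if (Q.HasVectorVariant)
    return Q.VectorCallCost;     // one call to the vector library function
  return Q.ScalarCallCost * VF;  // scalarized: one scalar call per lane
}
```

The point of the last branch is that a call with no vector counterpart is charged VF times the scalar call cost, so the cost model can still reject vectorization when the call dominates the loop body.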
-------------- next part --------------
Index: include/llvm/Target/TargetLibraryInfo.h
===================================================================
--- include/llvm/Target/TargetLibraryInfo.h	(revision 197586)
+++ include/llvm/Target/TargetLibraryInfo.h	(working copy)
@@ -11,6 +11,7 @@
 #define LLVM_TARGET_TARGETLIBRARYINFO_H
 
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/Pass.h"
 
 namespace llvm {
@@ -667,10 +668,26 @@
 /// library functions are available for the current target, and allows a
 /// frontend to disable optimizations through -fno-builtin etc.
 class TargetLibraryInfo : public ImmutablePass {
+public:
+  /// VecDesc - Describes a possible vectorization of a function.
+  /// Function 'VectorFnName' is equivalent to 'ScalarFnName' vectorized
+  /// by a factor 'VectorizationFactor'.
+  struct VecDesc {
+    const char *ScalarFnName;
+    const char *VectorFnName;
+    unsigned VectorizationFactor;
+  };
+
+private:
   virtual void anchor();
   unsigned char AvailableArray[(LibFunc::NumLibFuncs+3)/4];
   llvm::DenseMap<unsigned, std::string> CustomNames;
   static const char* StandardNames[LibFunc::NumLibFuncs];
+  /// Vectorization descriptors - sorted by ScalarFnName.
+  std::vector<VecDesc> VectorDescs;
+  /// Scalarization descriptors - same content as VectorDescs but sorted based
+  /// on VectorFnName rather than ScalarFnName.
+  std::vector<VecDesc> ScalarDescs;
 
   enum AvailabilityState {
     StandardName = 3, // (memset to all ones)
@@ -766,6 +783,38 @@
   /// disableAllFunctions - This disables all builtins, which is used for
   /// options like -fno-builtin.
   void disableAllFunctions();
+
+  /// addVectorizableFunctions - Add a set of scalar -> vector mappings,
+  /// queryable via getVectorizedFunction and getScalarizedFunction.
+  void addVectorizableFunctions(ArrayRef<VecDesc> Fns);
+
+  /// isFunctionVectorizable - Return true if the function F has a
+  /// vector equivalent with vectorization factor VF.
+  bool isFunctionVectorizable(StringRef F, unsigned VF) const {
+    return !getVectorizedFunction(F, VF).empty();
+  }
+
+  /// isFunctionVectorizable - Return true if the function F has a
+  /// vector equivalent with any vectorization factor.
+  bool isFunctionVectorizable(StringRef F) const;
+
+  /// getVectorizedFunction - Return the name of the equivalent of 
+  /// F, vectorized with factor VF. If no such mapping exists,
+  /// return the empty string.
+  StringRef getVectorizedFunction(StringRef F, unsigned VF) const;
+
+  /// isFunctionScalarizable - Return true if the function F has a
+  /// scalar equivalent, and set VF to be the vectorization factor.
+  bool isFunctionScalarizable(StringRef F, unsigned &VF) const {
+    return !getScalarizedFunction(F, VF).empty();
+  }
+
+  /// getScalarizedFunction - Return the name of the equivalent of 
+  /// F, scalarized. If no such mapping exists, return the empty string.
+  ///
+  /// Set VF to the vectorization factor.
+  StringRef getScalarizedFunction(StringRef F, unsigned &VF) const;
+
 };
 
 } // end namespace llvm
Index: lib/Target/TargetLibraryInfo.cpp
===================================================================
--- lib/Target/TargetLibraryInfo.cpp	(revision 197586)
+++ lib/Target/TargetLibraryInfo.cpp	(working copy)
@@ -22,6 +22,50 @@
 
 void TargetLibraryInfo::anchor() { }
 
+namespace {
+  // A comparison functor for VecDesc instances for the scalar -> vector mapping.
+  // This sorts by ScalarFnName.
+  struct ScalarToVectorComparator {
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS,
+                    const TargetLibraryInfo::VecDesc &RHS) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.ScalarFnName,
+                          RHS.ScalarFnName,
+                          std::strlen(RHS.ScalarFnName)) < 0;
+    }
+
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS, StringRef S) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.ScalarFnName,
+                          S.data(),
+                          S.size()) < 0;
+    }
+  };
+
+  // A comparison functor for VecDesc instances for the vector -> scalar mapping.
+  // This sorts by VectorFnName.
+  struct VectorToScalarComparator {
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS,
+                    const TargetLibraryInfo::VecDesc &RHS) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.VectorFnName,
+                          RHS.VectorFnName,
+                          std::strlen(RHS.VectorFnName)) < 0;
+    }
+
+    bool operator()(const TargetLibraryInfo::VecDesc &LHS, StringRef S) const {
+      // Compare prefixes with strncmp. If prefixes match we know that LHS is
+      // greater or equal to RHS as RHS can't contain any '\0'.
+      return std::strncmp(LHS.VectorFnName,
+                          S.data(),
+                          S.size()) < 0;
+    }
+  };
+} // end anonymous namespace
+
 const char* TargetLibraryInfo::StandardNames[LibFunc::NumLibFuncs] =
   {
     "_IO_getc",
@@ -697,20 +741,28 @@
 };
 }
 
-bool TargetLibraryInfo::getLibFunc(StringRef funcName,
-                                   LibFunc::Func &F) const {
-  const char **Start = &StandardNames[0];
-  const char **End = &StandardNames[LibFunc::NumLibFuncs];
-
+static StringRef sanitizeFunctionName(StringRef funcName) {
   // Filter out empty names and names containing null bytes, those can't be in
   // our table.
   if (funcName.empty() || funcName.find('\0') != StringRef::npos)
-    return false;
+    return StringRef();
 
   // Check for \01 prefix that is used to mangle __asm declarations and
   // strip it if present.
   if (funcName.front() == '\01')
     funcName = funcName.substr(1);
+  return funcName;
+}
+
+bool TargetLibraryInfo::getLibFunc(StringRef funcName,
+                                   LibFunc::Func &F) const {
+  const char **Start = &StandardNames[0];
+  const char **End = &StandardNames[LibFunc::NumLibFuncs];
+
+  funcName = sanitizeFunctionName(funcName);
+  if (funcName.empty())
+    return false;
+
   const char **I = std::lower_bound(Start, End, funcName, StringComparator());
   if (I != End && *I == funcName) {
     F = (LibFunc::Func)(I - Start);
@@ -724,3 +776,58 @@
 void TargetLibraryInfo::disableAllFunctions() {
   memset(AvailableArray, 0, sizeof(AvailableArray));
 }
+
+void TargetLibraryInfo::addVectorizableFunctions(ArrayRef<VecDesc> Fns) {
+  VectorDescs.insert(VectorDescs.end(), Fns.begin(), Fns.end());
+  std::sort(VectorDescs.begin(), VectorDescs.end(), ScalarToVectorComparator());
+
+  ScalarDescs.insert(ScalarDescs.end(), Fns.begin(), Fns.end());
+  std::sort(ScalarDescs.begin(), ScalarDescs.end(), VectorToScalarComparator());
+}
+
+bool TargetLibraryInfo::isFunctionVectorizable(StringRef funcName) const {
+  funcName = sanitizeFunctionName(funcName);
+  if (funcName.empty())
+    return false;
+
+  std::vector<VecDesc>::const_iterator I = std::lower_bound(VectorDescs.begin(),
+                                                      VectorDescs.end(),
+                                                      funcName,
+                                                      ScalarToVectorComparator());
+  return I != VectorDescs.end() && StringRef(I->ScalarFnName) == funcName;
+}
+
+StringRef TargetLibraryInfo::getVectorizedFunction(StringRef F,
+                                                   unsigned VF) const {
+  F = sanitizeFunctionName(F);
+  if (F.empty())
+    return F;
+
+  std::vector<VecDesc>::const_iterator I = std::lower_bound(VectorDescs.begin(),
+                                                      VectorDescs.end(),
+                                                      F,
+                                                      ScalarToVectorComparator());
+  while (I != VectorDescs.end() && StringRef(I->ScalarFnName) == F) {
+    if (I->VectorizationFactor == VF)
+      return I->VectorFnName;
+    ++I;
+  }
+  return StringRef();
+}
+
+StringRef TargetLibraryInfo::getScalarizedFunction(StringRef F,
+                                                   unsigned &VF) const {
+  F = sanitizeFunctionName(F);
+  if (F.empty())
+    return F;
+
+  std::vector<VecDesc>::const_iterator
+    I = std::lower_bound(ScalarDescs.begin(),
+                         ScalarDescs.end(),
+                         F,
+                         VectorToScalarComparator());
+  if (I == ScalarDescs.end() || StringRef(I->VectorFnName) != F)
+    return StringRef();
+  VF = I->VectorizationFactor;
+  return I->ScalarFnName;
+}
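
The lookups in this file all follow the same pattern: binary-search a table sorted by name, then confirm the hit. A minimal standalone sketch of that pattern is below. It is not part of the patch; it uses std::string instead of StringRef so that it compiles without LLVM, and ByScalarName, makeExampleTable, lookupVectorized, hasAnyVectorized and the vsinf4/vsinf8/vcosf4 names are all made up for illustration.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for TargetLibraryInfo::VecDesc.
struct VecDesc {
  const char *ScalarFnName;
  const char *VectorFnName;
  unsigned VectorizationFactor;
};

// Orders descriptors by scalar name; the second overload compares a
// descriptor against a lookup key, as std::lower_bound requires.
struct ByScalarName {
  bool operator()(const VecDesc &L, const VecDesc &R) const {
    return std::strcmp(L.ScalarFnName, R.ScalarFnName) < 0;
  }
  bool operator()(const VecDesc &L, const std::string &Key) const {
    return Key.compare(L.ScalarFnName) > 0; // true when L sorts before Key
  }
};

// Builds a small table with invented vector function names, sorted by
// scalar name.
std::vector<VecDesc> makeExampleTable() {
  std::vector<VecDesc> T = {
      {"sinf", "vsinf4", 4}, {"cosf", "vcosf4", 4}, {"sinf", "vsinf8", 8}};
  std::sort(T.begin(), T.end(), ByScalarName());
  return T;
}

// Returns the vector function name for (Name, VF), or "" if none exists.
std::string lookupVectorized(const std::vector<VecDesc> &Sorted,
                             const std::string &Name, unsigned VF) {
  auto I = std::lower_bound(Sorted.begin(), Sorted.end(), Name, ByScalarName());
  // lower_bound yields the first entry that is not less than Name, so every
  // candidate still has to be compared for equality.
  for (; I != Sorted.end() && Name == I->ScalarFnName; ++I)
    if (I->VectorizationFactor == VF)
      return I->VectorFnName;
  return "";
}

// True if Name has a vector counterpart for any vectorization factor.
bool hasAnyVectorized(const std::vector<VecDesc> &Sorted,
                      const std::string &Name) {
  auto I = std::lower_bound(Sorted.begin(), Sorted.end(), Name, ByScalarName());
  return I != Sorted.end() && Name == I->ScalarFnName;
}
```

Testing only I != end() is not sufficient here: a name that sorts before every table entry also produces a non-end iterator, so the equality check is what separates a real match from a miss.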

