[LLVMdev] PATH and LD_LIBRARY_PATH

Thu Jul 19 21:30:22 PDT 2007

On Thu, 2007-07-19 at 21:03 -0700, Chris Lattner wrote:
> On Thu, 19 Jul 2007, Reid Spencer wrote:
> > With the pending reorganization of the software, I have some questions
> > about how developers set their PATH and LD_LIBRARY_PATH variables when
> > working with LLVM. This is a bit long-winded, but bear with me.
> 
> ok :)
> 
> > We're planning to break the "llvm" module up into three modules:
> >
> >      * support - lib/Support, lib/System, autoconf, make support,
> >        utilities
> 
> which utilities?  The C++ programs in llvm/utils should not be moved.

No, I was thinking more like "mkpatch" and "llvmgrep".

> 
> >      * core - VMCore, Asm, Bitcode and the essential IR tools (llvm-as,
> >        etc.)
> 
> I'm still not convinced that this is useful to split out from the rest of 
> the LLVM tree, we should discuss this again after support is split out.

Surely. About the only reason is for modules that only work with the IR
and don't want to have to compile all of the transforms and targets just
to use the IR. I know there's no technical difference; it's merely a
developer convenience. In any event, we can discuss later. My point was
there will be more modules.

> 
> >      * opt (not sure that's the final name) - everything else:
> >        Analysis, Transforms, CodeGen, Target, etc
> >
> > Additionally, there are new modules such as "hlvm", "cfe",
> > "llvm-gcc-4.2" and undoubtedly more to come in the future.
> 
> Yep.
> 
> > We haven't decided the final architecture so don't quibble about what
> > goes in what module (yet). The point is, there will be several modules
> > instead of everything being in "llvm". With this situation we can no
> > longer just put llvm/Debug/bin in PATH and llvm/Debug/lib in
> > LD_LIBRARY_PATH and just have things work. Build products would be
> > placed in the Debug/bin and Debug/lib directories for each module.
> 
> Ok.
> 
> > However, even with only a single checkout (environment) of llvm
> > software, there are details to be taken care of. We would like to
> > support this better, but the question is how.
> >
> > Here are some of the issues:
> >      * On some platforms you set SHLIB_PATH or SHOBJ_PATH, etc.
> 
> This is up to the user to know what to do.
> 
> >      * With more modules the PATH and LD_LIBRARY_PATH become long (one
> >        entry per module). Having every module's Debug/bin in PATH and
> >        Debug/lib in LD_LIBRARY_PATH gets hard to maintain when there
> >        are multiple environments.
> 
> Let's take a specific example: someone working on the clang front-end.  For 
> these people, they will check out the support, llvm, and clang modules.  If 
> they *only* are playing with the front-end, and don't want to install, 
> they just need to add the clang bin directory to their path.  If they also 
> want convenient access to the llvm tools, they can add that dir to their 
> path.  This seems reasonable to me.
> 
> Regardless of the user's PATH setting, the build process for the various 
> modules should invoke the tools from other modules *without* PATH 
> needing to be set.

Yeah, it's more LD_LIBRARY_PATH that I'm concerned about. It can and does
screw up linking if it's set wrong.

> 
> >        Furthermore, the paths need to change
> >        when you switch to a release or release+asserts or release
> >        +expensive_checks build.
> 
> We have this problem today, it isn't a significant issue AFAICT.
> 
> >      * There are inter-dependencies between modules which may affect
> >        the relative ordering of the PATH and LD_LIBRARY_PATH component
> >        paths.
> 
> This is only an issue if you have a name collision, right?  If so, the 
> answer is "don't do that" :)

Sure .. this entire email is about striking a balance between making the
build system "fool proof" and "flexible".

> >      * Building things can be affected because if you put the wrong
> >        directory in your LD_LIBRARY_PATH you can end up linking against
> >        libraries built by the compiler instead of your platform's
> >        native compiler, which will ultimately fail (very late too).
> 
> This is only for llvm-gcc?

Yes.

> 
> >      * Having two llvm-gcc versions (4.0 and 4.2) in separate modules
> >        could lead to conflicts.
> 
> The only thing that depends on llvm-gcc is the llvm-test suite.  Its 
> configure script should probably try to autodetect which C front-end you 
> have (4.0, 4.2, clang) and "build in" the paths it needs into its 
> Makefile.config.

What if you "have" all three? Which does it pick? It probably needs to
be a configure or make option. In any event, if you switch compilers and
don't "make clean" you can end up with link errors (e.g. a clang-compiled
object linked with llvm-gcc-4.2).

> 
> >      * Upstream projects like hlvm and cfe will have several
> >        dependencies so getting the paths straight is important for
> >        successful building.  Additionally, users will have their own
> >        project directories, at the top of the food chain, which are
> >        dependent on everything.
> 
> I don't see this.  Building should just be a matter of typing the moral 
> equivalent of "make".  If clang used tblgen for its build, it would know 
> to invoke it from the llvm module, and would use an absolute path 
> generated by the makefile.

That's one way to do it :)

> 
> >      * We want to treat each module, as much as possible, as a separate
> >        entity (very loose coupling), but they are API locked anyway and
> >        we can't do much about that. The dependencies are real.
> 
> Yep.  The dependencies are hard dependencies, though it would also be nice 
> to support "optional dependencies" down the line (if you check out "this" 
> in your tree, it enables "that" feature in some other dependent module).

Okay, you can implement that feature :)

> 
> >      * There are utilities that we want in the paths (like llvm/utils)
> >        as well as utilities like TableGen that might eventually be
> >        needed across projects (e.g. "core" would need TableGen for the
> >        intrinsic functions but the module containing the targets also
> >        needs it).
> 
> The makefiles that build the projects should not depend on PATH.  The only 
> need for PATH to be set is if the user wants to invoke something (like 
> llvm-as, opt, etc).

Yeah, I mixed up two things here. So the user's path might want to have
all the llvm-top/*/utils directories in their PATH? And if you want to
keep TestRunner.sh in test, then you need that in your path, and ...
this just gets awkward quickly.

As for build utilities, I'm trying to strike a balance here between hard
coding paths (which makes it brittle) and having it "just work". I think
it's probably fine to put things in "support" that many other modules
will use. For example, the makefile system itself. That one module name
can be hard coded.  But, as we go along, there may be other things used
(e.g. the hlvm utilities for all of hlvm's front ends) during build. To
avoid PATH for such things we need to know the module it's in, *OR* just
always reference the installed thing and then you have perfect control
over which utility is being used, even while developing experimental
versions of that utility (into module/Debug/bin).
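
To make the trade-off concrete, here is a rough sketch (Python, purely
illustrative) of looking up a build utility without consulting PATH. The
function name, the "prefix" argument and the module/Debug/bin layout are
assumptions for the example, not anything the build system provides today.
It prefers the installed copy when an install prefix is given and otherwise
falls back to the copy built inside a named module.

```python
import os


def find_tool(name, top, module, build="Debug", prefix=None):
    """Return an absolute path to a build tool without consulting PATH.

    Hypothetical layout: <top>/<module>/<build>/bin/<name> for a freshly
    built tool, <prefix>/bin/<name> for an installed one.
    """
    candidates = []
    if prefix is not None:
        # Prefer the installed copy, so an experimental build of the
        # tool in the module tree cannot be picked up by accident.
        candidates.append(os.path.join(prefix, "bin", name))
    candidates.append(os.path.join(top, module, build, "bin", name))

    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return os.path.abspath(path)
    raise FileNotFoundError(
        "%s not found; looked in: %s" % (name, ", ".join(candidates)))
```

The caller still has to name the module, which is exactly the hard-coding
concern above; passing only a prefix-installed location avoids that at the
cost of requiring an install step.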

> 
> >      * Does every module need its own "llvm-config" program?
> 
> It would be nice if this was shared, perhaps to live in the support 
> module?

I was thinking the same thing, but then it can't give you all the
project specific configuration, just the stuff that support knows about
(which is a lot, but not everything).  Recall that llvm-config needs to
be built after all the libraries have been built so it can pick up all
the dependencies and generate the -l options in the right order. So,
conversely, it actually needs to be in a module that is the LAST one
built, not the first. 
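
As an illustration of the ordering problem only (this is not how llvm-config
is implemented), the sketch below derives a -l ordering from a dependency
table with a topological sort. The table is made up for the example, and it
assumes a traditional linker that wants each library listed before the
libraries it depends on.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


def link_order(deps):
    """Order libraries so each one precedes the libraries it depends on.

    deps maps a library name to the set of libraries it depends on.
    """
    # static_order() yields dependencies before their dependents, so the
    # result is reversed to put dependents first on the link line.
    order = list(TopologicalSorter(deps).static_order())
    order.reverse()
    return order


def link_flags(deps):
    """Render the ordered libraries as -l options."""
    return " ".join("-l" + name for name in link_order(deps))


# Illustrative dependency table; the real one is only known once every
# library in every module has been built.
deps = {
    "LLVMBitReader": {"LLVMCore"},
    "LLVMCore": {"LLVMSupport", "LLVMSystem"},
    "LLVMSupport": {"LLVMSystem"},
    "LLVMSystem": set(),
}

print(link_flags(deps))
```

The table has to cover every library from every module, which is why the
tool that computes it cannot live in the first module built.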

> >      * Some of us have multiple things going on at the same time and so
> >        work with multiple LLVM environments. For example, you could be
> >        working on an involved bug fix, your normal development work,
> >        quickie fixes, a branch for some side work, etc. In each of
> >        these cases you want a separate checkout and the associated
> >        environment variable settings for that directory. I call this an
> >        "environment". It is basically just a way to keep various works
> >        in progress separated. How can multiple environments be best
> >        supported?
> 
> Wow, those people need to learn to work more incrementally ;-).  j/k

Oh, it is incremental .. there's just a lot of increments :)

> 
> It seems that they should just check out multiple trees and have scripts 
> or something to set their PATH as appropriate... just like today.
> 
> I use this sort of thing when I have a frozen version of a tree for some 
> project, and we need to backport a patch to that tree.  In this case, I 
> don't mess with my path at all, I just manually invoke utilities from that 
> tree with absolute paths.

I'm too forgetful for that. I want to run something that would put me in
"my backport patch branch environment" so I can just type "llvm-as" and
it gets the right one. Then all I have to do is remember to run that
thing.
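
Something along these lines is what I have in mind. It is a minimal sketch
in Python, assuming each module keeps its build products in
module/<build>/bin and module/<build>/lib as described above; the function
names and the idea of emitting "export" lines for a Bourne-style shell are
my own invention for the example.

```python
import os
import shlex


def environment(top, modules, build="Debug", base_env=None):
    """Compute PATH and LD_LIBRARY_PATH for one checkout ("environment").

    top      -- directory holding the checked-out modules
    modules  -- module names, in the order they should be searched
    build    -- build flavor directory, e.g. "Debug" or "Release"
    base_env -- mapping to start from (defaults to os.environ)
    """
    env = os.environ if base_env is None else base_env
    bins = [os.path.join(top, m, build, "bin") for m in modules]
    libs = [os.path.join(top, m, build, "lib") for m in modules]

    def prepend(new, old):
        # Drop empty entries and any entry we are about to re-add.
        kept = [p for p in old.split(os.pathsep) if p and p not in new]
        return os.pathsep.join(new + kept)

    return {
        "PATH": prepend(bins, env.get("PATH", "")),
        "LD_LIBRARY_PATH": prepend(libs, env.get("LD_LIBRARY_PATH", "")),
    }


def export_lines(env):
    """Render the settings as lines a Bourne-style shell can eval."""
    return "\n".join(
        "export %s=%s" % (key, shlex.quote(value))
        for key, value in env.items())
```

A small wrapper that prints export_lines(environment(...)) could then be
run through the shell's eval to switch environments. It only prepends
entries, so entries left over from a previously selected environment stay
in the variables (after the new ones) unless you start from a clean shell.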

> 
> > So, the question is .. what do you want to do about all this?
> >
> > Here are some options to be discussed:
> >
> >     1. Punt - Let each developer/user figure this out for
> >        themselves.
> 
> This is the de facto answer until we get a solution :)

Of course :)

> 
> >     2. Install - That is, set your PATH and LD_LIBRARY_PATH to one
> >        place and "make install" the build results into that directory.
> 
> We *need* to support make install, but we also should not make it 
> required.  End users just want to 'check out/download + build + install', 
> they don't want to mess with their environment or anything else for that 
> matter.

So, you're saying if you just want to use LLVM then invoke "install" and
set your path to the install location. 

But, if you're a developer then .. see #1 :)

> 
> >     3. Shell - Provide some shell functions and aliases to manage
> >        setting the environment correctly. This could even use the
> >        ModuleInfo.txt file to glean dependencies. For example, the
> >        llvm-top module could have a "setenv.sh" script that is invoked
> >        with ". ./setenv.sh" to set the environment for whatever is
> >        checked out in that llvm-top. We'd need one for each type of
> >        shell and users would have to remember to run it.
> 
> Ick.

Okay, disapproval noted. Got anything substantive to add to that?

> 
> > I need help with #4 but I'm also looking for general feedback on solving 
> > the issues raised.
> 
> To be clear, we're talking about LLVM developers here, not end users (who 
> just use make install).  I think LLVM devs can know to add a directory or 
> two if they want convenient access to some llvm tool that gets built. 
> Worst case they can use absolute paths if they want.

So, basically .. #1 :)

> 
> -Chris
>