[PATCH] D97340: [HIP] Support Spack packages

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 2 15:36:52 PST 2021


tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

LGTM modulo couple of nits.



================
Comment at: clang/include/clang/Driver/Options.td:3535-3536
   HelpText<"Print the registered targets">;
+def print_rocm_search_dirs : Flag<["-", "--"], "print-rocm-search-dirs">,
+  HelpText<"Print the paths used for finding ROCm installation">;
 def private__bundle : Flag<["-"], "private_bundle">;
----------------
yaxunl wrote:
> tra wrote:
> > Perhaps the existing `print-search-dirs` option would do the job?
> The concern is similar to https://reviews.llvm.org/D79210:
> 
> There may be tools parsing print-search-dirs output, changing it may break those tools.
> 
OK. New option it is then,


================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:163
 
 // For candidate specified by --rocm-path we do not do strict check.
+const SmallVectorImpl<RocmInstallationDetector::Candidate> &
----------------
I'm not quite sure which part of the code is related to the `do not do strict check`.  Perhaps move the comment closer to the code it describes and add the details what we'd do otherwise, so one can understand what's missing.

Also, adding a comment describing what the function does (returns candidate list, populating it once, if necessary) would be great.


================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:361-372
       auto Path = CandidatePath;
+      // Device library built by SPACK is installed to
+      // <rocm_root>/rocm-device-libs-<rocm_release_string>-<hash> directory.
+      if (Candidate.isSPACK()) {
+        auto SPACKPath = findSPACKPackage(D, Path, "rocm-device-libs",
+                                          Candidate.SPACKReleaseStr);
+        if (!SPACKPath.empty())
----------------
This could be simplified to
```
 auto MaybeSpackPath = Candidate.isSPACK() ? findSPACKPackage() : "";
 auto Path = MaybeSpackPath.empty() ? CandidatePath : MaybeSpackPath;

```

We could further simplify it if we move  the `isSpack()` check into findSPACKPackage and make it return an empty string if it's not. 


================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:180
   if (!RocmPathArg.empty()) {
-    Candidates.emplace_back(RocmPathArg.str());
-    return Candidates;
+    ROCmSearchDirs.emplace_back(RocmPathArg.str());
+    DoPrintROCmSearchDirs();
----------------
yaxunl wrote:
> tra wrote:
> > We call getInstallationPathCandidates() more more than one place, so we may end up with the same path added multiple times.
> > 
> > I think population of the candidate list should be separated from reading it back, and should be done only once.
> > E.g.
> > ```
> >   if (!isCandidateListReady) 
> >      initCandidateList(); // populates ROCmSearchDirs, sets isCandidateListReady
> >   return ROCmSearchDirs;
> > ```
> > 
> > 
> We check whether ROCmSearchDirs is populated on line 166 and only populate it if not.
Agreed, the code works. That said, the fact that I didn't notice that is a point in case -- it's not obvious.
It's much easier to keep track of what's going on when one function does one things without changing its behavior from one call to another.
I'd at least add a comment around line 166 saying that just return the list if we've already populated it.



================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:207
+    // installation candidate for SPACK.
+    if (ParentName.startswith("llvm-amdgpu-")) {
+      auto SPACKReleaseStr =
----------------
yaxunl wrote:
> tra wrote:
> > What would be a consequence of mistakingly detecting a non-SPACK installation as SPACK?
> > 
> > I'm still not quite comfortable with using a path name, which can be arbitrarily named by a user, as a way to detect particular installation method.
> > Do SPACK packages have something *inside* the package directory that we could use to tell that it is indeed an SPACK package? Some sort of metadata or version file, perhaps?
> > 
> > It's probably OK to rely on the path naming scheme, but there are still corner cases it can't handle.
> > I could have a setup when the path-name based detection would still fail despite considering both the reaolved and unresolved symlink names.
> > E.g. 
> > ```
> > /pkg/llvm-amdgpu-hash -> /pkg/some-other-real-dir-name
> > /usr/local/bin/clang -> /pkg/llvm-amdgpu-hash/bin/clang
> > ```
> > If I compile with `/usr/local/bin/clang` neither the realpath nor the unresolved symlink will have 'llvm-amdgpu'. 
> > 
> > Granted, it's a somewhat contrived setup. Using the name as the hint would work for the standard setups, so we can live with that, provided that we can always tell compiler where to find the files it needs, if the user has an unusual install.
> > 
> llvm-amdgpu Spack installation does not add extra files. It uses normal llvm cmake files to build and differs from a non-Spack llvm build only by name. However, its name follows a very unusual pattern llvm-amdgpu-<release>-<hash>, where <hash> is a string of 32 alphanumerical characters. I think we could detect it by checking the length of <hash>.
I think we're getting beyond the point of diminishing returns here. If we don't have a way to identify an SPACK package directly, detecting it by the name will never be 100% reliable. The current code will be correct most of the time. For those few cases where it may get it wrong, youser should be able to specify the correct path directly. No need to complicate it further -- it's not going to buy us much.


================
Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:354-372
+  CandidatesInfo CanInfo;
+  if (!HIPPathArg.empty())
+    CanInfo.Candidates.emplace_back(HIPPathArg.str(), /*StrictChecking=*/true);
+  else
+    CanInfo = getInstallationPathCandidates();
   auto &FS = D.getVFS();
 
----------------
yaxunl wrote:
> tra wrote:
> > yaxunl wrote:
> > > tra wrote:
> > > > So, normally `--hip-path` would point to the HIP package installation. 
> > > > However, is clang installation path contains `llvm-amdgpu` (and thus CanInfo.SPACKReleaseStr is set to whatever version may follow),
> > > > then `--hip-path` (and other candidate paths) directories will also be treated as an SPACK package directory and, if the `hip` package is found there, then *that* package directory will be used as the actual path to be used.
> > > > 
> > > > I see several potential issues with that.
> > > > 
> > > > - if clang path contains llvm-amdgpu, but it is *not* installed as an SPACK package We have no control over user's choice of the name for the installation path.
> > > > - what if clang is installed as an SPCAK, but is invoked via a symlink with the name that does not contain `llvm-amdgpu`
> > > > - the fact that `--hip-path`/`--rocm-path` behavior may change based on clang's executable path is rather peculiar. User-supplied parameters should not change their meaning depending on what you call the executable. 
> > > > 
> > > > I'd be OK to derive spack-based paths if user didn't provide the paths explicitly, but I don't think we should add any magic to explicit overrides.
> > > > 
> > > > 
> > > Will make SPACKReleaseStr a member of Candidate and allow mixed Spack candidates and non-Spack candidates. If clang path contains llvm-amdgpu but is not a Spack package, clang will still look for non-Spack ROCm candidates.
> > > 
> > > Also added parent of real path of clang as ROCm candidate. If clang is invoked through a sym link which is not under the real ROCm directory, clang will still be able to detect it. Also added --print-rocm-search-dirs for testing this.
> > > 
> > > When --hip-path/--rocm-path is set, the specified ROCm/HIP path is used. Spack detection is disabled, since the candidate specified by --rocm-path or --hip-path has empty SPACKReleaseStr.
> > > Also added --print-rocm-search-dirs for testing this.
> > 
> > Maybe we could just enable that with `-v`.  We already are printing various locations for GCC installations, etc.
> The situation is somehow similar to https://reviews.llvm.org/D79210
> 
> Another reason not to print this with -v is that normally users may not need this information unless clang cannot find ROCm and users want to know why. 
OK.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97340/new/

https://reviews.llvm.org/D97340



More information about the cfe-commits mailing list