[llvm-dev] Experiment on how to improve our temporary file handing.

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 13 01:06:00 PST 2017


On Fri, Nov 10, 2017 at 6:13 AM, Rafael Avila de Espindola <
rafael.espindola at gmail.com> wrote:

> Currently a power failure or other hard crash can cause lld leave a
> temporary
> file around. The same is true for other llvm tools.
>
> As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and
> restart the run a few times. You will get
>
> t.tmp43a735a  t.tmp4deeabb  t.tmp9bacdd3  t.tmpe4115c4  t.tmpeb01fff
>
> The same would happen if there was a fatal error between the
> FileOutputBuffer creation and commit. I don't think that is a code path
> where that is possible right now, but it would be an easy thing to miss
> in a code review.
>
> I was hopping the OS could help us manage the temporary file so that
> there was no way to accidentally leave it behind.
>
> On linux there is O_TMPFILE, which allows us to create a file with no
> name in the file system. A name can be given with linkat. Unfortunately
> we can't use
>
> linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH)
>
> Without special permissions and have instead to depend on proc:
>
> linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination",
>        AT_SYMLINK_FOLLOW)
>

I often execute lld in a chroot environment in which /proc is not mounted,
and I expect lld would work just fine in that environment. So the presence
of /proc should be considered optional even on Linux.

Another annoyance is that linkat will not replace the destination and
> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to
> use unlink+linkat and loop.
>

I think you can avoid unlink+linkat. You can instead give a temporary file
name to an unnamed file and then rename the temporary file the destination
file. There's a small chance between linkat and rename, but that race
condition is probably better than unlink+linkat.

On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no
> way to cancel it. If a file is created with it I can rename it, but it
> is still deleted in the end.
>
> I couldn't find any support for this on FreeBSD.
>
> This suggest that we cannot just have a createUniqueEntity that returs
> just an FD. We will need a class that stores a temporary name too for
> the systems where we cannot give a filename to a FD.
>
> I have attached the patch I got so far, but given the existing
> restrictions I would say it is not worth it.
>
> It will try to just make it more obvious that lld's FileOutputBuffer is
> deleted on all code paths.
>
>
> diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/
> FileSystem.h
> index 03015a0ca3b..5d80db1d5a2 100644
> --- a/include/llvm/Support/FileSystem.h
> +++ b/include/llvm/Support/FileSystem.h
> @@ -31,6 +31,7 @@
>  #include "llvm/ADT/StringRef.h"
>  #include "llvm/ADT/Twine.h"
>  #include "llvm/Support/Chrono.h"
> +#include "llvm/Support/Error.h"
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Support/ErrorOr.h"
>  #include "llvm/Support/MD5.h"
> @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model,
> int &ResultFD,
>  std::error_code createUniqueFile(const Twine &Model,
>                                   SmallVectorImpl<char> &ResultPath);
>
> +// Represents a temporary file and hides the name. This allows not even
> having
> +// no name when O_TMPFILE is supported.
> +class TempFile {
> +  // Temporary name used when O_TMPFILE is not avilable.
> +  std::string TmpName;
> +
> +public:
> +  TempFile(StringRef Name, int FD);
> +  TempFile(TempFile &&Other);
> +
> +  // The open file descriptor.
> +  int FD;
> +
> +  Error keep(const Twine &Name);
> +  ~TempFile();
> +};
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory,
> +                                    unsigned Mode = all_read | all_write);
> +
>  /// @brief Create a file in the system temporary directory.
>  ///
>  /// The filename is of the form prefix-random_chars.suffix. Since the
> directory
> diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/
> FileOutputBuffer.cpp
> index db8ff38e5b5..af61af3c32e 100644
> --- a/lib/Support/FileOutputBuffer.cpp
> +++ b/lib/Support/FileOutputBuffer.cpp
> @@ -34,9 +34,9 @@ using namespace llvm::sys;
>  // with the temporary file on commit().
>  class OnDiskBuffer : public FileOutputBuffer {
>  public:
> -  OnDiskBuffer(StringRef Path, StringRef TempPath,
> +  OnDiskBuffer(StringRef Path, fs::TempFile Temp,
>                 std::unique_ptr<fs::mapped_file_region> Buf)
> -      : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> TempPath(TempPath) {}
> +      : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> Temp(std::move(Temp)) {}
>
>    uint8_t *getBufferStart() const override { return (uint8_t
> *)Buffer->data(); }
>
> @@ -51,21 +51,18 @@ public:
>      Buffer.reset();
>
>      // Atomically replace the existing file with the new one.
> -    auto EC = fs::rename(TempPath, FinalPath);
> -    sys::DontRemoveFileOnSignal(TempPath);
> -    return errorCodeToError(EC);
> +    return Temp.keep(FinalPath);
>    }
>
>    ~OnDiskBuffer() override {
>      // Close the mapping before deleting the temp file, so that the
> removal
>      // succeeds.
>      Buffer.reset();
> -    fs::remove(TempPath);
>    }
>
>  private:
>    std::unique_ptr<fs::mapped_file_region> Buffer;
> -  std::string TempPath;
> +  fs::TempFile Temp;
>  };
>
>  // A FileOutputBuffer which keeps data in memory and writes to the final
> @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>
>  static Expected<std::unique_ptr<OnDiskBuffer>>
>  createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) {
> -  // Create new file in same directory but with random name.
> -  SmallString<128> TempPath;
> -  int FD;
> -  if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath,
> Mode))
> -    return errorCodeToError(EC);
> -
> -  sys::RemoveFileOnSignal(TempPath);
> +  StringRef Directory = path::parent_path(Path);
> +  if (Directory.empty())
> +    Directory = ".";
> +  Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory,
> Mode);
> +  if (!FileOrErr)
> +    return FileOrErr.takeError();
> +  fs::TempFile File = std::move(*FileOrErr);
>
>  #ifndef LLVM_ON_WIN32
>    // On Windows, CreateFileMapping (the mmap function on Windows)
> @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>    // extend the file beforehand. _chsize (ftruncate on Windows) is
>    // pretty slow just like it writes specified amount of bytes,
>    // so we should avoid calling that function.
> -  if (auto EC = fs::resize_file(FD, Size))
> +  if (auto EC = fs::resize_file(File.FD, Size))
>      return errorCodeToError(EC);
>  #endif
>
>    // Mmap it.
>    std::error_code EC;
>    auto MappedFile = llvm::make_unique<fs::mapped_file_region>(
> -      FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> -  close(FD);
> +      File.FD, fs::mapped_file_region::readwrite, Size, 0, EC);
>    if (EC)
>      return errorCodeToError(EC);
> -  return llvm::make_unique<OnDiskBuffer>(Path, TempPath,
> std::move(MappedFile));
> +  return llvm::make_unique<OnDiskBuffer>(Path, std::move(File),
> +                                         std::move(MappedFile));
>  }
>
>  // Create an instance of FileOutputBuffer.
> diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp
> index 9692acb5283..d3a356cecde 100644
> --- a/lib/Support/Path.cpp
> +++ b/lib/Support/Path.cpp
> @@ -18,8 +18,12 @@
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Support/FileSystem.h"
>  #include "llvm/Support/Process.h"
> +#include "llvm/Support/Signals.h"
>  #include <cctype>
>  #include <cstring>
> +#include <fcntl.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
>
>  #if !defined(_MSC_VER) && !defined(__MINGW32__)
>  #include <unistd.h>
> @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model,
>    return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name);
>  }
>
> +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {}
> +TempFile::TempFile(TempFile &&Other) {
> +  TmpName = std::move(Other.TmpName);
> +  FD = Other.FD;
> +  Other.FD = -1;
> +}
> +
> +TempFile::~TempFile() {
> +  // FIXME: error handling. Move this to a discard method?
> +  if (FD != -1)
> +    close(FD);
> +  if (!TmpName.empty())
> +    fs::remove(TmpName);
> +  sys::DontRemoveFileOnSignal(TmpName);
> +}
> +
> +Error TempFile::keep(const Twine &Name) {
> +  if (TmpName.empty()) {
> +    // We can't do it atomically.
> +    // FIXME: should we loop?
> +    if (std::error_code EC = fs::remove(Name))
> +      return errorCodeToError(EC);
> +    SmallString<128> Storage;
> +    StringRef NameP = Name.toNullTerminatedStringRef(Storage);
> +    std::string ProcFD = "/proc/self/fd/" + std::to_string(FD);
> +    if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(),
> +               AT_SYMLINK_FOLLOW) == -1) {
> +      std::error_code EC(errno, std::generic_category());
> +      return errorCodeToError(EC);
> +    }
> +  } else {
> +    if (std::error_code EC = fs::rename(TmpName, Name))
> +      return errorCodeToError(EC);
> +    sys::DontRemoveFileOnSignal(TmpName);
> +    TmpName = "";
> +  }
> +
> +  if (close(FD) == -1) {
> +    std::error_code EC(errno, std::generic_category());
> +    return errorCodeToError(EC);
> +  }
> +  FD = -1;
> +
> +  return Error::success();
> +}
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned
> Mode) {
> +#if 1
> +  SmallString<128> Storage;
> +  StringRef P = Directory.toNullTerminatedStringRef(Storage);
> +  int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode);
> +  if (FD == -1) {
> +    std::error_code EC(errno, std::generic_category());
> +    return errorCodeToError(EC);
> +  }
> +  return TempFile("", FD);
> +#else
> +  int FD;
> +  SmallString<128> ResultPath;
> +  if (std::error_code EC =
> +          createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath,
> Mode))
> +    return errorCodeToError(EC);
> +  sys::RemoveFileOnSignal(ResultPath);
> +  return TempFile(ResultPath, FD);
> +#endif
> +}
> +
>  static std::error_code
>  createTemporaryFile(const Twine &Model, int &ResultFD,
>                      llvm::SmallVectorImpl<char> &ResultPath, FSEntity
> Type) {
> diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc
> index 2ecb97316c8..3099b75a1b8 100644
> --- a/lib/Support/Unix/Path.inc
> +++ b/lib/Support/Unix/Path.inc
> @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name,
> int &ResultFD,
>
>    int OpenFlags = O_CREAT;
>
> +  // add O_TMPFILE?
> +
>  #ifdef O_CLOEXEC
>    OpenFlags |= O_CLOEXEC;
>  #endif
>
>
> Cheers,
> Rafael
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171113/3540f2c1/attachment.html>


More information about the llvm-dev mailing list