[llvm-dev] Experiment on how to improve our temporary file handing.

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 9 13:31:30 PST 2017


I agree, having a nameless FD would be the most elegant way to solve this
problem on Unix, but when I investigated it, I came to the same conclusion
that you did: it can't be done portably. Nameless files are just too weird,
so I think we're stuck with RemoveFileOnSignal. =/

For Windows, I was able to use SetFileInformationByHandle to get the OS to
clean up files for us on exit. I attached my proof-of-concept program that
does this.

On Thu, Nov 9, 2017 at 1:13 PM, Rafael Avila de Espindola via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Currently a power failure or other hard crash can cause lld leave a
> temporary
> file around. The same is true for other llvm tools.
>
> As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and
> restart the run a few times. You will get
>
> t.tmp43a735a  t.tmp4deeabb  t.tmp9bacdd3  t.tmpe4115c4  t.tmpeb01fff
>
> The same would happen if there was a fatal error between the
> FileOutputBuffer creation and commit. I don't think that is a code path
> where that is possible right now, but it would be an easy thing to miss
> in a code review.
>
> I was hopping the OS could help us manage the temporary file so that
> there was no way to accidentally leave it behind.
>
> On linux there is O_TMPFILE, which allows us to create a file with no
> name in the file system. A name can be given with linkat. Unfortunately
> we can't use
>
> linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH)
>
> Without special permissions and have instead to depend on proc:
>
> linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination",
>        AT_SYMLINK_FOLLOW)
>
> Another annoyance is that linkat will not replace the destination and
> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to
> use unlink+linkat and loop.
>
> On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no
> way to cancel it. If a file is created with it I can rename it, but it
> is still deleted in the end.
>
> I couldn't find any support for this on FreeBSD.
>
> This suggest that we cannot just have a createUniqueEntity that returs
> just an FD. We will need a class that stores a temporary name too for
> the systems where we cannot give a filename to a FD.
>
> I have attached the patch I got so far, but given the existing
> restrictions I would say it is not worth it.
>
> It will try to just make it more obvious that lld's FileOutputBuffer is
> deleted on all code paths.
>
>
> diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/
> FileSystem.h
> index 03015a0ca3b..5d80db1d5a2 100644
> --- a/include/llvm/Support/FileSystem.h
> +++ b/include/llvm/Support/FileSystem.h
> @@ -31,6 +31,7 @@
>  #include "llvm/ADT/StringRef.h"
>  #include "llvm/ADT/Twine.h"
>  #include "llvm/Support/Chrono.h"
> +#include "llvm/Support/Error.h"
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Support/ErrorOr.h"
>  #include "llvm/Support/MD5.h"
> @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model,
> int &ResultFD,
>  std::error_code createUniqueFile(const Twine &Model,
>                                   SmallVectorImpl<char> &ResultPath);
>
> +// Represents a temporary file and hides the name. This allows not even
> having
> +// no name when O_TMPFILE is supported.
> +class TempFile {
> +  // Temporary name used when O_TMPFILE is not avilable.
> +  std::string TmpName;
> +
> +public:
> +  TempFile(StringRef Name, int FD);
> +  TempFile(TempFile &&Other);
> +
> +  // The open file descriptor.
> +  int FD;
> +
> +  Error keep(const Twine &Name);
> +  ~TempFile();
> +};
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory,
> +                                    unsigned Mode = all_read | all_write);
> +
>  /// @brief Create a file in the system temporary directory.
>  ///
>  /// The filename is of the form prefix-random_chars.suffix. Since the
> directory
> diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/
> FileOutputBuffer.cpp
> index db8ff38e5b5..af61af3c32e 100644
> --- a/lib/Support/FileOutputBuffer.cpp
> +++ b/lib/Support/FileOutputBuffer.cpp
> @@ -34,9 +34,9 @@ using namespace llvm::sys;
>  // with the temporary file on commit().
>  class OnDiskBuffer : public FileOutputBuffer {
>  public:
> -  OnDiskBuffer(StringRef Path, StringRef TempPath,
> +  OnDiskBuffer(StringRef Path, fs::TempFile Temp,
>                 std::unique_ptr<fs::mapped_file_region> Buf)
> -      : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> TempPath(TempPath) {}
> +      : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> Temp(std::move(Temp)) {}
>
>    uint8_t *getBufferStart() const override { return (uint8_t
> *)Buffer->data(); }
>
> @@ -51,21 +51,18 @@ public:
>      Buffer.reset();
>
>      // Atomically replace the existing file with the new one.
> -    auto EC = fs::rename(TempPath, FinalPath);
> -    sys::DontRemoveFileOnSignal(TempPath);
> -    return errorCodeToError(EC);
> +    return Temp.keep(FinalPath);
>    }
>
>    ~OnDiskBuffer() override {
>      // Close the mapping before deleting the temp file, so that the
> removal
>      // succeeds.
>      Buffer.reset();
> -    fs::remove(TempPath);
>    }
>
>  private:
>    std::unique_ptr<fs::mapped_file_region> Buffer;
> -  std::string TempPath;
> +  fs::TempFile Temp;
>  };
>
>  // A FileOutputBuffer which keeps data in memory and writes to the final
> @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>
>  static Expected<std::unique_ptr<OnDiskBuffer>>
>  createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) {
> -  // Create new file in same directory but with random name.
> -  SmallString<128> TempPath;
> -  int FD;
> -  if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath,
> Mode))
> -    return errorCodeToError(EC);
> -
> -  sys::RemoveFileOnSignal(TempPath);
> +  StringRef Directory = path::parent_path(Path);
> +  if (Directory.empty())
> +    Directory = ".";
> +  Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory,
> Mode);
> +  if (!FileOrErr)
> +    return FileOrErr.takeError();
> +  fs::TempFile File = std::move(*FileOrErr);
>
>  #ifndef LLVM_ON_WIN32
>    // On Windows, CreateFileMapping (the mmap function on Windows)
> @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>    // extend the file beforehand. _chsize (ftruncate on Windows) is
>    // pretty slow just like it writes specified amount of bytes,
>    // so we should avoid calling that function.
> -  if (auto EC = fs::resize_file(FD, Size))
> +  if (auto EC = fs::resize_file(File.FD, Size))
>      return errorCodeToError(EC);
>  #endif
>
>    // Mmap it.
>    std::error_code EC;
>    auto MappedFile = llvm::make_unique<fs::mapped_file_region>(
> -      FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> -  close(FD);
> +      File.FD, fs::mapped_file_region::readwrite, Size, 0, EC);
>    if (EC)
>      return errorCodeToError(EC);
> -  return llvm::make_unique<OnDiskBuffer>(Path, TempPath,
> std::move(MappedFile));
> +  return llvm::make_unique<OnDiskBuffer>(Path, std::move(File),
> +                                         std::move(MappedFile));
>  }
>
>  // Create an instance of FileOutputBuffer.
> diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp
> index 9692acb5283..d3a356cecde 100644
> --- a/lib/Support/Path.cpp
> +++ b/lib/Support/Path.cpp
> @@ -18,8 +18,12 @@
>  #include "llvm/Support/ErrorHandling.h"
>  #include "llvm/Support/FileSystem.h"
>  #include "llvm/Support/Process.h"
> +#include "llvm/Support/Signals.h"
>  #include <cctype>
>  #include <cstring>
> +#include <fcntl.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
>
>  #if !defined(_MSC_VER) && !defined(__MINGW32__)
>  #include <unistd.h>
> @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model,
>    return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name);
>  }
>
> +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {}
> +TempFile::TempFile(TempFile &&Other) {
> +  TmpName = std::move(Other.TmpName);
> +  FD = Other.FD;
> +  Other.FD = -1;
> +}
> +
> +TempFile::~TempFile() {
> +  // FIXME: error handling. Move this to a discard method?
> +  if (FD != -1)
> +    close(FD);
> +  if (!TmpName.empty())
> +    fs::remove(TmpName);
> +  sys::DontRemoveFileOnSignal(TmpName);
> +}
> +
> +Error TempFile::keep(const Twine &Name) {
> +  if (TmpName.empty()) {
> +    // We can't do it atomically.
> +    // FIXME: should we loop?
> +    if (std::error_code EC = fs::remove(Name))
> +      return errorCodeToError(EC);
> +    SmallString<128> Storage;
> +    StringRef NameP = Name.toNullTerminatedStringRef(Storage);
> +    std::string ProcFD = "/proc/self/fd/" + std::to_string(FD);
> +    if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(),
> +               AT_SYMLINK_FOLLOW) == -1) {
> +      std::error_code EC(errno, std::generic_category());
> +      return errorCodeToError(EC);
> +    }
> +  } else {
> +    if (std::error_code EC = fs::rename(TmpName, Name))
> +      return errorCodeToError(EC);
> +    sys::DontRemoveFileOnSignal(TmpName);
> +    TmpName = "";
> +  }
> +
> +  if (close(FD) == -1) {
> +    std::error_code EC(errno, std::generic_category());
> +    return errorCodeToError(EC);
> +  }
> +  FD = -1;
> +
> +  return Error::success();
> +}
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned
> Mode) {
> +#if 1
> +  SmallString<128> Storage;
> +  StringRef P = Directory.toNullTerminatedStringRef(Storage);
> +  int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode);
> +  if (FD == -1) {
> +    std::error_code EC(errno, std::generic_category());
> +    return errorCodeToError(EC);
> +  }
> +  return TempFile("", FD);
> +#else
> +  int FD;
> +  SmallString<128> ResultPath;
> +  if (std::error_code EC =
> +          createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath,
> Mode))
> +    return errorCodeToError(EC);
> +  sys::RemoveFileOnSignal(ResultPath);
> +  return TempFile(ResultPath, FD);
> +#endif
> +}
> +
>  static std::error_code
>  createTemporaryFile(const Twine &Model, int &ResultFD,
>                      llvm::SmallVectorImpl<char> &ResultPath, FSEntity
> Type) {
> diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc
> index 2ecb97316c8..3099b75a1b8 100644
> --- a/lib/Support/Unix/Path.inc
> +++ b/lib/Support/Unix/Path.inc
> @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name,
> int &ResultFD,
>
>    int OpenFlags = O_CREAT;
>
> +  // add O_TMPFILE?
> +
>  #ifdef O_CLOEXEC
>    OpenFlags |= O_CLOEXEC;
>  #endif
>
>
> Cheers,
> Rafael
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171109/48f36447/attachment.html>
-------------- next part --------------
#include <windows.h>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Create a string with last error message
static std::string GetLastErrorStdStr(DWORD error) {
  if (error == 0)
    return "";
  LPVOID lpMsgBuf;
  DWORD bufLen = FormatMessage(
      FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM |
          FORMAT_MESSAGE_IGNORE_INSERTS,
      NULL, error, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR)&lpMsgBuf,
      0, NULL);
  if (bufLen == 0)
    return "";
  LPCSTR lpMsgStr = (LPCSTR)lpMsgBuf;
  std::string result(lpMsgStr, lpMsgStr + bufLen);
  LocalFree(lpMsgBuf);
  return result;
}

static void error(const char *func) {
  DWORD error = GetLastError();
  std::string err_str = GetLastErrorStdStr(error);
  fprintf(stderr, "call to %s failed: %lu %s\n", func, error, err_str.c_str());
  fflush(stderr);
  exit(1);
}

int main(int argc, char **argv) {
  printf("starting\n");
  fflush(stdout);

  HANDLE h = CreateFileW(                    //
      L"outfile.tmp.txt",                    // lpFileName
      GENERIC_READ | GENERIC_WRITE | DELETE, // dwDesiredAccess
      0,                                     // dwShareMode
      NULL,                                  // lpSecurityAttributes
      CREATE_ALWAYS,                         // dwCreationDisposition
      FILE_ATTRIBUTE_NORMAL,                 // dwFlagsAndAttributes
      NULL                                   // hTemplateFile
  );

  if (h == INVALID_HANDLE_VALUE)
    error("CreateFileW");

  FILE_DISPOSITION_INFO disposition;
  disposition.DeleteFile = TRUE;
  if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition,
                                  sizeof(disposition)))
    error("SetFileInformationByHandle");

  static const char buf[] = "output data\n";
  DWORD written;
  if (!WriteFile(h, buf, sizeof(buf), &written, NULL))
    error("WriteFile");
  if (written != sizeof(buf)) {
    fprintf(stderr, "short write?\n");
    fflush(stderr);
    exit(1);
  }

  if (argc > 1) {
    if (strcmp("abort", argv[1]) == 0) {
      fprintf(stderr, "aborting, file should be deleted\n");
      fflush(stderr);
      abort();
    } else if (strcmp("fastfail", argv[1]) == 0) {
      fprintf(stderr, "__fastfail, file should be deleted\n");
      fflush(stderr);
      __fastfail(FAST_FAIL_FATAL_APP_EXIT);
    } else if (strcmp("derefnull", argv[1]) == 0) {
      fprintf(stderr, "null deref, file should be deleted\n");
      fflush(stderr);
      *(volatile int *)0 = 42;
    }
  }

  // OK, we want to keep this file.
  disposition.DeleteFile = FALSE;
  if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition,
                                  sizeof(disposition)))
    error("SetFileInformationByHandle");

  char rename_info_buf[sizeof(FILE_RENAME_INFO) + 128 * sizeof(wchar_t)];
  FILE_RENAME_INFO *rename_info =
      reinterpret_cast<FILE_RENAME_INFO *>(rename_info_buf);

  const wchar_t *outfile = L"C:\\src\\llvm-project\\build\\outfile.txt";
  rename_info->ReplaceIfExists = TRUE;
  rename_info->RootDirectory = NULL;
  rename_info->FileNameLength = wcslen(outfile) * sizeof(wchar_t);
  wcsncpy_s(&rename_info->FileName[0], 128, outfile, wcslen(outfile));

  printf("before rename\n");
  fflush(stdout);

  if (!SetFileInformationByHandle(h, FileRenameInfo, rename_info,
                                  sizeof(rename_info_buf)))
    error("SetFileInformationByHandle");

  printf("after rename\n");
  fflush(stdout);

  CloseHandle(h);
  printf("post close\n");
  fflush(stdout);
}


More information about the llvm-dev mailing list