[llvm-dev] Experiment on how to improve our temporary file handing.
Rui Ueyama via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 13 01:06:00 PST 2017
On Fri, Nov 10, 2017 at 6:13 AM, Rafael Avila de Espindola <
rafael.espindola at gmail.com> wrote:
> Currently a power failure or other hard crash can cause lld leave a
> temporary
> file around. The same is true for other llvm tools.
>
> As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and
> restart the run a few times. You will get
>
> t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff
>
> The same would happen if there was a fatal error between the
> FileOutputBuffer creation and commit. I don't think that is a code path
> where that is possible right now, but it would be an easy thing to miss
> in a code review.
>
> I was hopping the OS could help us manage the temporary file so that
> there was no way to accidentally leave it behind.
>
> On linux there is O_TMPFILE, which allows us to create a file with no
> name in the file system. A name can be given with linkat. Unfortunately
> we can't use
>
> linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH)
>
> Without special permissions and have instead to depend on proc:
>
> linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination",
> AT_SYMLINK_FOLLOW)
>
I often execute lld in a chroot environment in which /proc is not mounted,
and I expect lld would work just fine in that environment. So the presence
of /proc should be considered optional even on Linux.
Another annoyance is that linkat will not replace the destination and
> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to
> use unlink+linkat and loop.
>
I think you can avoid unlink+linkat. You can instead give a temporary file
name to an unnamed file and then rename the temporary file the destination
file. There's a small chance between linkat and rename, but that race
condition is probably better than unlink+linkat.
On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no
> way to cancel it. If a file is created with it I can rename it, but it
> is still deleted in the end.
>
> I couldn't find any support for this on FreeBSD.
>
> This suggest that we cannot just have a createUniqueEntity that returs
> just an FD. We will need a class that stores a temporary name too for
> the systems where we cannot give a filename to a FD.
>
> I have attached the patch I got so far, but given the existing
> restrictions I would say it is not worth it.
>
> It will try to just make it more obvious that lld's FileOutputBuffer is
> deleted on all code paths.
>
>
> diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/
> FileSystem.h
> index 03015a0ca3b..5d80db1d5a2 100644
> --- a/include/llvm/Support/FileSystem.h
> +++ b/include/llvm/Support/FileSystem.h
> @@ -31,6 +31,7 @@
> #include "llvm/ADT/StringRef.h"
> #include "llvm/ADT/Twine.h"
> #include "llvm/Support/Chrono.h"
> +#include "llvm/Support/Error.h"
> #include "llvm/Support/ErrorHandling.h"
> #include "llvm/Support/ErrorOr.h"
> #include "llvm/Support/MD5.h"
> @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model,
> int &ResultFD,
> std::error_code createUniqueFile(const Twine &Model,
> SmallVectorImpl<char> &ResultPath);
>
> +// Represents a temporary file and hides the name. This allows not even
> having
> +// no name when O_TMPFILE is supported.
> +class TempFile {
> + // Temporary name used when O_TMPFILE is not avilable.
> + std::string TmpName;
> +
> +public:
> + TempFile(StringRef Name, int FD);
> + TempFile(TempFile &&Other);
> +
> + // The open file descriptor.
> + int FD;
> +
> + Error keep(const Twine &Name);
> + ~TempFile();
> +};
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory,
> + unsigned Mode = all_read | all_write);
> +
> /// @brief Create a file in the system temporary directory.
> ///
> /// The filename is of the form prefix-random_chars.suffix. Since the
> directory
> diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/
> FileOutputBuffer.cpp
> index db8ff38e5b5..af61af3c32e 100644
> --- a/lib/Support/FileOutputBuffer.cpp
> +++ b/lib/Support/FileOutputBuffer.cpp
> @@ -34,9 +34,9 @@ using namespace llvm::sys;
> // with the temporary file on commit().
> class OnDiskBuffer : public FileOutputBuffer {
> public:
> - OnDiskBuffer(StringRef Path, StringRef TempPath,
> + OnDiskBuffer(StringRef Path, fs::TempFile Temp,
> std::unique_ptr<fs::mapped_file_region> Buf)
> - : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> TempPath(TempPath) {}
> + : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> Temp(std::move(Temp)) {}
>
> uint8_t *getBufferStart() const override { return (uint8_t
> *)Buffer->data(); }
>
> @@ -51,21 +51,18 @@ public:
> Buffer.reset();
>
> // Atomically replace the existing file with the new one.
> - auto EC = fs::rename(TempPath, FinalPath);
> - sys::DontRemoveFileOnSignal(TempPath);
> - return errorCodeToError(EC);
> + return Temp.keep(FinalPath);
> }
>
> ~OnDiskBuffer() override {
> // Close the mapping before deleting the temp file, so that the
> removal
> // succeeds.
> Buffer.reset();
> - fs::remove(TempPath);
> }
>
> private:
> std::unique_ptr<fs::mapped_file_region> Buffer;
> - std::string TempPath;
> + fs::TempFile Temp;
> };
>
> // A FileOutputBuffer which keeps data in memory and writes to the final
> @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>
> static Expected<std::unique_ptr<OnDiskBuffer>>
> createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) {
> - // Create new file in same directory but with random name.
> - SmallString<128> TempPath;
> - int FD;
> - if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath,
> Mode))
> - return errorCodeToError(EC);
> -
> - sys::RemoveFileOnSignal(TempPath);
> + StringRef Directory = path::parent_path(Path);
> + if (Directory.empty())
> + Directory = ".";
> + Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory,
> Mode);
> + if (!FileOrErr)
> + return FileOrErr.takeError();
> + fs::TempFile File = std::move(*FileOrErr);
>
> #ifndef LLVM_ON_WIN32
> // On Windows, CreateFileMapping (the mmap function on Windows)
> @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
> // extend the file beforehand. _chsize (ftruncate on Windows) is
> // pretty slow just like it writes specified amount of bytes,
> // so we should avoid calling that function.
> - if (auto EC = fs::resize_file(FD, Size))
> + if (auto EC = fs::resize_file(File.FD, Size))
> return errorCodeToError(EC);
> #endif
>
> // Mmap it.
> std::error_code EC;
> auto MappedFile = llvm::make_unique<fs::mapped_file_region>(
> - FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> - close(FD);
> + File.FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> if (EC)
> return errorCodeToError(EC);
> - return llvm::make_unique<OnDiskBuffer>(Path, TempPath,
> std::move(MappedFile));
> + return llvm::make_unique<OnDiskBuffer>(Path, std::move(File),
> + std::move(MappedFile));
> }
>
> // Create an instance of FileOutputBuffer.
> diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp
> index 9692acb5283..d3a356cecde 100644
> --- a/lib/Support/Path.cpp
> +++ b/lib/Support/Path.cpp
> @@ -18,8 +18,12 @@
> #include "llvm/Support/ErrorHandling.h"
> #include "llvm/Support/FileSystem.h"
> #include "llvm/Support/Process.h"
> +#include "llvm/Support/Signals.h"
> #include <cctype>
> #include <cstring>
> +#include <fcntl.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
>
> #if !defined(_MSC_VER) && !defined(__MINGW32__)
> #include <unistd.h>
> @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model,
> return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name);
> }
>
> +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {}
> +TempFile::TempFile(TempFile &&Other) {
> + TmpName = std::move(Other.TmpName);
> + FD = Other.FD;
> + Other.FD = -1;
> +}
> +
> +TempFile::~TempFile() {
> + // FIXME: error handling. Move this to a discard method?
> + if (FD != -1)
> + close(FD);
> + if (!TmpName.empty())
> + fs::remove(TmpName);
> + sys::DontRemoveFileOnSignal(TmpName);
> +}
> +
> +Error TempFile::keep(const Twine &Name) {
> + if (TmpName.empty()) {
> + // We can't do it atomically.
> + // FIXME: should we loop?
> + if (std::error_code EC = fs::remove(Name))
> + return errorCodeToError(EC);
> + SmallString<128> Storage;
> + StringRef NameP = Name.toNullTerminatedStringRef(Storage);
> + std::string ProcFD = "/proc/self/fd/" + std::to_string(FD);
> + if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(),
> + AT_SYMLINK_FOLLOW) == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + } else {
> + if (std::error_code EC = fs::rename(TmpName, Name))
> + return errorCodeToError(EC);
> + sys::DontRemoveFileOnSignal(TmpName);
> + TmpName = "";
> + }
> +
> + if (close(FD) == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + FD = -1;
> +
> + return Error::success();
> +}
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned
> Mode) {
> +#if 1
> + SmallString<128> Storage;
> + StringRef P = Directory.toNullTerminatedStringRef(Storage);
> + int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode);
> + if (FD == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + return TempFile("", FD);
> +#else
> + int FD;
> + SmallString<128> ResultPath;
> + if (std::error_code EC =
> + createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath,
> Mode))
> + return errorCodeToError(EC);
> + sys::RemoveFileOnSignal(ResultPath);
> + return TempFile(ResultPath, FD);
> +#endif
> +}
> +
> static std::error_code
> createTemporaryFile(const Twine &Model, int &ResultFD,
> llvm::SmallVectorImpl<char> &ResultPath, FSEntity
> Type) {
> diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc
> index 2ecb97316c8..3099b75a1b8 100644
> --- a/lib/Support/Unix/Path.inc
> +++ b/lib/Support/Unix/Path.inc
> @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name,
> int &ResultFD,
>
> int OpenFlags = O_CREAT;
>
> + // add O_TMPFILE?
> +
> #ifdef O_CLOEXEC
> OpenFlags |= O_CLOEXEC;
> #endif
>
>
> Cheers,
> Rafael
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171113/3540f2c1/attachment.html>
More information about the llvm-dev
mailing list