[llvm-dev] Experiment on how to improve our temporary file handing.
Reid Kleckner via llvm-dev
llvm-dev at lists.llvm.org
Thu Nov 9 13:31:30 PST 2017
I agree, having a nameless FD would be the most elegant way to solve this
problem on Unix, but when I investigated it, I came to the same conclusion
that you did: it can't be done portably. Nameless files are just too weird,
so I think we're stuck with RemoveFileOnSignal. =/
For Windows, I was able to use SetFileInformationByHandle to get the OS to
clean up files for us on exit. I attached my proof-of-concept program that
does this.
On Thu, Nov 9, 2017 at 1:13 PM, Rafael Avila de Espindola via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Currently a power failure or other hard crash can cause lld leave a
> temporary
> file around. The same is true for other llvm tools.
>
> As an example, put a breakpoint in Writer.cpp:236 ("writeBuildId()") and
> restart the run a few times. You will get
>
> t.tmp43a735a t.tmp4deeabb t.tmp9bacdd3 t.tmpe4115c4 t.tmpeb01fff
>
> The same would happen if there was a fatal error between the
> FileOutputBuffer creation and commit. I don't think that is a code path
> where that is possible right now, but it would be an easy thing to miss
> in a code review.
>
> I was hopping the OS could help us manage the temporary file so that
> there was no way to accidentally leave it behind.
>
> On linux there is O_TMPFILE, which allows us to create a file with no
> name in the file system. A name can be given with linkat. Unfortunately
> we can't use
>
> linkat(fd, "", AT_FDCWD, "destination", AT_EMPTY_PATH)
>
> Without special permissions and have instead to depend on proc:
>
> linkat(AT_FDCWD, "/proc/self/fd/<num>", AT_FDCWD, "destination",
> AT_SYMLINK_FOLLOW)
>
> Another annoyance is that linkat will not replace the destination and
> renameat2 doesn't support AT_EMPTY_PATH. The result is that we have to
> use unlink+linkat and loop.
>
> On windows there is FILE_FLAG_DELETE_ON_CLOSE, but there seems to be no
> way to cancel it. If a file is created with it I can rename it, but it
> is still deleted in the end.
>
> I couldn't find any support for this on FreeBSD.
>
> This suggest that we cannot just have a createUniqueEntity that returs
> just an FD. We will need a class that stores a temporary name too for
> the systems where we cannot give a filename to a FD.
>
> I have attached the patch I got so far, but given the existing
> restrictions I would say it is not worth it.
>
> It will try to just make it more obvious that lld's FileOutputBuffer is
> deleted on all code paths.
>
>
> diff --git a/include/llvm/Support/FileSystem.h b/include/llvm/Support/
> FileSystem.h
> index 03015a0ca3b..5d80db1d5a2 100644
> --- a/include/llvm/Support/FileSystem.h
> +++ b/include/llvm/Support/FileSystem.h
> @@ -31,6 +31,7 @@
> #include "llvm/ADT/StringRef.h"
> #include "llvm/ADT/Twine.h"
> #include "llvm/Support/Chrono.h"
> +#include "llvm/Support/Error.h"
> #include "llvm/Support/ErrorHandling.h"
> #include "llvm/Support/ErrorOr.h"
> #include "llvm/Support/MD5.h"
> @@ -694,6 +695,26 @@ std::error_code createUniqueFile(const Twine &Model,
> int &ResultFD,
> std::error_code createUniqueFile(const Twine &Model,
> SmallVectorImpl<char> &ResultPath);
>
> +// Represents a temporary file and hides the name. This allows not even
> having
> +// no name when O_TMPFILE is supported.
> +class TempFile {
> + // Temporary name used when O_TMPFILE is not avilable.
> + std::string TmpName;
> +
> +public:
> + TempFile(StringRef Name, int FD);
> + TempFile(TempFile &&Other);
> +
> + // The open file descriptor.
> + int FD;
> +
> + Error keep(const Twine &Name);
> + ~TempFile();
> +};
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory,
> + unsigned Mode = all_read | all_write);
> +
> /// @brief Create a file in the system temporary directory.
> ///
> /// The filename is of the form prefix-random_chars.suffix. Since the
> directory
> diff --git a/lib/Support/FileOutputBuffer.cpp b/lib/Support/
> FileOutputBuffer.cpp
> index db8ff38e5b5..af61af3c32e 100644
> --- a/lib/Support/FileOutputBuffer.cpp
> +++ b/lib/Support/FileOutputBuffer.cpp
> @@ -34,9 +34,9 @@ using namespace llvm::sys;
> // with the temporary file on commit().
> class OnDiskBuffer : public FileOutputBuffer {
> public:
> - OnDiskBuffer(StringRef Path, StringRef TempPath,
> + OnDiskBuffer(StringRef Path, fs::TempFile Temp,
> std::unique_ptr<fs::mapped_file_region> Buf)
> - : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> TempPath(TempPath) {}
> + : FileOutputBuffer(Path), Buffer(std::move(Buf)),
> Temp(std::move(Temp)) {}
>
> uint8_t *getBufferStart() const override { return (uint8_t
> *)Buffer->data(); }
>
> @@ -51,21 +51,18 @@ public:
> Buffer.reset();
>
> // Atomically replace the existing file with the new one.
> - auto EC = fs::rename(TempPath, FinalPath);
> - sys::DontRemoveFileOnSignal(TempPath);
> - return errorCodeToError(EC);
> + return Temp.keep(FinalPath);
> }
>
> ~OnDiskBuffer() override {
> // Close the mapping before deleting the temp file, so that the
> removal
> // succeeds.
> Buffer.reset();
> - fs::remove(TempPath);
> }
>
> private:
> std::unique_ptr<fs::mapped_file_region> Buffer;
> - std::string TempPath;
> + fs::TempFile Temp;
> };
>
> // A FileOutputBuffer which keeps data in memory and writes to the final
> @@ -110,13 +107,13 @@ createInMemoryBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
>
> static Expected<std::unique_ptr<OnDiskBuffer>>
> createOnDiskBuffer(StringRef Path, size_t Size, unsigned Mode) {
> - // Create new file in same directory but with random name.
> - SmallString<128> TempPath;
> - int FD;
> - if (auto EC = fs::createUniqueFile(Path + ".tmp%%%%%%%", FD, TempPath,
> Mode))
> - return errorCodeToError(EC);
> -
> - sys::RemoveFileOnSignal(TempPath);
> + StringRef Directory = path::parent_path(Path);
> + if (Directory.empty())
> + Directory = ".";
> + Expected<fs::TempFile> FileOrErr = fs::createUniqueFile(Directory,
> Mode);
> + if (!FileOrErr)
> + return FileOrErr.takeError();
> + fs::TempFile File = std::move(*FileOrErr);
>
> #ifndef LLVM_ON_WIN32
> // On Windows, CreateFileMapping (the mmap function on Windows)
> @@ -124,18 +121,18 @@ createOnDiskBuffer(StringRef Path, size_t Size,
> unsigned Mode) {
> // extend the file beforehand. _chsize (ftruncate on Windows) is
> // pretty slow just like it writes specified amount of bytes,
> // so we should avoid calling that function.
> - if (auto EC = fs::resize_file(FD, Size))
> + if (auto EC = fs::resize_file(File.FD, Size))
> return errorCodeToError(EC);
> #endif
>
> // Mmap it.
> std::error_code EC;
> auto MappedFile = llvm::make_unique<fs::mapped_file_region>(
> - FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> - close(FD);
> + File.FD, fs::mapped_file_region::readwrite, Size, 0, EC);
> if (EC)
> return errorCodeToError(EC);
> - return llvm::make_unique<OnDiskBuffer>(Path, TempPath,
> std::move(MappedFile));
> + return llvm::make_unique<OnDiskBuffer>(Path, std::move(File),
> + std::move(MappedFile));
> }
>
> // Create an instance of FileOutputBuffer.
> diff --git a/lib/Support/Path.cpp b/lib/Support/Path.cpp
> index 9692acb5283..d3a356cecde 100644
> --- a/lib/Support/Path.cpp
> +++ b/lib/Support/Path.cpp
> @@ -18,8 +18,12 @@
> #include "llvm/Support/ErrorHandling.h"
> #include "llvm/Support/FileSystem.h"
> #include "llvm/Support/Process.h"
> +#include "llvm/Support/Signals.h"
> #include <cctype>
> #include <cstring>
> +#include <fcntl.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
>
> #if !defined(_MSC_VER) && !defined(__MINGW32__)
> #include <unistd.h>
> @@ -759,6 +763,73 @@ std::error_code createUniqueFile(const Twine &Model,
> return createUniqueEntity(Model, Dummy, ResultPath, false, 0, FS_Name);
> }
>
> +TempFile::TempFile(StringRef Name, int FD) : TmpName(Name), FD(FD) {}
> +TempFile::TempFile(TempFile &&Other) {
> + TmpName = std::move(Other.TmpName);
> + FD = Other.FD;
> + Other.FD = -1;
> +}
> +
> +TempFile::~TempFile() {
> + // FIXME: error handling. Move this to a discard method?
> + if (FD != -1)
> + close(FD);
> + if (!TmpName.empty())
> + fs::remove(TmpName);
> + sys::DontRemoveFileOnSignal(TmpName);
> +}
> +
> +Error TempFile::keep(const Twine &Name) {
> + if (TmpName.empty()) {
> + // We can't do it atomically.
> + // FIXME: should we loop?
> + if (std::error_code EC = fs::remove(Name))
> + return errorCodeToError(EC);
> + SmallString<128> Storage;
> + StringRef NameP = Name.toNullTerminatedStringRef(Storage);
> + std::string ProcFD = "/proc/self/fd/" + std::to_string(FD);
> + if (linkat(AT_FDCWD, ProcFD.c_str(), AT_FDCWD, NameP.begin(),
> + AT_SYMLINK_FOLLOW) == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + } else {
> + if (std::error_code EC = fs::rename(TmpName, Name))
> + return errorCodeToError(EC);
> + sys::DontRemoveFileOnSignal(TmpName);
> + TmpName = "";
> + }
> +
> + if (close(FD) == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + FD = -1;
> +
> + return Error::success();
> +}
> +
> +Expected<TempFile> createUniqueFile(const Twine &Directory, unsigned
> Mode) {
> +#if 1
> + SmallString<128> Storage;
> + StringRef P = Directory.toNullTerminatedStringRef(Storage);
> + int FD = open(P.begin(), O_RDWR | O_TMPFILE | O_CLOEXEC, Mode);
> + if (FD == -1) {
> + std::error_code EC(errno, std::generic_category());
> + return errorCodeToError(EC);
> + }
> + return TempFile("", FD);
> +#else
> + int FD;
> + SmallString<128> ResultPath;
> + if (std::error_code EC =
> + createUniqueFile(Directory + "/tmp%%%%%%%", FD, ResultPath,
> Mode))
> + return errorCodeToError(EC);
> + sys::RemoveFileOnSignal(ResultPath);
> + return TempFile(ResultPath, FD);
> +#endif
> +}
> +
> static std::error_code
> createTemporaryFile(const Twine &Model, int &ResultFD,
> llvm::SmallVectorImpl<char> &ResultPath, FSEntity
> Type) {
> diff --git a/lib/Support/Unix/Path.inc b/lib/Support/Unix/Path.inc
> index 2ecb97316c8..3099b75a1b8 100644
> --- a/lib/Support/Unix/Path.inc
> +++ b/lib/Support/Unix/Path.inc
> @@ -779,6 +779,8 @@ std::error_code openFileForWrite(const Twine &Name,
> int &ResultFD,
>
> int OpenFlags = O_CREAT;
>
> + // add O_TMPFILE?
> +
> #ifdef O_CLOEXEC
> OpenFlags |= O_CLOEXEC;
> #endif
>
>
> Cheers,
> Rafael
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171109/48f36447/attachment.html>
-------------- next part --------------
#include <windows.h>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Create a string with last error message
static std::string GetLastErrorStdStr(DWORD error) {
if (error == 0)
return "";
LPVOID lpMsgBuf;
DWORD bufLen = FormatMessage(
FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM |
FORMAT_MESSAGE_IGNORE_INSERTS,
NULL, error, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR)&lpMsgBuf,
0, NULL);
if (bufLen == 0)
return "";
LPCSTR lpMsgStr = (LPCSTR)lpMsgBuf;
std::string result(lpMsgStr, lpMsgStr + bufLen);
LocalFree(lpMsgBuf);
return result;
}
static void error(const char *func) {
DWORD error = GetLastError();
std::string err_str = GetLastErrorStdStr(error);
fprintf(stderr, "call to %s failed: %lu %s\n", func, error, err_str.c_str());
fflush(stderr);
exit(1);
}
int main(int argc, char **argv) {
printf("starting\n");
fflush(stdout);
HANDLE h = CreateFileW( //
L"outfile.tmp.txt", // lpFileName
GENERIC_READ | GENERIC_WRITE | DELETE, // dwDesiredAccess
0, // dwShareMode
NULL, // lpSecurityAttributes
CREATE_ALWAYS, // dwCreationDisposition
FILE_ATTRIBUTE_NORMAL, // dwFlagsAndAttributes
NULL // hTemplateFile
);
if (h == INVALID_HANDLE_VALUE)
error("CreateFileW");
FILE_DISPOSITION_INFO disposition;
disposition.DeleteFile = TRUE;
if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition,
sizeof(disposition)))
error("SetFileInformationByHandle");
static const char buf[] = "output data\n";
DWORD written;
if (!WriteFile(h, buf, sizeof(buf), &written, NULL))
error("WriteFile");
if (written != sizeof(buf)) {
fprintf(stderr, "short write?\n");
fflush(stderr);
exit(1);
}
if (argc > 1) {
if (strcmp("abort", argv[1]) == 0) {
fprintf(stderr, "aborting, file should be deleted\n");
fflush(stderr);
abort();
} else if (strcmp("fastfail", argv[1]) == 0) {
fprintf(stderr, "__fastfail, file should be deleted\n");
fflush(stderr);
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
} else if (strcmp("derefnull", argv[1]) == 0) {
fprintf(stderr, "null deref, file should be deleted\n");
fflush(stderr);
*(volatile int *)0 = 42;
}
}
// OK, we want to keep this file.
disposition.DeleteFile = FALSE;
if (!SetFileInformationByHandle(h, FileDispositionInfo, &disposition,
sizeof(disposition)))
error("SetFileInformationByHandle");
char rename_info_buf[sizeof(FILE_RENAME_INFO) + 128 * sizeof(wchar_t)];
FILE_RENAME_INFO *rename_info =
reinterpret_cast<FILE_RENAME_INFO *>(rename_info_buf);
const wchar_t *outfile = L"C:\\src\\llvm-project\\build\\outfile.txt";
rename_info->ReplaceIfExists = TRUE;
rename_info->RootDirectory = NULL;
rename_info->FileNameLength = wcslen(outfile) * sizeof(wchar_t);
wcsncpy_s(&rename_info->FileName[0], 128, outfile, wcslen(outfile));
printf("before rename\n");
fflush(stdout);
if (!SetFileInformationByHandle(h, FileRenameInfo, rename_info,
sizeof(rename_info_buf)))
error("SetFileInformationByHandle");
printf("after rename\n");
fflush(stdout);
CloseHandle(h);
printf("post close\n");
fflush(stdout);
}
More information about the llvm-dev
mailing list