[PATCH] D37099: Added optional validation of svn sources to Dockerfiles.
Manuel Klimek via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 31 02:15:46 PDT 2017
klimek added inline comments.
================
Comment at: utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py:27-28
+ is_ignored,
+ content_hasher,
+ hash_algo=hashlib.sha256):
+ """ Computes checksums for all files and symlinks under path.
----------------
ilya-biryukov wrote:
> klimek wrote:
> > Why's the algo not part of the content_hasher?
> > Can't we just use a hasher that has the exact interface we need?
> - `content_hasher` is responsible for reading files and replacing the contents of the file, while `hash_algo` is responsible for providing hash functions.
> - we don't call `content_hasher` for broken symlinks and use `hash_algo` on the symlink target instead. It does not make sense to do any replacements in there, as it's not a file.
> - it's nice that this function controls the lifetime of `hash_algo()` objects. Otherwise, there's a chance the client will create hasher only once and use that for all subsequent calls.
>
> It also seems nice to have the checksumming algorithm name as part of the function interface.
I think my main concern is that we're mixing levels of abstraction; generally, we have a strategy how to do the hashing, and we have an algorithm that does the visitation (giving project structure, whitelist, etc).
I'd probably just do a visit_project function that takes a class with perhaps 2 functions:
visit_file(path, content)
visit_symlink(path, link)
Then we'd have a Checksum class that implements these 2 functions, has a hasher member and does the hashing in the visitation, cutting at the responsibilities.
That way, we also could split this up into multiple files more easily:
project_tree.py (perhaps with a better name :) allows to visit all files in the project.
llvm_checksum.py - can now be the main method and the checksum'ing parts of the algo
================
Comment at: utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py:131-133
+ elif kind == FileKind.VALID_SYMLINK:
+ hasher.update("@symlink")
+ else:
----------------
ilya-biryukov wrote:
> klimek wrote:
> > Why not just feed the content (file content or symlink target)?
> To distinguish between a broken symlink and a file with content equivalent to the broken symlink's target.
Why do we care?
https://reviews.llvm.org/D37099
More information about the llvm-commits
mailing list