[PATCH] D24933: Enable configuration files in clang
Serge Pavlov via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Oct 2 10:10:29 PDT 2017
sepavloff added a comment.
Here is a list of design solutions used in this implementation of config files.
**How config file is specified**
There are two ways to specify a config file:
- Encode it in the executable file name, such as `foo-clang`.
- Pass it in the command line arguments.
There were no objections to the `foo-clang` variant. It can be considered a natural extension of the existing mechanism, in which invoking `foo-clang` is equivalent to specifying the target on the command line: `clang --target=foo`. A config file allows specifying more than one option, and its name is not confined to the registered targets.
As for specifying a config file on the command line, there are two variants:
- Use the existing construct `@foo`.
- Use a special command line option, `--config foo`.
Each way has its own advantages.
The `@file` construct reuses existing command line syntax. Indeed, a config file is a collection of command line arguments, and `@file` is just a way to insert such arguments from a file. A config file may include other files, and it uses `@file` for that, with the exception that `file` is resolved relative to the including file, not to the current directory. A config file could thus be considered an extension of the existing mechanism provided by `@file`.
Using `@file` creates compatibility issues, because existing uses of `@file` must not be broken. Obviously, the file in `@file` may be treated as a configuration only if it cannot be treated according to the existing semantics. A possible solution is to try loading `file` as a configuration if its name does not contain a path separator and it is not found in the current directory.
The drawback of this syntax is that the meaning of `@foo` in the invocation `clang @foo abc.cpp` depends on the contents of the current directory. If it contains a file `foo`, `@foo` is an expansion of a response file; otherwise it specifies a config file. This behavior causes an error if the current directory accidentally contains a file with the same name as the specified config file.
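The ambiguity can be illustrated with a minimal sketch of the proposed `@file` rule. This is not the driver's code; the function name `classify_at_argument` and its return values are hypothetical and only model the decision described above.

```python
import os

def classify_at_argument(name, cwd):
    """Decide how `@name` would be interpreted under the proposed rule.

    Hypothetical helper: returns "response" when the existing response-file
    semantics apply, and "config" when the name would be tried as a config file.
    """
    separators = [os.sep] + ([os.altsep] if os.altsep else [])
    has_separator = any(sep in name for sep in separators)
    exists_in_cwd = os.path.isfile(os.path.join(cwd, name))
    # Existing semantics win whenever they can apply.
    if has_separator or exists_in_cwd:
        return "response"
    return "config"
```

The same argument `foo` changes meaning as soon as a file named `foo` appears in the current directory, which is exactly the drawback noted above.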
Using a dedicated option to apply a config file makes the intention explicit. It also allows using config files from arbitrary places. For instance, the invocation `clang --config ./foo` treats the file `foo` in the current directory as a config file.
Although a config file contains command line arguments, like a conventional response file, it differs from the latter:
- It resolves nested `@file` constructs differently: relative to the including file, not the current directory.
- The file is searched for in a predefined set of directories, not only in the current one.
- A config file is more like a part of the working environment. For instance, a supplier of a clang-based SDK could deliver a set of config files as part of their product. A response file, in contrast, is closer to transient build data, often generated by some tool.
- Warnings about unused options are suppressed in config files.
- There was a proposal to extend the syntax of config files to allow comments and trailing backslashes for splitting long lines, although these extensions would be useful for response files as well.
So, maybe, loading a config file deserves a special option. This way has advantages:
- it expresses intentions explicitly and reduces the risk of accidental errors,
- it allows using absolute paths to specify a config file.
**Where config files reside**
There are several places where config files can be kept:
- The directory where the clang executable resides. This is convenient for SDK developers as it simplifies packaging. A user can use several instances of clang at the same time, and each instance may still use its own set of config files without conflicts.
- A system directory (for instance, /etc/llvm) that keeps config files for use by several users. This case is interesting for OS distribution maintainers and SDK developers.
- A user directory (for instance, ~/.llvm). A user can collect config files that tune the compiler for their tasks in this directory and use them to select a particular option set.
- A config file can be specified by path, as in `clang --config ./foo`. This is convenient for developers who want to ensure that a particular configuration is selected.
For the sake of flexibility, it makes sense to enable all these locations, as they are useful in different scenarios. The locations of the user and system directories are specified when clang is configured; by default they are absent. If a user directory is specified, it should have higher priority than the other search locations, so that the user can override system-supplied option sets.
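The lookup over these locations might be sketched as follows. The function `find_config_file` and its parameters are hypothetical. The text above only states that the user directory comes first, so the relative order of the system directory and the executable directory here is an assumption.

```python
import os

def find_config_file(name, exe_dir, user_dir=None, system_dir=None):
    """Return the path of the config file `name`, or None if it is not found.

    Hypothetical sketch. user_dir and system_dir are None when they were
    not set at configuration time, which is the default.
    """
    # A name containing a path separator is taken as a path; no search is done.
    if os.sep in name or "/" in name:
        return name if os.path.isfile(name) else None
    # User directory first; the order of the remaining two is an assumption.
    search_dirs = [d for d in (user_dir, system_dir, exe_dir) if d]
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With this order, a file placed in the user directory shadows a file of the same name supplied by the system or shipped next to the executable.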
**Driver mode**
If the config file is encoded in the executable name, such as `foo-clang`, there is a concern about different driver modes. Which config file should be searched for if the compiler is called as `foo-cpp`, `foo-cl`, etc.? These tools support different sets of options, so a flexible solution should make it possible to specify different config files for different driver modes.
Clang implements a flexible scheme of tool naming, in which a tool name has the components:
  <arch>-<something>-<driver-mode>[-<optional version suffix>][<optional version number>]
The part of the executable name that precedes the driver-mode suffix can be arbitrary. It makes sense not to analyze the executable name by components but to use the entire name, without the version, as the base name of the config file. So the executables:
  i686-linux-android-g++
  i686-linux-android-g++5.0
  i686-linux-android-g++-release
would search for the file `i686-linux-android-g++.cfg`, while
  foo-clang
  foo-gcc
  foo-s++
would search for the files `foo-clang.cfg`, `foo-gcc.cfg` and `foo-s++.cfg` respectively.
On the other hand, an important use of config files is tuning compiler options for cross compilation. In such a case it is likely that different tools would use the same options. Cloning the config file for each tool is odd, so the natural solution is a single file per target.
The flexible solution is to search for the long name (like `i686-linux-android-g++.cfg`) first and, if it is not found, to look for a short name based on the 'target' only (such as `i686.cfg` or `foo.cfg`).
It makes sense to use the same rule when the config file is specified explicitly (but not as a path), so that the invocation `foo-clang` is equivalent to `clang --config foo`. In this case the invocation:
  clang-cpp --config foo abc.c
would first search for the file `foo-clang-cpp.cfg`, then for `foo.cfg`.
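The long-name-then-short-name rule can be written down as a small sketch. Both function names are hypothetical, the executable name is assumed to have its version part already stripped, and the `.cfg` extension is assumed to apply to both candidates.

```python
def candidates_for_executable(base_name):
    """Config file names to try for an executable such as `foo-clang`.

    `base_name` is the executable name without any version suffix.
    """
    names = [base_name + ".cfg"]          # long name first
    prefix, dash, _ = base_name.partition("-")
    if dash:
        names.append(prefix + ".cfg")     # then the short, target-only name
    return names

def candidates_for_option(config_name, tool_name):
    """Config file names to try for `<tool_name> --config <config_name>`."""
    return [config_name + "-" + tool_name + ".cfg", config_name + ".cfg"]
```

For example, `candidates_for_executable("foo-clang")` and `candidates_for_option("foo", "clang")` yield the same list, which is what makes `foo-clang` equivalent to `clang --config foo`.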
**Target reloading**
A configuration file is a general mechanism, but it was proposed as a solution for cross compilation problems. In this case the config file holds the options required to tune compilation for a particular target. There is a difficulty here, because some command line options, like `-m32`, effectively change the target, and the new target may require settings different from those contained in the config file.
The proposed solution is to reload the config file. If:
- the config file name starts with an architecture component (like `x86_64-`),
- the command line contains option(s) that effectively change the target (like `-m32`),
then the driver tries to load the config file whose name is obtained by replacing the architecture component with the actual architecture. For instance, if the config file was `x86_64-clang.cfg`, the driver looks for `i686-clang.cfg`. If a proper config file is found, the options read from the previous config file are removed and the content of the new config file is inserted at the beginning of the effective command line. If no such file is found, it is not an error.
The effect of target reloading must be exactly the same as if the actual target had been specified initially. For instance, the invocation `x86_64-clang -m32` must be equivalent to `i686-clang`. This means that:
- Options read from the previous config file are removed entirely.
- The search for the new config file follows the same rule as for the original file, that is, first `i686-clang.cfg`, then `i686.cfg`.
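A minimal sketch of the name substitution used for reloading follows. The table `ARCH_AFTER_M32` and the function `reload_candidates` are hypothetical and cover only the `-m32` example from the text; the real driver would derive the actual architecture from the full set of options.

```python
# Hypothetical table: architecture that results from applying -m32.
ARCH_AFTER_M32 = {"x86_64": "i686"}

def reload_candidates(config_file, options):
    """Config file names to try after a target-changing option is seen.

    Returns an empty list when no reloading applies.
    """
    stem = config_file[:-len(".cfg")] if config_file.endswith(".cfg") else config_file
    arch, _, rest = stem.partition("-")
    if "-m32" not in options or arch not in ARCH_AFTER_M32:
        return []
    new_arch = ARCH_AFTER_M32[arch]
    # Same search rule as for the original file: long name, then short name.
    names = [new_arch + "-" + rest + ".cfg"] if rest else []
    names.append(new_arch + ".cfg")
    return names
```

If none of the returned names exists, the original config file stays in effect, since a missing file is not an error.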
**Conflicting settings**
It is possible that clang is requested to load several config files. Consider this possibility for the case of a target, which is representative of the more general case.
There are three ways to specify the target for compilation:
- config file, like `clang --config i686`,
- executable prefix, like `i686-clang`,
- command line option, like `--target=i686`.
All may be combined.
Using `--target` is the existing way to change the target. It must be possible to use it in combination with config files; the only difference is possible target reloading. For compatibility, it takes precedence over the other ways.
Combining an executable prefix and an explicit config file may create a conflicting choice, for instance:
  mips64-clang --config x86_64
Possible solutions are:
- emit an error,
- treat the command line option as having higher priority.
The latter solution looks more appropriate, as it is consistent with the way clang processes command line options, `--target` in particular. Such a combination may appear if `--config` comes from the set of compilation flags while the compiler is specified in a different way. A user may use `--config` instead of `--target` in an existing build system just to set several target-specific options, so this combination can be allowed for compatibility reasons.
Another conflicting case is two config files specified on the command line:
  clang --config mips64 --config x86_64
Again, there are two possible solutions:
- treat this case as an error,
- treat the second option as having higher priority.
The latter way is consistent with the general treatment of options, but there is no compatibility reason to enable it. As this combination may be the result of a mistake, it is probably better to prohibit it.
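The two conflict rules preferred above can be summarized in a short sketch. The function `select_config` is hypothetical and models only the choice between the executable prefix and `--config`; it does not model `--target` or target reloading.

```python
def select_config(executable_prefix, config_options):
    """Pick the config name to load.

    executable_prefix: name taken from the executable, e.g. "mips64" for
                       `mips64-clang`, or None when there is no prefix.
    config_options:    values of all `--config` options, in command line order.
    """
    if len(config_options) > 1:
        # Two explicit config files are likely a mistake, so reject them.
        raise ValueError("more than one --config option is specified")
    if config_options:
        # The explicit option has priority over the executable prefix.
        return config_options[0]
    return executable_prefix
```

Under these rules `mips64-clang --config x86_64` loads the `x86_64` configuration, while `clang --config mips64 --config x86_64` is rejected.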
https://reviews.llvm.org/D24933