We want to encourage a brave new world of hermetic evaluation for
source-level reproducibility, so flakes should not poke around in the
filesystem outside of their explicit dependencies.
Note that the default installation source remains impure in that it
can refer to mutable flakes, so "nix build nixpkgs.hello" still works
(and fetches the latest nixpkgs, unless it has been pinned by the
user).
A problem with pure evaluation is that builtins.currentSystem is
unavailable. For the moment, I've hard-coded "x86_64-linux" in the
nixpkgs flake. Eventually, "system" should be a flake function
argument.
For example, github:edolstra/dwarffs is more-or-less equivalent to
https://github.com/edolstra/dwarffs.git. It's a much faster way to get
GitHub repositories: it fetches tarballs rather than entire Git
repositories. It also allows fetching specific revisions by hash
without specifying a ref (e.g. a branch name):
github:edolstra/dwarffs/41c0c1bf292ea3ac3858ff393b49ca1123dbd553
Previously, plain derivation paths in the string context (e.g. those
that arose from builtins.storePath on a drv file, not those that arose
from accessing .drvPath of a derivation) were treated somewhat like
derivaiton paths derived from .drvPath, except their dependencies
weren't recursively added to the input set. With this change, such
plain derivation paths are simply treated as paths and added to the
source inputs set accordingly, simplifying context handling code and
removing the inconsistency. If drvPath-like behavior is desired, the
.drv file can be imported and then .drvPath can be accessed.
This is a backwards-incompatibility, but storePath is never used on
drv files within nixpkgs and almost never used elsewhere.
SRI hashes (https://www.w3.org/TR/SRI/) combine the hash algorithm and
a base-64 hash. This allows more concise and standard hash
specifications. For example, instead of
import <nix/fetchurl.nl> {
url = https://nixos.org/releases/nix/nix-2.1.3/nix-2.1.3.tar.xz;
sha256 = "5d22dad058d5c800d65a115f919da22938c50dd6ba98c5e3a183172d149840a4";
};
you can write
import <nix/fetchurl.nl> {
url = https://nixos.org/releases/nix/nix-2.1.3/nix-2.1.3.tar.xz;
hash = "sha256-XSLa0FjVyADWWhFfkZ2iKTjFDda6mMXjoYMXLRSYQKQ=";
};
In fixed-output derivations, the outputHashAlgo is no longer mandatory
if outputHash specifies the hash (either as an SRI or in the old
"<type>:<hash>" format).
'nix hash-{file,path}' now print hashes in SRI format by default. I
also reverted them to use SHA-256 by default because that's what we're
using most of the time in Nixpkgs.
Suggested by @zimbatm.
This is already done by coerceToString(), provided that the argument
is a path (e.g. 'fetchGit ./bla'). It fixes the handling of URLs like
git@github.com:owner/repo.git. It breaks 'fetchGit "./bla"', but that
was never intended to work anyway and is inconsistent with other
builtin functions (e.g. 'readFile "./bla"' fails).
Using a 64bit integer on 32bit systems will come with a bit of a
performance overhead, but given that Nix doesn't use a lot of integers
compared to other types, I think the overhead is negligible also
considering that 32bit systems are in decline.
The biggest advantage however is that when we use a consistent integer
size across all platforms it's less likely that we miss things that we
break due to that. One example would be:
https://github.com/NixOS/nixpkgs/pull/44233
On Hydra it will evaluate, because the evaluator runs on a 64bit
machine, but when evaluating the same on a 32bit machine it will fail,
so using 64bit integers should make that consistent.
While the change of the type in value.hh is rather easy to do, we have a
few more options available for doing the conversion in the lexer:
* Via an #ifdef on the architecture and using strtol() or strtoll()
accordingly depending on which architecture we are. For the #ifdef
we would need another AX_COMPILE_CHECK_SIZEOF in configure.ac.
* Using istringstream, which would involve copying the value.
* As we're already using boost, lexical_cast might be a good idea.
Spoiler: I went for the latter, first of all because lexical_cast does
have an overload for const char* and second of all, because it doesn't
involve copying around the input string. Also, because istringstream
seems to come with a bigger overhead than boost::lexical_cast:
https://www.boost.org/doc/libs/release/doc/html/boost_lexical_cast/performance.html
The first method (still using strtol/strtoll) also wasn't something I
pursued further, because it is also locale-aware which I doubt is what
we want, given that the regex for int is [0-9]+.
Signed-off-by: aszlig <aszlig@nix.build>
Fixes: #2339
The current usage technically works by putting multiple different
repos in to the same git directory. However, it is very slow as
Git tries very hard to find common commits between the two
repositories. If the two repositories are large (like Nixpkgs and
another long-running project,) it is maddeningly slow.
This change busts the cache for existing deployments, but users
will be promptly repaid in per-repository performance.
In EvalState::checkSourcePath, the path is checked against the list of
allowed paths first and later it's checked again *after* resolving
symlinks.
The resolving of the symlinks is done via canonPath, which also strips
out "../" and "./". However after the canonicalisation the error message
pointing out that the path is not allowed prints the symlink target in
the error message.
Even if we'd suppress the message, symlink targets could still be leaked
if the symlink target doesn't exist (in this case the error is thrown in
canonPath).
So instead, we now do canonPath() without symlink resolving first before
even checking against the list of allowed paths and then later do the
symlink resolving and checking the allowed paths again.
The first call to canonPath() should get rid of all the "../" and "./",
so in theory the only way to leak a symlink if the attacker is able to
put a symlink in one of the paths allowed by restricted evaluation mode.
For the latter I don't think this is part of the threat model, because
if the attacker can write to that path, the attack vector is even
larger.
Signed-off-by: aszlig <aszlig@nix.build>
forceValue() were called after a value is copied effectively forcing only one of the copies keeping another copy not evaluated.
This resulted in its evaluation of the same lazy value more than once (the number of hits is not big though)
This reduces the risk of object liveness misdetection. For example,
Glibc has an internal variable "mp_" that often points to a Boehm
object, keeping it alive unnecessarily. Since we don't store any
actual roots in global variables, we can just disable data segment
scanning.
With this, the max RSS doing 100 evaluations of
nixos.tests.firefox.x86_64-linux.drvPath went from 718 MiB to 455 MiB.
If the Env denotes a 'with', then values[0] may be an Expr* cast to a
Value*. For code that generically traverses Values/Envs, it's useful
to know this.
E.g. this makes
nix eval --restrict-eval -I /nix/store/foo '(builtins.readFile "/nix/store/foo/symlink/bla")'
(where /nix/store/foo/symlink is a symlink to another path in the
closure of /nix/store/foo) succeed.
This fixes a regression in Hydra compared to Nix 1.x (where there were
no restrictions at all on access to the Nix store).
Doing so prevents emacs tags from working, as well as makes the code extremely
confusing for a newbie.
In the prior state, if someone wants to find the definition of "ExprApp" for
example, a grep through the code reveals nothing. Since the definition could be
hiding in numerous ".h" files, it's really difficult to find. This personally
took me several hours to figure out.
Flex's regexes have an annoying feature: the dot matches everything
except a newline. This causes problems for expressions like:
"${0}\
"
where the backslash-newline combination matches this rule instead of the
intended one mentioned in the comment:
<STRING>\$|\\|\$\\ {
/* This can only occur when we reach EOF, otherwise the above
(...|\$[^\{\"\\]|\\.|\$\\.)+ would have triggered.
This is technically invalid, but we leave the problem to the
parser who fails with exact location. */
return STR;
}
However, the parser actually accepts the resulting token sequence
('"' DOLLAR_CURLY 0 '}' STR '"'), which is a problem because the lexer
rule didn't assign anything to yylval. Ultimately this leads to a crash
when dereferencing a NULL pointer in ExprConcatStrings::bindVars().
The fix does change the syntax of the language in some corner cases
but I think it's only turning previously invalid (or crashing) syntax
to valid syntax. E.g.
"a\
b"
and
''a''\
b''
were previously syntax errors but now both result in "a\nb".
Found by afl-fuzz.
Otherwise, running e.g.
nix-instantiate --eval -E --strict 'builtins.replaceStrings [""] ["X"] "abc"'
would just hang in an infinite loop.
Found by afl-fuzz.
First attempt of this was reverted in e2d71bd186 because it caused
another infinite loop, which is fixed now and a test added.
Otherwise, running e.g.
nix-instantiate --eval -E --strict 'builtins.replaceStrings [""] ["X"] "abc"'
would just hang in an infinite loop.
Found by afl-fuzz.
Instead of having lexicographicOrder() create a temporary sorted array
of Attr*:s and copying attr names from that, copy the attr names
first and then sort that.
builtins.path allows specifying the name of a path (which makes paths
with store-illegal names now addable), allows adding paths with flat
instead of recursive hashes, allows specifying a filter (so is a
generalization of filterSource), and allows specifying an expected
hash (enabling safe path adding in pure mode).
In this mode, the following restrictions apply:
* The builtins currentTime, currentSystem and storePath throw an
error.
* $NIX_PATH and -I are ignored.
* fetchGit and fetchMercurial require a revision hash.
* fetchurl and fetchTarball require a sha256 attribute.
* No file system access is allowed outside of the paths returned by
fetch{Git,Mercurial,url,Tarball}. Thus 'nix build -f ./foo.nix' is
not allowed.
Thus, the evaluation result is completely reproducible from the
command line arguments. E.g.
nix build --pure-eval '(
let
nix = fetchGit { url = https://github.com/NixOS/nixpkgs.git; rev = "9c927de4b179a6dd210dd88d34bda8af4b575680"; };
nixpkgs = fetchGit { url = https://github.com/NixOS/nixpkgs.git; ref = "release-17.09"; rev = "66b4de79e3841530e6d9c6baf98702aa1f7124e4"; };
in (import (nix + "/release.nix") { inherit nix nixpkgs; }).build.x86_64-linux
)'
The goal is to enable completely reproducible and traceable
evaluation. For example, a NixOS configuration could be fully
described by a single Git commit hash. 'nixos-rebuild' would do
something like
nix build --pure-eval '(
(import (fetchGit { url = file:///my-nixos-config; rev = "..."; })).system
')
where the Git repository /my-nixos-config would use further fetchGit
calls or Git externals to fetch Nixpkgs and whatever other
dependencies it has. Either way, the commit hash would uniquely
identify the NixOS configuration and allow it to reproduced.
For example, you can write
src = fetchgit ./.;
and if ./. refers to an unclean working tree, that tree will be copied
to the Nix store. This removes the need for "cleanSource".
The computation of urlHash didn't take the name into account, so
subsequent fetchurl calls with the same URL but a different name would
resolve to the same cached store path.
The "name" attribute defaults to "source", which we should use for all
similar functions (e.g. fetchTarball and in Hydra) to ensure that we
get a consistent store path regardless of how the tree is fetched.
"source" is not necessarily a correct label, but using an empty name
is problematic: you get an ugly store path ending in a dash, and it's
impossible to have a fixed-output derivation that produces that path
because ".drv" is not a valid store name.
Fixes#904.