chroot only changes the process root directory, not the mount namespace root
directory, and it is well-known that any process with chroot capability can
break out of a chroot "jail". By using pivot_root as well, and unmounting the
original mount namespace root directory, breaking out becomes impossible.
Non-root processes typically have no ability to use chroot() anyway, but they
can gain that capability through the use of clone() or unshare(). For security
reasons, these syscalls are limited in functionality when used inside a normal
chroot environment. Using pivot_root() this way does allow those syscalls to be
put to their full use.
Sodium's Ed25519 signatures are much shorter than OpenSSL's RSA
signatures. Public keys are also much shorter, so they're now
specified directly in the nix.conf option ‘binary-cache-public-keys’.
The new command ‘nix-store --generate-binary-cache-key’ generates and
prints a public and secret key.
On a system with multiple CPUs, running Nix operations through the
daemon is significantly slower than "direct" mode:
$ NIX_REMOTE= nix-instantiate '<nixos>' -A system
real 0m0.974s
user 0m0.875s
sys 0m0.088s
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m2.118s
user 0m1.463s
sys 0m0.218s
The main reason seems to be that the client and the worker get moved
to a different CPU after every call to the worker. This patch adds a
hack to lock them to the same CPU. With this, the overhead of going
through the daemon is very small:
$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real 0m1.074s
user 0m0.809s
sys 0m0.098s
/nix/store could be a read-only bind mount even if it is / in its own filesystem, so checking the 4th field in mountinfo is insufficient.
Signed-off-by: Shea Levy <shea@shealevy.com>
vfork() is just too weird. For instance, in this build:
http://hydra.nixos.org/build/3330487
the value fromHook.writeSide becomes corrupted in the parent, even
though the child only reads from it. At -O0 the problem goes away.
Probably the child is overriding some spilled temporary variable.
If I get bored I may implement using posix_spawn() instead.
XZ compresses significantly better than bzip2. Here are the
compression ratios and execution times (using 4 cores in parallel) on
my /var/run/current-system (3.1 GiB):
bzip2: total compressed size 849.56 MiB, 30.8% [2m08]
xz -6: total compressed size 641.84 MiB, 23.4% [6m53]
xz -7: total compressed size 621.82 MiB, 22.6% [7m19]
xz -8: total compressed size 599.33 MiB, 21.8% [7m18]
xz -9: total compressed size 588.18 MiB, 21.4% [7m40]
Note that compression takes much longer. More importantly, however,
decompression is much faster:
bzip2: 1m47.274s
xz -6: 0m55.446s
xz -7: 0m54.119s
xz -8: 0m52.388s
xz -9: 0m51.842s
The only downside to using -9 is that decompression takes a fair
amount (~65 MB) of memory.
Nix needs SQLite's foreign key constraint feature, which was
introduced in 3.6.19. Without it, the database won't be cleaned up
correctly when paths are deleted. See
e.g. http://hydra.nixos.org/build/2494142.
Nix now requires SQLite and bzip2 to be pre-installed. SQLite is
detected using pkg-config. We required DBD::SQLite anyway, so
depending on SQLite is not a big problem.
The --with-bzip2, --with-openssl and --with-sqlite flags are gone.
I was bitten one time too many by Python modifying the Nix store by
creating *.pyc files when run as root. On Linux, we can prevent this
by setting the immutable bit on files and directories (as in ‘chattr
+i’). This isn't supported by all filesystems, so it's not an error
if setting the bit fails. The immutable bit is cleared by the garbage
collector before deleting a path. The only tricky aspect is in
optimiseStore(), since it's forbidden to create hard links to an
immutable file. Thus optimiseStore() temporarily clears the immutable
bit before creating the link.
scripts.
* Include the version and architecture in the -I flag so that there is
at least a chance that a Nix binary built for one Perl version will
run on another version.
bindings to be used in Nix's own Perl scripts.
The only downside is that Perl XS and Automake/libtool don't really
like each other, so building is a bit tricky.
little RAM. Even if the memory isn't actually used, it can cause
problems with the overcommit heuristics in the kernel. So use a VM
space of 25% of RAM, up to 384 MB.
because Berkeley DB needed it on some platforms, but we don't use
BDB anymore.
On FreeBSD, if you link against pthreads, then the main thread gets
a 2 MB stack which cannot be overriden (it ignores "ulimit -s"):
http://www.mail-archive.com/freebsd-hackers@freebsd.org/msg62445.html
This is not enough for Nix. For instance, the garbage collector can
fail if there is a pathologically deep chain of references
(http://hydra.nixos.org/build/556199). 2 MB is also not enough for
many Nix expressions.
Arguably the garbage collector shouldn't use recursion, because in
NixOS unprivileged users can DOS the garbage collector by creating a
sufficiently deeply nested chain of references. But getting rid of
recursion is a bit harder.
faster than the old mode when fsyncs are enabled, because it only
performs an fsync() when doing a checkpoint, rather than at every
commit. Some timings for doing a "nix-instantiate /etc/nixos/nixos
-A system" after modifying the stdenv setup script:
42.5s - SQLite 3.6.23 with truncate mode and fsync
3.4s - SQLite 3.6.23 with truncate mode and no fsync
32.1s - SQLite 3.7.0 with truncate mode and fsync
16.8s - SQLite 3.7.0 with WAL mode and fsync, auto-checkpoint
every 1000 pages
8.3s - SQLite 3.7.0 with WAL mode and fsync, auto-checkpoint
every 8192 pages
1.7s - SQLite 3.7.0 with WAL mode and no fsync
The default is now to use WAL mode with fsyncs. Because WAL doesn't
work on remote filesystems such as NFS (as it uses shared memory),
truncate mode can be re-enabled by setting the "use-sqlite-wal"
option to false.
Defining -D_FILE_OFFSET_BITS=64 works on most platforms, but not on all (i.e.
Solaris). Also, the Autoconf macro offers the user a switch to disable the
functionality in case of problems.
would just silently store only (fileSize % 2^32) bytes.
* Use posix_fallocate if available when unpacking archives.
* Provide a better error message when trying to unpack something that
isn't a NAR archive.
bind-mounts we do are only visible to the builder process and its
children. So accidentally doing "rm -rf" on the chroot directory
won't wipe out /nix/store and other bind-mounted directories
anymore. Also, the bind-mounts in the private namespace disappear
automatically when the builder exits.
get the basename of the channel URL (e.g., nixpkgs-unstable). The
top-level Nix expression of the channel is now an attribute set, the
attributes of which are the individual channels (e.g.,
{nixpkgs_unstable = ...; strategoxt_unstable = ...}). This makes
attribute paths ("nix-env -qaA" and "nix-env -iA") more sensible,
e.g., "nix-env -iA nixpkgs_unstable.subversion".
that have to be done as root: running builders under different uids,
changing ownership of build results, and deleting paths in the store
with the wrong ownership).
* Some refactoring: put the NAR archive integer/string serialisation
code in a separate file so it can be reused by the worker protocol
implementation.
Rather, setuid support is now always compiled in (at least on
platforms that have the setresuid system call, e.g., Linux and
FreeBSD), but it must enabled by chowning/chmodding the Nix
binaries.
externals directory. This is in particular useful because though
most systems have bzip2/bunzip2, they don't always have libbz2,
which we need for bsdiff/bspatch.
implementations of MD5, SHA-1 and SHA-256. The main benefit is that
we get assembler-optimised implementations of MD5 and SHA-1 (though
not SHA-256 (at least on x86), unfortunately). OpenSSL's SHA-1
implementation on Intel is twice as fast as ours.