Merge pull request #6420 from nix-community/doc-what-is-nix

Document what Nix *is*
This commit is contained in:
Théophane Hufschmitt 2022-08-04 20:49:01 +02:00 committed by GitHub
commit 81e101345f
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 477 additions and 2 deletions

2
.gitignore vendored
View file

@ -22,7 +22,7 @@ perl/Makefile.config
/doc/manual/src/SUMMARY.md /doc/manual/src/SUMMARY.md
/doc/manual/src/command-ref/new-cli /doc/manual/src/command-ref/new-cli
/doc/manual/src/command-ref/conf-file.md /doc/manual/src/command-ref/conf-file.md
/doc/manual/src/expressions/builtins.md /doc/manual/src/language/builtins.md
# /scripts/ # /scripts/
/scripts/nix-profile.sh /scripts/nix-profile.sh

View file

@ -1,5 +1,9 @@
ifeq ($(doc_generate),yes) ifeq ($(doc_generate),yes)
MANUAL_SRCS := \
$(call rwildcard, $(d)/src, *.md) \
$(call rwildcard, $(d)/src, */*.md)
# Generate man pages. # Generate man pages.
man-pages := $(foreach n, \ man-pages := $(foreach n, \
nix-env.1 nix-build.1 nix-shell.1 nix-store.1 nix-instantiate.1 \ nix-env.1 nix-build.1 nix-shell.1 nix-store.1 nix-instantiate.1 \
@ -97,7 +101,7 @@ doc/manual/generated/man1/nix3-manpages: $(d)/src/command-ref/new-cli
done done
@touch $@ @touch $@
$(docdir)/manual/index.html: $(MANUAL_SRCS) $(d)/book.toml $(d)/anchors.jq $(d)/custom.css $(d)/src/SUMMARY.md $(d)/src/command-ref/new-cli $(d)/src/command-ref/conf-file.md $(d)/src/language/builtins.md $(call rwildcard, $(d)/src, *.md) $(docdir)/manual/index.html: $(MANUAL_SRCS) $(d)/book.toml $(d)/anchors.jq $(d)/custom.css $(d)/src/SUMMARY.md $(d)/src/command-ref/new-cli $(d)/src/command-ref/conf-file.md $(d)/src/language/builtins.md
$(trace-gen) RUST_LOG=warn mdbook build doc/manual -d $(DESTDIR)$(docdir)/manual $(trace-gen) RUST_LOG=warn mdbook build doc/manual -d $(DESTDIR)$(docdir)/manual
endif endif

View file

@ -59,6 +59,12 @@
@manpages@ @manpages@
- [Files](command-ref/files.md) - [Files](command-ref/files.md)
- [nix.conf](command-ref/conf-file.md) - [nix.conf](command-ref/conf-file.md)
- [Architecture](architecture/architecture.md)
- [Store](architecture/store/store.md)
- [Closure](architecture/store/store/closure.md)
- [Build system terminology](architecture/store/store/build-system-terminology.md)
- [Store Path](architecture/store/path.md)
- [File System Object](architecture/store/fso.md)
- [Glossary](glossary.md) - [Glossary](glossary.md)
- [Contributing](contributing/contributing.md) - [Contributing](contributing/contributing.md)
- [Hacking](contributing/hacking.md) - [Hacking](contributing/hacking.md)

View file

@ -0,0 +1,79 @@
# Architecture
*(This chapter is unstable and a work in progress. Incoming links may rot.)*
This chapter describes how Nix works.
It should help users understand why Nix behaves as it does, and it should help developers understand how to modify Nix and how to write similar tools.
## Overview
Nix consists of [hierarchical layers][layer-architecture].
```
+-----------------------------------------------------------------+
| Nix |
| [ commmand line interface ]------, |
| | | |
| evaluates | |
| | manages |
| V | |
| [ configuration language ] | |
| | | |
| +-----------------------------|-------------------V-----------+ |
| | store evaluates to | |
| | | | |
| | referenced by V builds | |
| | [ build input ] ---> [ build plan ] ---> [ build result ] | |
| | | |
| +-------------------------------------------------------------+ |
+-----------------------------------------------------------------+
```
At the top is the [command line interface](../command-ref/command-ref.md), translating from invocations of Nix executables to interactions with the underlying layers.
Below that is the [Nix expression language](../expressions/expression-language.md), a [purely functional][purely-functional-programming] configuration language.
It is used to compose expressions which ultimately evaluate to self-contained *build plans*, used to derive *build results* from referenced *build inputs*.
The command line and Nix language are what users interact with most.
> **Note**
> The Nix language itself does not have a notion of *packages* or *configurations*.
> As far as we are concerned here, the inputs and results of a build plan are just data.
Underlying these is the [Nix store](./store/store.md), a mechanism to keep track of build plans, data, and references between them.
It can also execute build plans to produce new data.
A build plan is a series of *build tasks*.
Each build task has a special build input which is used as *build instructions*.
The result of a build task can be input to another build task.
```
+-----------------------------------------------------------------------------------------+
| store |
| ................................................. |
| : build plan : |
| : : |
| [ build input ]-----instructions-, : |
| : | : |
| : v : |
| [ build input ]----------->[ build task ]--instructions-, : |
| : | : |
| : | : |
| : v : |
| : [ build task ]----->[ build result ] |
| [ build input ]-----instructions-, ^ : |
| : | | : |
| : v | : |
| [ build input ]----------->[ build task ]---------------' : |
| : ^ : |
| : | : |
| [ build input ]------------------' : |
| : : |
| : : |
| :...............................................: |
| |
+-----------------------------------------------------------------------------------------+
```
[layer-architecture]: https://en.m.wikipedia.org/wiki/Multitier_architecture#Layers
[purely-functional-programming]: https://en.m.wikipedia.org/wiki/Purely_functional_programming

View file

@ -0,0 +1,69 @@
# File System Object
The Nix store uses a simple file system model for the data it holds in [store objects](store.md#store-object).
Every file system object is one of the following:
- File: an executable flag, and arbitrary data for contents
- Directory: mapping of names to child file system objects
- [Symbolic link][symlink]: may point anywhere.
We call a store object's outermost file system object the *root*.
data FileSystemObject
= File { isExecutable :: Bool, contents :: Bytes }
| Directory { entries :: Map FileName FileSystemObject }
| SymLink { target :: Path }
Examples:
- a directory with contents
/nix/store/<hash>-hello-2.10
├── bin
│   └── hello
└── share
├── info
│   └── hello.info
└── man
└── man1
└── hello.1.gz
- a directory with relative symlink and other contents
/nix/store/<hash>-go-1.16.9
├── bin -> share/go/bin
├── nix-support/
└── share/
- a directory with absolute symlink
/nix/store/d3k...-nodejs
└── nix_node -> /nix/store/f20...-nodejs-10.24.
A bare file or symlink can be a root file system object.
Examples:
/nix/store/<hash>-hello-2.10.tar.gz
/nix/store/4j5...-pkg-config-wrapper-0.29.2-doc -> /nix/store/i99...-pkg-config-0.29.2-doc
Symlinks pointing outside of their own root or to a store object without a matching reference are allowed, but might not function as intended.
Examples:
- an arbitrarily symlinked file may change or not exist at all
/nix/store/<hash>-foo
└── foo -> /home/foo
- if a symlink to a store path was not automatically created by Nix, it may be invalid or get invalidated when the store object is deleted
/nix/store/<hash>-bar
└── bar -> /nix/store/abc...-foo
Nix file system objects do not support [hard links][hardlink]:
each file system object which is not the root has exactly one parent and one name.
However, as store objects are immutable, an underlying file system can use hard links for optimization.
[symlink]: https://en.m.wikipedia.org/wiki/Symbolic_link
[hardlink]: https://en.m.wikipedia.org/wiki/Hard_link

View file

@ -0,0 +1,105 @@
# Store Path
Nix implements [references](store.md#reference) to [store objects](store.md#store-object) as *store paths*.
Store paths are pairs of
- a 20-byte [digest](#digest) for identification
- a symbolic name for people to read.
Example:
- digest: `b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z`
- name: `firefox-33.1`
It is rendered to a file system path as the concatenation of
- [store directory](#store-directory)
- path-separator (`/`)
- [digest](#digest) rendered in a custom variant of [base-32](https://en.m.wikipedia.org/wiki/Base32) (20 arbitrary bytes become 32 ASCII characters)
- hyphen (`-`)
- name
Example:
/nix/store/b6gvzjyb2pg0kjfwrjmg1vfhh54ad73z-firefox-33.1
|--------| |------------------------------| |----------|
store directory digest name
## Store Directory
Every [store](./store.md) has a store directory.
If the store has a [file system representation](./store.md#files-and-processes), this directory contains the stores [file system objects](#file-system-object), which can be addressed by [store paths](#store-path).
This means a store path is not just derived from the referenced store object itself, but depends on the store the store object is in.
> **Note**
> The store directory defaults to `/nix/store`, but is in principle arbitrary.
It is important which store a given store object belongs to:
Files in the store object can contain store paths, and processes may read these paths.
Nix can only guarantee [referential integrity](store/closure.md) if store paths do not cross store boundaries.
Therefore one can only copy store objects to a different store if
- the source and target stores' directories match
or
- the store object in question has no references, that is, contains no store paths.
One cannot copy a store object to a store with a different store directory.
Instead, it has to be rebuilt, together with all its dependencies.
It is in general not enough to replace the store directory string in file contents, as this may render executables unusable by invalidating their internal offsets or checksums.
# Digest
In a [store path](#store-path), the [digest][digest] is the output of a [cryptographic hash function][hash] of either all *inputs* involved in building the referenced store object or its actual *contents*.
Store objects are therefore said to be either [input-addressed](#input-addressing) or [content-addressed](#content-addressing).
> **Historical Note**
> The 20 byte restriction is because originally digests were [SHA-1][sha-1] hashes.
> Nix now uses [SHA-256][sha-256], and longer hashes are still reduced to 20 bytes for compatibility.
[digest]: https://en.m.wiktionary.org/wiki/digest#Noun
[hash]: https://en.m.wikipedia.org/wiki/Cryptographic_hash_function
[sha-1]: https://en.m.wikipedia.org/wiki/SHA-1
[sha-256]: https://en.m.wikipedia.org/wiki/SHA-256
### Reference scanning
When a new store object is built, Nix scans its file contents for store paths to construct its set of references.
The special format of a store path's [digest](#digest) allows reliably detecting it among arbitrary data.
Nix uses the [closure](store.md#closure) of build inputs to derive the list of allowed store paths, to avoid false positives.
This way, scanning files captures run time dependencies without the user having to declare them explicitly.
Doing it at build time and persisting references in the store object avoids repeating this time-consuming operation.
> **Note**
> In practice, it is sometimes still necessary for users to declare certain dependencies explicitly, if they are to be preserved in the build result's closure.
This depends on the specifics of the software to build and run.
>
> For example, Java programs are compressed after compilation, which obfuscates any store paths they may refer to and prevents Nix from automatically detecting them.
## Input Addressing
Input addressing means that the digest derives from how the store object was produced, namely its build inputs and build plan.
To compute the hash of a store object one needs a deterministic serialisation, i.e., a binary string representation which only changes if the store object changes.
Nix has a custom serialisation format called Nix Archive (NAR)
Store object references of this sort can *not* be validated from the content of the store object.
Rather, a cryptographic signature has to be used to indicate that someone is vouching for the store object really being produced from a build plan with that digest.
## Content Addressing
Content addressing means that the digest derives from the store object's contents, namely its file system objects and references.
If one knows content addressing was used, one can recalculate the reference and thus verify the store object.
Content addressing is currently only used for the special cases of source files and "fixed-output derivations", where the contents of a store object are known in advance.
Content addressing of build results is still an [experimental feature subject to some restrictions](https://github.com/tweag/rfcs/blob/cas-rfc/rfcs/0062-content-addressed-paths.md).

View file

@ -0,0 +1,151 @@
# Store
A Nix store is a collection of *store objects* with references between them.
It supports operations to manipulate that collection.
The following concept map is a graphical outline of this chapter.
Arrows indicate suggested reading order.
```
,--------------[ store ]----------------,
| | |
v v v
[ store object ] [ closure ]--, [ operations ]
| | | | | |
v | | v v |
[ files and processes ] | | [ garbage collection ] |
/ \ | | |
v v | v v
[ file system object ] [ store path ] | [ derivation ]--->[ building ]
| ^ | | |
v | v v |
[ digest ]----' [ reference scanning ]<------------'
/ \
v v
[ input addressing ] [ content addressing ]
```
## Store Object
A store object can hold
- arbitrary *data*
- *references* to other store objects.
Store objects can be build inputs, build results, or build tasks.
Store objects are [immutable][immutable-object]: once created, they do not change until they are deleted.
## Reference
A store object reference is an [opaque][opaque-data-type], [unique identifier][unique-identifier]:
The only way to obtain references is by adding or building store objects.
A reference will always point to exactly one store object.
## Operations
A Nix store can *add*, *retrieve*, and *delete* store objects.
[ data ]
|
V
[ store ] ---> add ----> [ store' ]
|
V
[ reference ]
<!-- -->
[ reference ]
|
V
[ store ] ---> get
|
V
[ store object ]
<!-- -->
[ reference ]
|
V
[ store ] --> delete --> [ store' ]
It can *perform builds*, that is, create new store objects by transforming build inputs into build outputs, using instructions from the build tasks.
[ reference ]
|
V
[ store ] --> build --(maybe)--> [ store' ]
|
V
[ reference ]
As it keeps track of references, it can [garbage-collect][garbage-collection] unused store objects.
[ store ] --> collect garbage --> [ store' ]
## Files and Processes
Nix maps between its store model and the [Unix paradigm][unix-paradigm] of [files and processes][file-descriptor], by encoding immutable store objects and opaque identifiers as file system primitives: files and directories, and paths.
That allows processes to resolve references contained in files and thus access the contents of store objects.
Store objects are therefore implemented as the pair of
- a [file system object](fso.md) for data
- a set of [store paths](path.md) for references.
[unix-paradigm]: https://en.m.wikipedia.org/wiki/Everything_is_a_file
[file-descriptor]: https://en.m.wikipedia.org/wiki/File_descriptor
The following diagram shows a radical simplification of how Nix interacts with the operating system:
It uses files as build inputs, and build outputs are files again.
On the operating system, files can be run as processes, which in turn operate on files.
A build function also amounts to an operating system process (not depicted).
```
+-----------------------------------------------------------------+
| Nix |
| [ commmand line interface ]------, |
| | | |
| evaluates | |
| | manages |
| V | |
| [ configuration language ] | |
| | | |
| +-----------------------------|-------------------V-----------+ |
| | store evaluates to | |
| | | | |
| | referenced by V builds | |
| | [ build input ] ---> [ build plan ] ---> [ build result ] | |
| | ^ | | |
| +---------|----------------------------------------|----------+ |
+-----------|----------------------------------------|------------+
| |
file system object store path
| |
+-----------|----------------------------------------|------------+
| operating system +------------+ | |
| '------------ | | <-----------' |
| | file | |
| ,-- | | <-, |
| | +------------+ | |
| execute as | | read, write, execute |
| | +------------+ | |
| '-> | process | --' |
| +------------+ |
+-----------------------------------------------------------------+
```
There exist different types of stores, which all follow this model.
Examples:
- store on the local file system
- remote store accessible via SSH
- binary cache store accessible via HTTP
To make store objects accessible to processes, stores ultimately have to expose store objects through the file system.

View file

@ -0,0 +1,32 @@
# A [Rosetta stone][rosetta-stone] for build system terminology
The Nix store's design is comparable to other build systems.
Usage of terms is, for historic reasons, not entirely consistent within the Nix ecosystem, and still subject to slow change.
The following translation table points out similarities and equivalent terms, to help clarify their meaning and inform consistent use in the future.
| generic build system | Nix | [Bazel][bazel] | [Build Systems à la Carte][bsalc] | programming language |
| -------------------------------- | ---------------- | -------------------------------------------------------------------- | --------------------------------- | ------------------------ |
| data (build input, build result) | store object | [artifact][bazel-artifact] | value | value |
| build instructions | builder | ([depends on action type][bazel-actions]) | function | function |
| build task | derivation | [action][bazel-action] | `Task` | [thunk][thunk] |
| build plan | derivation graph | [action graph][bazel-action-graph], [build graph][bazel-build-graph] | `Tasks` | [call graph][call-graph] |
| build | build | build | application of `Build` | evaluation |
| persistence layer | store | [action cache][bazel-action-cache] | `Store` | heap |
All of these systems share features of [declarative programming][declarative-programming] languages, a key insight first put forward by Eelco Dolstra et al. in [Imposing a Memory Management Discipline on Software Deployment][immdsd] (2004), elaborated in his PhD thesis [The Purely Functional Software Deployment Model][phd-thesis] (2006), and further refined by Andrey Mokhov et al. in [Build Systems à la Carte][bsalc] (2018).
[rosetta-stone]: https://en.m.wikipedia.org/wiki/Rosetta_Stone
[bazel]: https://bazel.build/start/bazel-intro
[bazel-artifact]: https://bazel.build/reference/glossary#artifact
[bazel-actions]: https://docs.bazel.build/versions/main/skylark/lib/actions.html
[bazel-action]: https://bazel.build/reference/glossary#action
[bazel-action-graph]: https://bazel.build/reference/glossary#action-graph
[bazel-build-graph]: https://bazel.build/reference/glossary#build-graph
[bazel-action-cache]: https://bazel.build/reference/glossary#action-cache
[thunk]: https://en.m.wikipedia.org/wiki/Thunk
[call-graph]: https://en.m.wikipedia.org/wiki/Call_graph
[declarative-programming]: https://en.m.wikipedia.org/wiki/Declarative_programming
[immdsd]: https://edolstra.github.io/pubs/immdsd-icse2004-final.pdf
[phd-thesis]: https://edolstra.github.io/pubs/phd-thesis.pdf
[bsalc]: https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf

View file

@ -0,0 +1,29 @@
# Closure
Nix stores ensure [referential integrity][referential-integrity]: for each store object in the store, all the store objects it references must also be in the store.
The set of all store objects reachable by following references from a given initial set of store objects is called a *closure*.
Adding, building, copying and deleting store objects must be done in a way that preserves referential integrity:
- A newly added store object cannot have references, unless it is a build task.
- Build results must only refer to store objects in the closure of the build inputs.
Building a store object will add appropriate references, according to the build task.
- Store objects being copied must refer to objects already in the destination store.
Recursive copying must either proceed in dependency order or be atomic.
- We can only safely delete store objects which are not reachable from any reference still in use.
<!-- more details in section on garbage collection, link to it once it exists -->
[referential-integrity]: https://en.m.wikipedia.org/wiki/Referential_integrity
[garbage-collection]: https://en.m.wikipedia.org/wiki/Garbage_collection_(computer_science)
[immutable-object]: https://en.m.wikipedia.org/wiki/Immutable_object
[opaque-data-type]: https://en.m.wikipedia.org/wiki/Opaque_data_type
[unique-identifier]: https://en.m.wikipedia.org/wiki/Unique_identifier