Commit graph

817 commits

Author SHA1 Message Date
Eelco Dolstra 71adb090f0 When using a build hook, only copy missing paths 2014-02-17 22:58:21 +01:00
Eelco Dolstra 69fe6c58fa Move some code around
In particular, do replacing of valid paths during repair later.  This
prevents us from replacing a valid path after the build fails.
2014-02-17 22:25:15 +01:00
Eelco Dolstra 00d30496ca Heuristically detect if a build may have failed due to a full disk
This will allow Hydra to detect that a build should not be marked as
"permanently failed", allowing it to be retried later.
2014-02-17 14:15:56 +01:00
Eelco Dolstra dba33d4018 Minor style fixes 2014-02-14 11:48:42 +01:00
Shea Levy 38c3beac1a Move StoreApi::serve into opServe
Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-10 06:52:48 -05:00
Shea Levy 1614603165 Pass in params by const ref
Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-10 06:49:37 -05:00
Shea Levy 64e23d0a38 Add download-via-ssh substituter
This substituter connects to a remote host, runs nix-store --serve
there, and then forwards substituter commands on to the remote host and
sends their results to the calling program. The ssh-substituter-hosts
option can be specified as a list of hosts to try.

This is an initial implementation and, while it works, it has some
limitations:

* Only the first host is used
* There is no caching of query results (all queries are sent to the
  remote machine)
* There is no informative output (such as progress bars)
* Some failure modes may cause unhelpful error messages
* There is no concept of trusted-ssh-substituter-hosts

Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:33 -05:00
Shea Levy 5671188eb2 nix-store --serve: Flush out after every loop
Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:33 -05:00
Shea Levy 73874629ef nix-store --serve: Use dump instead of export
Also remove signing support

Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:33 -05:00
Shea Levy 188f96500b nix-store --serve: Don't fail if asked for info about non-valid path
Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:33 -05:00
Shea Levy 9488447594 nix-store --serve: Don't loop forever
nix-store --export takes a tmproot, which can only release by exiting.
Substituters don't currently work in a way that could take advantage of
the looping, anyway.

Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:32 -05:00
Shea Levy 3a38d0f356 Add the nix-store --serve command
This is essentially the substituter API operating on the local store,
which will be used by the ssh substituter. It runs in a loop rather than
just taking one command so that in the future nix will be able to keep
one connection open for multiple instances of the substituter.

Signed-off-by: Shea Levy <shea@shealevy.com>
2014-02-08 00:13:32 -05:00
Eelco Dolstra d210cdc435 Fix assertion failure in ‘nix-store --load-db’
Namely:

  nix-store: derivations.cc:242: nix::Hash nix::hashDerivationModulo(nix::StoreAPI&, nix::Derivation): Assertion `store.isValidPath(i->first)' failed.

This happened because of the derivation output correctness check being
applied before the references of a derivation are valid.
2014-02-03 22:36:07 +01:00
Eelco Dolstra d6582c04c1 Give a friendly error message if the DB directory is not writable
Previously we would say "error: setting synchronous mode: unable to
open database file" which isn't very helpful.
2014-02-01 16:57:38 +01:00
Eelco Dolstra 6ef32bddc1 Fix "make dist" 2014-02-01 14:38:12 +01:00
Eelco Dolstra 0c6d62cf27 Remove Automakefiles 2014-02-01 13:54:38 +01:00
Eelco Dolstra 16e7d69209 Update Makefile variable names 2014-02-01 13:54:38 +01:00
Eelco Dolstra e0234dfddc Rename Makefile -> local.mk 2014-01-30 12:11:06 +01:00
Eelco Dolstra 94f9c14d52 Fix some clang warnings 2014-01-21 18:29:55 +01:00
Eelco Dolstra 81628a6ccc Merge branch 'master' into make
Conflicts:
	src/libexpr/eval.cc
2014-01-21 15:30:01 +01:00
Eelco Dolstra b1db599dd0 Generate schema.sql.hh 2014-01-09 22:10:35 +01:00
Eelco Dolstra b4c684e0f9 Update Makefiles 2014-01-09 16:53:47 +01:00
Eelco Dolstra 11cb4bfb25 Fix checking of NAR hashes
*headdesk*
*headdesk*
*headdesk*

So since commit 22144afa8d, Nix hasn't
actually checked whether the content of a downloaded NAR matches the
hash specified in the manifest / NAR info file.  Urghhh...
2014-01-08 17:35:49 +01:00
Domen Kožar 485f4740ee wording 2014-01-06 11:38:24 +01:00
Eelco Dolstra a6add93d73 Garbage collector: Release locks on temporary root files
This allows processes waiting for such locks to proceed during the
trash deletion phase of the garbage collector.
2013-12-10 13:13:59 +01:00
Eelco Dolstra c5b8fe3151 Print a trace message if a build fails due to the platform being unknown 2013-12-05 14:31:57 -05:00
Eelco Dolstra 7ce0e05ad8 Rename Makefile.new -> Makefile 2013-11-25 15:25:13 +00:00
Eelco Dolstra 2bd0fcc966 Use libnix as a prefix for all Nix libraries
In particular "libutil" was always a problem because it collides with
Glibc's libutil.  Even if we install into $(libdir)/nix, the linker
sometimes got confused (e.g. if a program links against libstore but
not libutil, then ld would report undefined symbols in libstore
because it was looking at Glibc's libutil).
2013-11-23 23:53:41 +00:00
Eelco Dolstra 90dfb37f14 Allow (dynamic) libraries to depend on other libraries 2013-11-23 20:11:02 +00:00
Eelco Dolstra 6b5f89f2cf Drop the dependency on Automake 2013-11-22 19:30:24 +00:00
Eelco Dolstra 754c05ed6c Rename $(here) to $(d) for brevity, and remove trailing slash 2013-11-22 16:45:52 +00:00
Eelco Dolstra 62e35cc3a8 Add ‘make dist’ support 2013-11-22 16:42:25 +01:00
Eelco Dolstra b8e9efc476 New non-recursive, plain Make-based build system 2013-11-22 15:54:18 +01:00
Eelco Dolstra 709cbe4e76 Include <cstring> for memset
This should fix building on Illumos.
2013-11-22 10:00:43 +00:00
Eelco Dolstra a478e8a7bb Remove nix-setuid-helper
AFAIK, nobody uses it, it's not maintained, and it has no tests.
2013-11-14 11:57:37 +01:00
Eelco Dolstra 89e6781cc5 Make function calls show up in stack traces again
Note that adding --show-trace prevents functions calls from being
tail-recursive, so an expression that evaluates without --show-trace
may fail with a stack overflow if --show-trace is given.
2013-11-12 12:51:59 +01:00
Eelco Dolstra c086183843 For auto roots, show the intermediate link
I.e. "nix-store -q --roots" will now show (for example)

  /home/eelco/Dev/nixpkgs/result

rather than

  /nix/var/nix/gcroots/auto/53222qsppi12s2hkap8dm2lg8xhhyk6v
2013-10-22 11:39:10 +02:00
Eelco Dolstra a737f51fd9 Retry all SQLite operations
To deal with SQLITE_PROTOCOL, we also need to retry read-only
operations.
2013-10-16 15:58:20 +02:00
Eelco Dolstra ff02f5336c Fix a race in registerFailedPath()
Registering the path as failed can fail if another process does the
same thing after the call to hasPathFailed().  This is extremely
unlikely though.
2013-10-16 14:55:53 +02:00
Eelco Dolstra 4bd5282573 Convenience macros for retrying a SQLite transaction 2013-10-16 14:46:35 +02:00
Eelco Dolstra bce14d0f61 Don't wrap read-only queries in a transaction
There is no risk of getting an inconsistent result here: if the ID
returned by queryValidPathId() is deleted from the database
concurrently, subsequent queries involving that ID will simply fail
(since IDs are never reused).
2013-10-16 14:36:53 +02:00
Eelco Dolstra 7cdefdbe73 Print a distinct warning for SQLITE_PROTOCOL 2013-10-16 14:27:36 +02:00
Eelco Dolstra d05bf04444 Treat SQLITE_PROTOCOL as SQLITE_BUSY
In the Hydra build farm we fairly regularly get SQLITE_PROTOCOL errors
(e.g., "querying path in database: locking protocol").  The docs for
this error code say that it "is returned if some other process is
messing with file locks and has violated the file locking protocol
that SQLite uses on its rollback journal files."  However, the SQLite
source code reveals that this error can also occur under high load:

  if( cnt>5 ){
    int nDelay = 1;                      /* Pause time in microseconds */
    if( cnt>100 ){
      VVA_ONLY( pWal->lockError = 1; )
      return SQLITE_PROTOCOL;
    }
    if( cnt>=10 ) nDelay = (cnt-9)*238;  /* Max delay 21ms. Total delay 996ms */
    sqlite3OsSleep(pWal->pVfs, nDelay);
  }

i.e. if certain locks cannot be not acquired, SQLite will retry a
number of times before giving up and returing SQLITE_PROTOCOL.  The
comments say:

  Circumstances that cause a RETRY should only last for the briefest
  instances of time.  No I/O or other system calls are done while the
  locks are held, so the locks should not be held for very long. But
  if we are unlucky, another process that is holding a lock might get
  paged out or take a page-fault that is time-consuming to resolve,
  during the few nanoseconds that it is holding the lock.  In that case,
  it might take longer than normal for the lock to free.
  ...
  The total delay time before giving up is less than 1 second.

On a heavily loaded machine like lucifer (the main Hydra server),
which often has dozens of processes waiting for I/O, it seems to me
that a page fault could easily take more than a second to resolve.
So, let's treat SQLITE_PROTOCOL as SQLITE_BUSY and retry the
transaction.

Issue NixOS/hydra#14.
2013-10-16 14:19:59 +02:00
Eelco Dolstra 936f9d45ba Don't apply the CPU affinity hack to nix-shell (and other Perl programs)
As discovered by Todd Veldhuizen, the shell started by nix-shell has
its affinity set to a single CPU.  This is because nix-shell connects
to the Nix daemon, which causes the affinity hack to be applied.  So
we turn this off for Perl programs.
2013-09-06 16:36:56 +02:00
Eelco Dolstra b29d3f4aee Only show trace messages when tracing is enabled 2013-09-02 12:01:04 +02:00
Eelco Dolstra efe4289464 Add an option to limit the log output of builders
This is mostly useful for Hydra to deal with builders that get stuck
in an infinite loop writing data to stdout/stderr.
2013-09-02 11:58:18 +02:00
Ivan Kozik 34bb806f74 Fix typos, especially those that end up in the Nix manual 2013-08-26 11:15:22 +02:00
Gergely Risko c6c024ca6f Fix personality switching from x86_64 to i686
On Linux, Nix can build i686 packages even on x86_64 systems.  It's not
enough to recognize this situation by settings.thisSystem, we also have
to consult uname().  E.g. we can be running on a i686 Debian with an
amd64 kernel.  In that situation settings.thisSystem is i686-linux, but
we still need to change personality to i686 to make builds consistent.
2013-08-26 11:12:35 +02:00
Eelco Dolstra a583a2bc59 Run the daemon worker on the same CPU as the client
On a system with multiple CPUs, running Nix operations through the
daemon is significantly slower than "direct" mode:

$ NIX_REMOTE= nix-instantiate '<nixos>' -A system
real    0m0.974s
user    0m0.875s
sys     0m0.088s

$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real    0m2.118s
user    0m1.463s
sys     0m0.218s

The main reason seems to be that the client and the worker get moved
to a different CPU after every call to the worker.  This patch adds a
hack to lock them to the same CPU.  With this, the overhead of going
through the daemon is very small:

$ NIX_REMOTE=daemon nix-instantiate '<nixos>' -A system
real    0m1.074s
user    0m0.809s
sys     0m0.098s
2013-08-07 14:02:04 +02:00
Eelco Dolstra a4921b8ceb Revert "build-remote.pl: Enforce timeouts locally"
This reverts commit 69b8f9980f.

The timeout should be enforced remotely.  Otherwise, if the garbage
collector is running either locally or remotely, if will block the
build or closure copying for some time.  If the garbage collector
takes too long, the build may time out, which is not what we want.
Also, on heavily loaded systems, copying large paths to and from the
remote machine can take a long time, also potentially resulting in a
timeout.
2013-07-18 12:52:29 +02:00