This didn't work anymore since decompression was only done in the
non-coroutine case.
Decompressors are now sinks, just like compressors.
Also fixed a bug in bzip2 API handling (we have to handle BZ_RUN_OK
rather than BZ_OK), which we didn't notice because there was a missing
'throw':
if (ret != BZ_OK)
CompressionError("error while compressing bzip2 file");
This reduces memory consumption of
nix copy --from file://... --to ~/my-nix /nix/store/95cwv4q54dc6giaqv6q6p4r02ia2km35-blender-2.79
from 514 MiB to 18 MiB for an uncompressed binary cache, and from 192
MiB to 53 MiB for a bzipped binary cache. It may also be faster
because fetching can happen concurrently with decompression/writing.
Continuation of 48662d151b.
Issue https://github.com/NixOS/nix/issues/1681.
This allows specifying the AWS configuration profile to use. E.g.
nix copy --from s3://my-cache?profile=aws-dev-account /nix/store/cf3isrlqavvd5w7rpky1fa8j9lcnlggm-...
As far as we're concerned, not being able to access a file just means
the file is missing. Plus, AWS explicitly goes out of its way to
return a 403 if the file is missing and the requester doesn't have
permission to list the bucket.
Also getting rid of an old hack that Eelco said was only relevant
to an older AWS SDK.
Recently aws-sdk-cpp quietly switched to using S3 virtual host URIs
(https://github.com/aws/aws-sdk-cpp/commit/69d9c53882), i.e. it sends
requests to http://<bucket>.<region>.s3.amazonaws.com rather than
http://<region>.s3.amazonaws.com/<bucket>. However this interacts
badly with curl connection reuse. For example, if we do the following:
1) Check whether a bucket exists using GetBucketLocation.
2) If it doesn't, create it using CreateBucket.
3) Do operations on the bucket.
then 3) will fail for a minute or so with a NoSuchBucket exception,
presumably because the server being hit is a fallback for cases when
buckets don't exist.
Disabling the use of virtual hosts ensures that 3) succeeds
immediately. (I don't know what S3's consistency guarantees are for
bucket creation, but in practice buckets appear to be available
immediately.)
Newer versions of aws-sdk-cpp call CalculateDelayBeforeNextRetry()
even for non-retriable errors (like NoSuchKey) whih causes log spam in
hydra-queue-runner.
The typical use is to inherit Config and add Setting<T> members:
class MyClass : private Config
{
Setting<int> foo{this, 123, "foo", "the number of foos to use"};
Setting<std::string> bar{this, "blabla", "bar", "the name of the bar"};
MyClass() : Config(readConfigFile("/etc/my-app.conf"))
{
std::cout << foo << "\n"; // will print 123 unless overriden
}
};
Currently, this is used by Store and its subclasses for store
parameters. You now get a warning if you specify a non-existant store
parameter in a store URI.
So if "text-compression=br", the .ls file in S3 will get a
Content-Encoding of "br". Brotli appears to compress better than xz
for this kind of file and is natively supported by browsers.
You can now set the store parameter "text-compression=br" to compress
textual files in the binary cache (i.e. narinfo and logs) using
Brotli. This sets the Content-Encoding header; the extension of
compressed files is unchanged.
You can separately specify the compression of log files using
"log-compression=br". This is useful when you don't want to compress
narinfo files for backward compatibility.
This adds support for s3:// URIs in all places where Nix allows URIs,
e.g. in builtins.fetchurl, builtins.fetchTarball, <nix/fetchurl.nix>
and NIX_PATH. It allows fetching resources from private S3 buckets,
using credentials obtained from the standard places (i.e. AWS_*
environment variables, ~/.aws/credentials and the EC2 metadata
server). This may not be super-useful in general, but since we already
depend on aws-sdk-cpp, it's a cheap feature to add.
Because config.h can #define things like _FILE_OFFSET_BITS=64 and not
every compilation unit includes config.h, we currently compile half of
Nix with _FILE_OFFSET_BITS=64 and other half with _FILE_OFFSET_BITS
unset. This causes major havoc with the Settings class on e.g. 32-bit ARM,
where different compilation units disagree with the struct layout.
E.g.:
diff --git a/src/libstore/globals.cc b/src/libstore/globals.cc
@@ -166,6 +166,8 @@ void Settings::update()
_get(useSubstitutes, "build-use-substitutes");
+ fprintf(stderr, "at Settings::update(): &useSubstitutes = %p\n", &nix::settings.useSubstitutes);
_get(buildUsersGroup, "build-users-group");
diff --git a/src/libstore/remote-store.cc b/src/libstore/remote-store.cc
+++ b/src/libstore/remote-store.cc
@@ -138,6 +138,8 @@ void RemoteStore::initConnection(Connection & conn)
void RemoteStore::setOptions(Connection & conn)
{
+ fprintf(stderr, "at RemoteStore::setOptions(): &useSubstitutes = %p\n", &nix::settings.useSubstitutes);
conn.to << wopSetOptions
Gave me:
at Settings::update(): &useSubstitutes = 0xb6e5c5cb
at RemoteStore::setOptions(): &useSubstitutes = 0xb6e5c5c7
That was not a fun one to debug!
It failed with
AWS error uploading ‘6gaxphsyhg66mz0a00qghf9nqf7majs2.ls.xz’: Unable to parse ExceptionName: MissingContentLength Message: You must provide the Content-Length HTTP header.
possibly because the istringstream_nocopy introduced in
0d2ebb4373 doesn't supply the seek
method that the AWS library expects. So bring back the old version,
but only for S3BinaryCacheStore.
The fact that queryPathInfo() is synchronous meant that we needed a
thread for every concurrent binary cache lookup, even though they end
up being handled by the same download thread. Requiring hundreds of
threads is not a good idea. So now there is an asynchronous version of
queryPathInfo() that takes a callback function to process the
result. Similarly, enqueueDownload() now takes a callback rather than
returning a future.
Thus, a command like
nix path-info --store https://cache.nixos.org/ -r /nix/store/slljrzwmpygy1daay14kjszsr9xix063-nixos-16.09beta231.dccf8c5
that returns 4941 paths now takes 1.87s using only 2 threads (the main
thread and the downloader thread). (This is with a prewarmed
CloudFront.)
This allows commands like "nix verify --all" or "nix path-info --all"
to work on S3 caches.
Unfortunately, this requires some ugly hackery: when querying the
contents of the bucket, we don't want to have to read every .narinfo
file. But the S3 bucket keys only include the hash part of each store
path, not the name part. So as a special exception
queryAllValidPaths() can now return store paths *without* the name
part, and queryPathInfo() accepts such store paths (returning a
ValidPathInfo object containing the full name).