lix/src/libstore/ca-specific-schema.sql
Sergei Trofimovich edfc5b2f12 ca-specific-schema.sql: add index on RealisationsRefs(referrer) and (outputPath)
For a typical desktop system (~2K packages) we can easily get 100K
entries in RealisationsRefs. Without indices query for RealisationsRefs
requires linear scan.

RealisationsRefs(referrer)
--------------------------

Inefficiency is seen as a 100% CPU load of nix-daemon for the following
scenario:

    $ nix edit -f . bash # add unused environment variable, like FOO="1"
    # populate RealisationsRefs, build fresh system
    $ nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'
    $ nix edit -f . bash # add unused environment variable, like FOO="2"
    $ time nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'

In this case `bash `will be rebuilt a few times and then rest of CPU
time is spent on scanning RealisationsRefs table (about 5 CPU-minutes
on my machine).

Before the change:

    $ time nix build -f nixos system ... # step 4 above
    real    34m3,613s
    user    0m5,232s
    sys     0m0,758s

Of all this time about 29.5 minutes are taken by nix-daemon's CPU time.

After the change:

    $ time nix build -f nixos system ... # step 4 above
    real    4m50,061s
    user    0m5,038s
    sys     0m0,677s

Of all this time about 1 minute is taken by nix-daemon's CPU time.
Most of the time is spent polling for non-existent realisations on
cache-nixos.org.

Realisations(outputPath)
------------------------

After running CA system for two weeks I got ~1M entries in Realisations
table. `nix-collect-garbage` became very slow (seemingly 100 path deletions
per second). It happens due to a slow cascading delete from Realisations
triggered by deletion from ValidPaths.

The fix is to add an index on primary key from ValidPaths(id) that
triggers cascading deletions.

Before the change:
    $ time nix-collect-garbage -d --max-freed 100G
    <interrupted before finish, took too long>
    real    23m32.411s
    user    17m49.679s
    sys     4m50.609s

Most of time was spent in re-scanning Realisations table on each path deletion.

After the change:
    $ time nix-collect-garbage -d --max-freed 100G

    real    8m43.226s
    user    6m16.317s
    sys     1m40.188s

Time is spent scanning sqlite indices and in kernel when unlinking directories.
2021-11-10 08:32:05 +00:00

26 lines
1.1 KiB
SQL

-- Extension of the sql schema for content-addressed derivations.
-- Won't be loaded unless the experimental feature `ca-derivations`
-- is enabled
create table if not exists Realisations (
id integer primary key autoincrement not null,
drvPath text not null,
outputName text not null, -- symbolic output id, usually "out"
outputPath integer not null,
signatures text, -- space-separated list
foreign key (outputPath) references ValidPaths(id) on delete cascade
);
create index if not exists IndexRealisations on Realisations(drvPath, outputName);
create table if not exists RealisationsRefs (
referrer integer not null,
realisationReference integer,
foreign key (referrer) references Realisations(id) on delete cascade,
foreign key (realisationReference) references Realisations(id) on delete restrict
);
-- used by QueryRealisationReferences
create index if not exists IndexRealisationsRefs on RealisationsRefs(referrer);
-- used by cascade deletion when ValidPaths is deleted
create index if not exists IndexRealisationsRefsOnOutputPath on Realisations(outputPath);