lix/src/libutil/url-parts.hh
Maximilian Bosch 93a8a005de
libstore/openStore: fix stores with IPv6 addresses
In `nixStable` (2.3.7 to be precise) it's possible to connect to stores
using an IPv6 address:

  nix ping-store --store ssh://root@2001:db8::1

This is also useful for `nixops(1)` where you could specify an IPv6
address in `deployment.targetHost`.

However, this behavior is broken on `nixUnstable` and fails with the
following error:

  $ nix store ping --store ssh://root@2001:db8::1
  don't know how to open Nix store 'ssh://root@2001:db8::1'

This happened because `openStore` from `libstore` uses the `parseURL`
function from `libfetchers`, which expects a valid URL as defined in
RFC2732, i.e. with the IPv6 address in brackets. However, `ssh(1)` does
not accept the bracketed form:

  $ nix store ping --store 'ssh://root@[2001:db8::1]'
  cannot connect to 'root@[2001:db8::1]'

This patch now allows both ways of specifying a store: `root@2001:db8::1`
and `root@[2001:db8::1]`. The latter is useful for passing query
parameters to the remote store.

In order to achieve this, the following changes were made:

* The URL regex from `url-parts.hh` now allows an IPv6 address in the
  form `2001:db8::1` and also `[2001:db8::1]`.

* In `libstore`, a new function named `extractConnStr` ensures that a
  proper URL is passed to e.g. `ssh(1)`:

  * If a URL looks like either `[2001:db8::1]` or `root@[2001:db8::1]`,
    the brackets will be removed using a regex. No additional validation
    is done here as only strings parsed by `parseURL` are expected.

  * In any other case, the string will be left untouched.

* The rules above only apply for `LegacySSHStore` and `SSHStore` (a.k.a.
  `ssh://` and `ssh-ng://`).
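
The bracket-stripping step described above can be sketched roughly as
follows. This is only an illustration: `stripIPv6Brackets` is a
hypothetical name, and the actual `extractConnStr` in `libstore` may use
a different regex and signature.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-in for the bracket-stripping step; not the real
// extractConnStr from libstore.
std::string stripIPv6Brackets(const std::string & connStr)
{
    // Group 1: optional "user@" prefix.
    // Group 2: the run of hex digits and colons inside the brackets.
    static const std::regex bracketed("((?:[^@]*@)?)\\[([0-9a-fA-F:]+)\\]");
    std::smatch m;
    if (std::regex_match(connStr, m, bracketed))
        return m[1].str() + m[2].str();
    // Plain hostnames and unbracketed addresses are left untouched.
    return connStr;
}
```

With this sketch, `root@[2001:db8::1]` becomes `root@2001:db8::1`, while
`root@example.org` is returned unchanged. As in the description above, no
validation of the address itself is attempted.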

Unresolved questions:

* I'm not really sure whether we want to allow both variants of IPv6
  addresses in the URL parser. However it should be noted that both seem
  to be possible according to RFC2732:

  > This document includes an update to the generic syntax for Uniform
  > Resource Identifiers defined in RFC 2396 [URL].  It defines a syntax
  > for IPv6 addresses and allows the use of "[" and "]" within a URI
  > explicitly for this reserved purpose.

* Currently, specifying a port number after the hostname is not
  supported. However, the URL parser does not really seem to support
  this either, so it is probably out of scope here.
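
One difficulty with ports can be illustrated with a small sketch. The
helper `splitHostPort` and the struct `HostPort` below are hypothetical
and not part of Nix. The pattern mirrors the shape of the IPv6 and port
fragments in `url-parts.hh`, with capture groups added for illustration.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type, for illustration only.
struct HostPort {
    std::string host;
    std::optional<std::string> port;
};

// Hypothetical helper: splits an IPv6 host from an optional ":port" suffix.
std::optional<HostPort> splitHostPort(const std::string & s)
{
    // Group 1: bracketed address, group 2: bare address, group 3: port.
    static const std::regex re(
        "(?:\\[([0-9a-fA-F:]+)\\]|([0-9a-fA-F:]+))(?::([0-9]+))?");
    std::smatch m;
    if (!std::regex_match(s, m, re))
        return std::nullopt;
    HostPort result;
    result.host = m[1].matched ? m[1].str() : m[2].str();
    if (m[3].matched)
        result.port = m[3].str();
    return result;
}
```

For `[2001:db8::1]:22` the sketch yields host `2001:db8::1` and port `22`.
For the unbracketed `2001:db8::1:22` the greedy address pattern consumes
the trailing `:22` as part of the host, so no port is recognised. A port
can therefore only be told apart reliably in the bracketed form.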
2020-12-09 12:23:29 +01:00

46 lines
2.3 KiB
C++

#pragma once
#include <string>
#include <regex>
namespace nix {
// URI stuff.
const static std::string pctEncoded = "(?:%[0-9a-fA-F][0-9a-fA-F])";
const static std::string schemeRegex = "(?:[a-z][a-z0-9+.-]*)";
const static std::string ipv6AddressSegmentRegex = "[0-9a-fA-F:]+";
const static std::string ipv6AddressRegex = "(?:\\[" + ipv6AddressSegmentRegex + "\\]|" + ipv6AddressSegmentRegex + ")";
const static std::string unreservedRegex = "(?:[a-zA-Z0-9-._~])";
const static std::string subdelimsRegex = "(?:[!$&'\"()*+,;=])";
const static std::string hostnameRegex = "(?:(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + ")*)";
const static std::string hostRegex = "(?:" + ipv6AddressRegex + "|" + hostnameRegex + ")";
const static std::string userRegex = "(?:(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + "|:)*)";
const static std::string authorityRegex = "(?:" + userRegex + "@)?" + hostRegex + "(?::[0-9]+)?";
const static std::string pcharRegex = "(?:" + unreservedRegex + "|" + pctEncoded + "|" + subdelimsRegex + "|[:@])";
const static std::string queryRegex = "(?:" + pcharRegex + "|[/? \"])*";
const static std::string segmentRegex = "(?:" + pcharRegex + "+)";
const static std::string absPathRegex = "(?:(?:/" + segmentRegex + ")*/?)";
const static std::string pathRegex = "(?:" + segmentRegex + "(?:/" + segmentRegex + ")*/?)";
// A Git ref (i.e. branch or tag name).
const static std::string refRegexS = "[a-zA-Z0-9][a-zA-Z0-9_.-]*"; // FIXME: check
extern std::regex refRegex;
// Instead of defining what a good Git Ref is, we define what a bad Git Ref is
// This is because of the definition of a ref in refs.c in https://github.com/git/git
// See tests/fetchGitRefs.sh for the full definition
const static std::string badGitRefRegexS = "//|^[./]|/\\.|\\.\\.|[[:cntrl:][:space:]:?^~\\[]|\\\\|\\*|\\.lock$|\\.lock/|@\\{|[/.]$|^@$|^$";
extern std::regex badGitRefRegex;
// A Git revision (a SHA-1 commit hash).
const static std::string revRegexS = "[0-9a-fA-F]{40}";
extern std::regex revRegex;
// A ref or revision, or a ref followed by a revision.
const static std::string refAndOrRevRegex = "(?:(" + revRegexS + ")|(?:(" + refRegexS + ")(?:/(" + revRegexS + "))?))";
const static std::string flakeIdRegexS = "[a-zA-Z][a-zA-Z0-9_-]*";
extern std::regex flakeIdRegex;
}
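
A minimal usage sketch for the IPv6 fragments defined above. The two
string constants are copied from this header so that the example stands
alone; `looksLikeIPv6Host` is a hypothetical helper that does not exist
in Nix.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Copied from url-parts.hh so the sketch is self-contained.
const std::string ipv6AddressSegmentRegex = "[0-9a-fA-F:]+";
const std::string ipv6AddressRegex =
    "(?:\\[" + ipv6AddressSegmentRegex + "\\]|" + ipv6AddressSegmentRegex + ")";

// Hypothetical helper: true if the whole string matches ipv6AddressRegex.
bool looksLikeIPv6Host(const std::string & s)
{
    const std::regex re(ipv6AddressRegex);
    return std::regex_match(s, re);
}
```

Both `2001:db8::1` and `[2001:db8::1]` are accepted, while `example.org`
and the unbalanced `[2001:db8::1` are rejected. The pattern only checks
the character set, so a string such as `:::` also matches. It is a
syntactic filter, not an address validator.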