Flex's regexes have an annoying feature: the dot matches everything
except a newline. This causes problems for expressions like:
"${0}\
"
where the backslash-newline combination matches this rule instead of the
intended one mentioned in the comment:
<STRING>\$|\\|\$\\ {
/* This can only occur when we reach EOF, otherwise the above
(...|\$[^\{\"\\]|\\.|\$\\.)+ would have triggered.
This is technically invalid, but we leave the problem to the
parser who fails with exact location. */
return STR;
}
However, the parser actually accepts the resulting token sequence
('"' DOLLAR_CURLY 0 '}' STR '"'), which is a problem because the lexer
rule didn't assign anything to yylval. Ultimately this leads to a crash
when dereferencing a NULL pointer in ExprConcatStrings::bindVars().
The fix does change the syntax of the language in some corner cases
but I think it's only turning previously invalid (or crashing) syntax
to valid syntax. E.g.
"a\
b"
and
''a''\
b''
were previously syntax errors but now both result in "a\nb".
Found by afl-fuzz.
With catch-all rules, we hide potential errors.
It turns out that a4744254 made one cath-all useless. Flex detected that
is was impossible to reach.
The other is more subtle, as it can only trigger on unfinished escapes
in unfinished strings, which only occurs at EOF.
Now, in addition to a."${b}".c, you can write a.${b}.c (applicable
wherever dynamic attributes are valid).
Signed-off-by: Shea Levy <shea@shealevy.com>
In Nixpkgs, the attribute in all-packages.nix corresponding to a
package is usually equal to the package name. However, this doesn't
work if the package contains a dash, which is fairly common. The
convention is to replace the dash with an underscore (e.g. "dbus-lib"
becomes "dbus_glib"), but that's annoying. So now dashes are valid in
variable / attribute names, allowing you to write:
dbus-glib = callPackage ../development/libraries/dbus-glib { };
and
buildInputs = [ dbus-glib ];
Since we don't have a negation or subtraction operation in Nix, this
is unambiguous.
brackets, e.g.
import <nixpkgs/pkgs/lib>
are resolved by looking them up relative to the elements listed in
the search path. This allows us to get rid of hacks like
import "${builtins.getEnv "NIXPKGS_ALL"}/pkgs/lib"
The search path can be specified through the ‘-I’ command-line flag
and through the colon-separated ‘NIX_PATH’ environment variable,
e.g.,
$ nix-build -I /etc/nixos ...
If a file is not found in the search path, an error message is
lazily thrown.
errors with position info.
* For all positions, use the position of the first character of the
first token, rather than the last character of the first token plus
one.
in attribute set pattern matches. This allows defining a function
that takes *at least* the listed attributes, while ignoring
additional attributes. For instance,
{stdenv, fetchurl, fuse, ...}:
stdenv.mkDerivation {
...
};
defines a function that requires an attribute set that contains the
specified attributes but ignores others. The main advantage is that
we can then write in all-packages.nix
aefs = import ../bla/aefs pkgs;
instead of
aefs = import ../bla/aefs {
inherit stdenv fetchurl fuse;
};
This saves a lot of typing (not to mention not having to update
all-packages.nix with purely mechanical changes). It saves as much
typing as the "args: with args;" style, but has the advantage that
the function arguments are properly declared (not implicit in what
the body of the "with" uses).
single quotes. Example (from NixOS):
job = ''
start on network-interfaces
start script
rm -f /var/run/opengl-driver
${if videoDriver == "nvidia"
then "ln -sf ${nvidiaDrivers} /var/run/opengl-driver"
else if cfg.driSupport
then "ln -sf ${mesa} /var/run/opengl-driver"
else ""
}
rm -f /var/log/slim.log
end script
'';
This style has two big advantages:
- \, ' and " aren't special, only '' and ${. So you get a lot less
escaping in shell scripts / configuration files in Nixpkgs/NixOS.
The delimiter '' is rare in scripts (and can usually be written as
""). ${ is also fairly rare.
Other delimiters such as <<...>>, {{...}} and <|...|> were also
considered but this one appears to have the fewest drawbacks
(thanks Martin).
- Indentation is intelligently stripped so that multi-line strings
can follow the nesting structure of the containing Nix
expression. E.g. in the example above 6 spaces are stripped from
the start of each line. This prevents unnecessary indentation in
generated files (which sometimes even breaks things).
See tests/lang/eval-okay-ind-string.nix for some examples.
concatenation and string coercion. This was a big mess (see
e.g. NIX-67). Contexts are now folded into strings, so that they
don't cause evaluation errors when they're not expected. The
semantics of paths has been clarified (see nixexpr-ast.def).
toString() and coerceToString() have been merged.
Semantic change: paths are now copied to the store when they're in a
concatenation (and in most other situations - that's the
formalisation of the meaning of a path). So
"foo " + ./bla
evaluates to "foo /nix/store/hash...-bla", not "foo
/path/to/current-dir/bla". This prevents accidental impurities, and
is more consistent with the treatment of derivation outputs, e.g.,
`"foo " + bla' where `bla' is a derivation. (Here `bla' would be
replaced by the output path of `bla'.)