Compare commits

...

12 commits

Author SHA1 Message Date
eldritch horrors 112fd6c971 rewrite the parser with pegtl instead of flex/bison
this gives about a 20% performance improvement on pure parsing. obviously
it'll be less on full eval, but depending on how much parsing has to be
done (eg whether nixpkgs haskell modules are included or not) it ranges
anywhere from 4% to 10% in our tests.

this has been tested with thousands of core hours of fuzz testing to
ensure that the ASTs produced by the new parser are exactly the same as
the ones produced by the old parser. error messages will
change (sometimes a lot) and are currently not perfect, but we'd rather
leave that open for improvement than having this work rot forever.

Change-Id: Ie66ec2d045dec964632c6541e25f8f0797319ee2
2024-03-16 18:07:01 +01:00
eldritch horrors 8225284df3 add expr memory management
with the preparatory work done this mostly means turning plain pointers
into unique_ptrs, with all the associated churn that necessitates.

Change-Id: I0c238c118617420650432f4ed45569baa3e3f413
2024-03-16 15:44:20 +01:00
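To make the pointer-to-`unique_ptr` change above concrete, here is a minimal sketch of a tree whose nodes own their children. The `Node`, `Lit` and `Add` types are hypothetical stand-ins invented for illustration; they are not the actual `Expr` classes touched by this commit.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical AST node types, for illustration only.
struct Node {
    virtual ~Node() = default;
    virtual int eval() const = 0;
};

struct Lit : Node {
    int v;
    explicit Lit(int v) : v(v) {}
    int eval() const override { return v; }
};

struct Add : Node {
    // Each child is owned by its parent and freed when the parent is destroyed.
    std::unique_ptr<Node> lhs, rhs;
    Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : lhs(std::move(l)), rhs(std::move(r))
    {
        assert(lhs && rhs); // children are expected to be non-null
    }
    int eval() const override { return lhs->eval() + rhs->eval(); }
};
```

Building `Add(std::make_unique<Lit>(1), std::make_unique<Lit>(2))` and letting it go out of scope releases both literals without any manual `delete`, which is the kind of churn the commit message refers to.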
eldritch horrors 840e9a0113 pass Exprs as references, not pointers
almost all places where Exprs are passed as pointers expect the pointers
to be non-null. pass them as references instead to encode this
constraint in types.

Change-Id: Ia98f166fec3c23151f906e13acb4a0954a5980a2
2024-03-16 15:40:14 +01:00
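The effect of passing references instead of pointers can be sketched as follows. `Node`, `evalPtr` and `evalRef` are hypothetical names chosen for this example and do not appear in the change itself.

```cpp
#include <cassert>

// Hypothetical stand-in for an AST node; not the real Expr type.
struct Node {
    int value;
    int eval() const { return value; }
};

// Pointer version: the signature permits nullptr, so the callee has to
// check for it or silently trust every caller.
int evalPtr(const Node * n)
{
    assert(n != nullptr);
    return n->eval();
}

// Reference version: the non-null requirement is part of the type, so
// the caller must already hold a valid object.
int evalRef(const Node & n)
{
    return n.eval();
}
```

With the reference form, a caller that only has a pointer must dereference it explicitly (`evalRef(*p)`), which makes the place where non-nullness is assumed visible at the call site.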
eldritch horrors f86eafa3b6 store ExprConcatStrings elements as direct vector
storing a pointer only adds an unnecessary indirection and memory allocation.

Change-Id: If06dd05effdf1ccb0df0873580f50c775608925d
2024-03-16 15:18:17 +01:00
eldritch horrors 4971f6838a don't immediately throw parser errors
now that destructors are hooked up we want to give the C skeleton a
chance to actually run them. since bison does not call destructors on
values that have been passed to semantic actions even when the action
causes an abort we will also have to do some manual deleting.

partially reverts e8d9de967f.

Change-Id: Ia22bdaa9e969b74e17a6c496e35e6c2d86b7d750
2024-03-16 15:15:13 +01:00
eldritch horrors 99e03f5661 hook up bison destructors for state objects
this doesn't help much yet since the state objects themselves also leak
all memory they are given.

Change-Id: I80245b0c747308e80923e7f18ce4e1a4898f93b0
2024-03-16 15:11:53 +01:00
pennae 6b0620f387 use byte indexed locations for PosIdx
we now keep not a table of all positions, but a table of all origins and
their sizes. position indices are now direct pointers into the virtual
concatenation of all parsed contents. this slightly reduces memory usage
and time spent in the parser, at the cost of not being able to report
positions if the total input size exceeds 4GiB. this limit is not unique
to nix though, rustc and clang also limit their input to 4GiB (although
at least clang refuses to process inputs that are larger, we will not).

this new 4GiB limit probably will not cause any problems for quite a
while, all of nixpkgs together is less than 100MiB in size and already
needs over 700MiB of memory and multiple seconds just to parse. 4GiB
worth of input will easily take multiple minutes and over 30GiB of
memory without even evaluating anything. if problems *do* arise we can
probably recover the old table-based system by adding some tracking to
Pos::Origin (or increasing the size of PosIdx outright), but for the time
being this looks like more complexity than it's worth.

since we now need to read the entire input again to determine the
line/column of a position we'll make unsafeGetAttrPos slightly lazy:
mostly the set it returns is only used to determine the file of origin
of an attribute, not its exact location. the thunks do not add
measurable runtime overhead.

notably this change is necessary to allow changing the parser since
apparently nothing supports nix's very idiosyncratic line ending choice
of "anything goes", making it very hard to calculate line/column
positions in the parser (while byte offsets are very easy).

(cherry picked from commit 5d9fdab3de0ee17c71369ad05806b9ea06dfceda)
Change-Id: Ie0b2430cb120c09097afa8c0101884d94f4bbf34
2024-03-15 19:28:25 +01:00
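The "table of origins" idea described in this commit message can be sketched roughly as below. This is an illustration under assumptions, not the data structure from the change: `OriginTable`, `add` and `resolve` are hypothetical names, inputs are assumed to be non-empty, and the sketch does not guard against the total size exceeding the 32-bit range mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch: a position is a byte offset into the virtual
// concatenation of all inputs registered so far.
struct OriginTable {
    // Key: global offset of an origin's first byte. Value: (name, contents).
    std::map<uint32_t, std::pair<std::string, std::string>> origins;
    uint32_t nextOffset = 0;

    // Register an input and return the global offset of its first byte.
    uint32_t add(std::string name, std::string contents)
    {
        assert(!contents.empty()); // empty inputs are not handled in this sketch
        uint32_t start = nextOffset;
        nextOffset += static_cast<uint32_t>(contents.size());
        origins.emplace(start, std::make_pair(std::move(name), std::move(contents)));
        return start;
    }

    // Map a global offset back to (origin name, offset within that origin).
    std::pair<std::string, uint32_t> resolve(uint32_t idx) const
    {
        assert(!origins.empty() && idx < nextOffset);
        // upper_bound finds the first origin starting after idx; the one
        // before it is the origin that contains idx.
        auto it = std::prev(origins.upper_bound(idx));
        return {it->second.first, idx - it->first};
    }
};
```

Only the origin lookup is cheap here; turning the in-origin offset into a line and column still requires scanning the contents, which is why the commit makes the line and column attributes lazy.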
pennae b24bac3a8f diagnose "unexpected EOF" at EOF
this needs a string comparison because there seems to be no other way to
get that information out of bison. usually the location info is going to
be correct (pointing at a bad token), but since EOF isn't a token as
such it'll be wrong in this case.

this hasn't shown up much so far because a single line ending *is* a
token, so any file formatted in the usual manner (ie, ending in a line
ending) would have its EOF position reported correctly.

(cherry picked from commit 855fd5a1bb781e4f722c1d757ba43e866d370132)
Change-Id: I120c56a962f4286b1ae3b71da7b71ce8ec3e0535
2024-03-15 19:28:23 +01:00
pennae bcef14cb71 match line endings used by parser and error reports
the parser treats a plain \r as a newline, error reports do not. this
can lead to interesting divergences if anything makes use of this
feature, with error reports pointing to wrong locations in the input (or
even outside the input altogether).

(cherry picked from commit 2be6b143289e5479cc4a2667bb84e879116c2447)
Change-Id: Ieb7f7655bac8cb0cf5734c60bd41723388f2973c
2024-03-15 19:28:20 +01:00
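A line/column computation that agrees with the line-ending rule described above (`\n`, `\r\n` and a lone `\r` each count as one line ending) might look like the following sketch. `lineColumn` is a hypothetical helper written for this illustration; it is not the function used by the error reporter.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper: 1-based (line, column) of byte `offset` in `text`,
// treating "\n", "\r\n" and a lone "\r" each as a single line ending.
std::pair<size_t, size_t> lineColumn(std::string_view text, size_t offset)
{
    assert(offset <= text.size());
    size_t line = 1, column = 1;
    for (size_t i = 0; i < offset; i++) {
        char c = text[i];
        if (c == '\r') {
            // Count "\r\n" once: skip the '\n' if it also lies before `offset`.
            if (i + 1 < offset && text[i + 1] == '\n')
                i++;
            line++;
            column = 1;
        } else if (c == '\n') {
            line++;
            column = 1;
        } else {
            column++;
        }
    }
    return {line, column};
}
```

If the parser and the reporter disagree on whether a lone `\r` ends a line, the two would count different line numbers for the same byte, which is the divergence this commit removes.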
pennae 46e8caabb1 report inherit attr errors at the duplicate name
previously we reported the error at the beginning of the binding
block (for plain inherits) or the beginning of the attr list (for
inherit-from), effectively hiding where exactly the error happened.

this also carries over to runtime positions of attributes in sets as
reported by unsafeGetAttrPos. we're not worried about this changing
observable eval behavior because it *is* marked unsafe, and the new
behavior is much more useful.

(cherry picked from commit 1edd6fada53553b89847ac3981ac28025857ca02)
Change-Id: I2f50eb9f3dc3977db4eb3e3da96f1cb37ccd5174
2024-03-15 19:28:17 +01:00
pennae eac8c6e280 normalize formal order on ExprLambda::show
we already normalize attr order to lexicographic, doing the same for
formals makes sense. doubly so because the order of formals would
otherwise depend on the context of the expression, which is not quite as
useful as one might expect.

(cherry picked from commit 4147ecfb1c51f3fe3b4adcbd4e753fd487dab645)
Change-Id: I3fd0dbdef3ac7447a3a03ff20bb514a0d0f23fb1
2024-03-15 19:28:13 +01:00
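The normalization described above amounts to sorting formals by name before printing them. A minimal sketch, assuming formals are plain strings (the real code works with interned symbols) and using the hypothetical name `showFormals`:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: print formals in lexicographic order so the
// output does not depend on the order in which the names were created.
std::string showFormals(std::vector<std::string> formals)
{
    std::sort(formals.begin(), formals.end());
    std::string out = "{ ";
    for (std::size_t i = 0; i < formals.size(); i++) {
        if (i > 0)
            out += ", ";
        out += formals[i];
    }
    out += " }";
    return out;
}
```

Because the vector is taken by value, the caller's own ordering is left untouched; only the printed form is normalized.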
pennae ad708d3de3 keep copies of parser inputs that are in-memory only
the parser modifies its inputs, which means that sharing them between
the error context reporting system and the parser itself can confuse the
reporting system. usually this led to early truncation of error context
reports which, while not dangerous, can be quite confusing.

(cherry picked from commit d384ecd553aa997270b79ee98d02f7cf7e1849e6)
Change-Id: I677646b5675b12b2faa787943646aa36dc6e6ee3
2024-03-15 19:26:59 +01:00
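The problem described above, and the fix of keeping a separate copy, can be illustrated with a small sketch. Everything here is hypothetical: `mungeInPlace` merely stands in for a parser that rewrites its buffer, and `ParsedInput` and `prepare` are names invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a parser that modifies its input buffer,
// here by normalising carriage returns in place.
void mungeInPlace(std::string & buf)
{
    for (auto & c : buf)
        if (c == '\r')
            c = '\n';
}

struct ParsedInput {
    std::shared_ptr<const std::string> pristine; // kept for error context
    std::string scratch;                         // handed to the parser
};

// Copy the input once for error reporting before the parser touches it.
ParsedInput prepare(const std::string & input)
{
    ParsedInput p{std::make_shared<const std::string>(input), input};
    // Two NUL terminators, mirroring what the diff appends to the parser buffer.
    p.scratch.append("\0\0", 2);
    assert(p.scratch.size() == input.size() + 2);
    return p;
}
```

After `mungeInPlace(p.scratch)`, `*p.pristine` still holds the bytes the user supplied, so error context is rendered from unmodified text.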
58 changed files with 2190 additions and 1131 deletions

View file

@ -342,6 +342,16 @@ AC_SUBST(doc_generate)
# Look for lowdown library.
PKG_CHECK_MODULES([LOWDOWN], [lowdown >= 0.9.0], [CXXFLAGS="$LOWDOWN_CFLAGS $CXXFLAGS"])
# Look for pegtl.
# pegtl has only cmake support, no pkg-config.
AC_ARG_VAR([PEGTL_HEADERS], [include path of pegtl headers])
AC_LANG_PUSH(C++)
AC_SUBST(PEGTL_HEADERS)
[CXXFLAGS="-I $PEGTL_HEADERS $CXXFLAGS"]
AC_CHECK_HEADER(tao/pegtl.hpp, [], [AC_MSG_ERROR([PEGTL not found.])])
AC_LANG_POP(C++)
# Setuid installations.
AC_CHECK_FUNCS([setresuid setreuid lchown])

View file

@ -0,0 +1,7 @@
---
synopsis: consistent order of lambda formals in printed expressions
prs: 9874
---
Always print lambda formals in lexicographic order rather than the internal, creation-time based symbol order.
This makes printed formals independent of the context they appear in.

View file

@ -0,0 +1,6 @@
---
synopsis: fix duplicate attribute error positions for `inherit`
prs: 9874
---
When an inherit caused a duplicate attribute error the position of the error was not reported correctly, placing the error with the inherit itself or at the start of the bindings block instead of the offending attribute name.

View file

@ -192,6 +192,25 @@
boehmgc = final.boehmgc-nix;
busybox-sandbox-shell = final.busybox-sandbox-shell or final.default-busybox-sandbox-shell;
};
pegtl = final.callPackage (
{ stdenv, cmake, ninja }:
stdenv.mkDerivation {
pname = "pegtl";
version = "3.2.7";
src = final.fetchFromGitHub {
repo = "PEGTL";
owner = "taocpp";
rev = "refs/tags/3.2.7";
hash = "sha256-IV5YNGE4EWVrmg2Sia/rcU8jCuiBynQGJM6n3DCWTQU=";
};
nativeBuildInputs = [ cmake ninja ];
}
) {};
};
in {

View file

@ -7,7 +7,6 @@
aws-sdk-cpp,
boehmgc,
nlohmann_json,
bison,
changelog-d,
boost,
brotli,
@ -16,7 +15,6 @@
doxygen,
editline,
fileset,
flex,
git,
gtest,
jq,
@ -29,6 +27,7 @@
mdbook-linkcheck,
mercurial,
openssl,
pegtl,
pkg-config,
rapidcheck,
sqlite,
@ -127,9 +126,6 @@ in stdenv.mkDerivation (finalAttrs: {
dontBuild = false;
nativeBuildInputs = [
bison
flex
] ++ [
(lib.getBin lowdown)
mdbook
mdbook-linkcheck
@ -212,7 +208,8 @@ in stdenv.mkDerivation (finalAttrs: {
++ lib.optionals (finalAttrs.doCheck || internalApiDocs) testConfigureFlags
++ lib.optional (!canRunInstalled) "--disable-doc-gen"
++ [ (lib.enableFeature internalApiDocs "internal-api-docs") ]
++ lib.optional (!forDevShell) "--sysconfdir=/etc";
++ lib.optional (!forDevShell) "--sysconfdir=/etc"
++ [ "PEGTL_HEADERS=${lib.getDev pegtl}/include" ];
installTargets = lib.optional internalApiDocs "internal-api-html";

View file

@ -234,7 +234,7 @@ void SourceExprCommand::completeInstallable(std::string_view prefix)
evalSettings.pureEval = false;
auto state = getEvalState();
Expr *e = state->parseExprFromFile(
Expr & e = state->parseExprFromFile(
resolveExprPath(state->checkSourcePath(lookupFileArg(*state, *file)))
);
@ -444,13 +444,13 @@ Installables SourceExprCommand::parseInstallables(
auto vFile = state->allocValue();
if (file == "-") {
auto e = state->parseStdin();
auto & e = state->parseStdin();
state->eval(e, *vFile);
}
else if (file)
state->evalFile(lookupFileArg(*state, *file), *vFile);
else {
auto e = state->parseExprFromString(*expr, state->rootPath(CanonPath::fromCwd()));
auto & e = state->parseExprFromString(*expr, state->rootPath(CanonPath::fromCwd()));
state->eval(e, *vFile);
}

View file

@ -95,7 +95,7 @@ struct NixRepl
void reloadFiles();
void addAttrsToScope(Value & attrs);
void addVarToScope(const Symbol name, Value & v);
Expr * parseString(std::string s);
Expr & parseString(std::string s);
void evalString(std::string s, Value & v);
void loadDebugTraceEnv(DebugTrace & dt);
@ -297,9 +297,9 @@ StringSet NixRepl::completePrefix(const std::string & prefix)
auto expr = cur.substr(0, dot);
auto cur2 = cur.substr(dot + 1);
Expr * e = parseString(expr);
Expr & e = parseString(expr);
Value v;
e->eval(*state, *env, v);
e.eval(*state, *env, v);
state->forceAttrs(v, noPos, "while evaluating an attrset for the purpose of completion (this error should not be displayed; file an issue?)");
for (auto & i : *v.attrs) {
@ -660,7 +660,7 @@ ProcessLineResult NixRepl::processLine(std::string line)
line[p + 1] != '=' &&
isVarName(name = removeWhitespace(line.substr(0, p))))
{
Expr * e = parseString(line.substr(p + 1));
Expr & e = parseString(line.substr(p + 1));
Value & v(*state->allocValue());
v.mkThunk(env, e);
addVarToScope(state->symbols.create(name), v);
@ -776,7 +776,7 @@ void NixRepl::addVarToScope(const Symbol name, Value & v)
}
Expr * NixRepl::parseString(std::string s)
Expr & NixRepl::parseString(std::string s)
{
return state->parseExprFromString(std::move(s), state->rootPath(CanonPath::fromCwd()), staticEnv);
}
@ -784,8 +784,8 @@ Expr * NixRepl::parseString(std::string s)
void NixRepl::evalString(std::string s, Value & v)
{
Expr * e = parseString(s);
e->eval(*state, *env, v);
Expr & e = parseString(s);
e.eval(*state, *env, v);
state->forceValue(v, v.determinePos(noPos));
}

View file

@ -86,11 +86,11 @@ void EvalState::forceValue(Value & v, const PosIdx pos)
{
if (v.isThunk()) {
Env * env = v.thunk.env;
Expr * expr = v.thunk.expr;
Expr & expr = *v.thunk.expr;
try {
v.mkBlackhole();
//checkInterrupt();
expr->eval(*this, *env, v);
expr.eval(*this, *env, v);
} catch (...) {
v.mkThunk(env, expr);
tryFixupBlackHolePos(v, pos);

View file

@ -18,7 +18,6 @@
#include "gc-small-vector.hh"
#include "fetch-to-store.hh"
#include "flake/flakeref.hh"
#include "parser-tab.hh"
#include <algorithm>
#include <chrono>
@ -925,14 +924,14 @@ void EvalState::mkList(Value & v, size_t size)
unsigned long nrThunks = 0;
static inline void mkThunk(Value & v, Env & env, Expr * expr)
static inline void mkThunk(Value & v, Env & env, Expr & expr)
{
v.mkThunk(&env, expr);
nrThunks++;
}
void EvalState::mkThunk_(Value & v, Expr * expr)
void EvalState::mkThunk_(Value & v, Expr & expr)
{
mkThunk(v, baseEnv, expr);
}
@ -940,12 +939,11 @@ void EvalState::mkThunk_(Value & v, Expr * expr)
void EvalState::mkPos(Value & v, PosIdx p)
{
auto pos = positions[p];
if (auto path = std::get_if<SourcePath>(&pos.origin)) {
auto origin = positions.originOf(p);
if (auto path = std::get_if<SourcePath>(&origin)) {
auto attrs = buildBindings(3);
attrs.alloc(sFile).mkString(path->path.abs());
attrs.alloc(sLine).mkInt(pos.line);
attrs.alloc(sColumn).mkInt(pos.column);
makePositionThunks(*this, p, attrs.alloc(sLine), attrs.alloc(sColumn));
v.mkAttrs(attrs);
} else
v.mkNull();
@ -1034,7 +1032,7 @@ void EvalState::mkSingleDerivedPathString(
Value * Expr::maybeThunk(EvalState & state, Env & env)
{
Value * v = state.allocValue();
mkThunk(*v, env, this);
mkThunk(*v, env, *this);
return v;
}
@ -1098,7 +1096,7 @@ void EvalState::evalFile(const SourcePath & path_, Value & v, bool mustBeTrivial
e = j->second;
if (!e)
e = parseExprFromFile(checkSourcePath(resolvedPath));
e = &parseExprFromFile(checkSourcePath(resolvedPath));
cacheFile(path, resolvedPath, e, v, mustBeTrivial);
}
@ -1135,7 +1133,7 @@ void EvalState::cacheFile(
if (mustBeTrivial &&
!(dynamic_cast<ExprAttrs *>(e)))
error<EvalError>("file '%s' must be an attribute set", path).debugThrow();
eval(e, v);
eval(*e, v);
} catch (Error & e) {
addErrorTrace(e, "while evaluating the file '%1%':", resolvedPath.to_string());
throw;
@ -1146,23 +1144,23 @@ void EvalState::cacheFile(
}
void EvalState::eval(Expr * e, Value & v)
void EvalState::eval(Expr & e, Value & v)
{
e->eval(*this, baseEnv, v);
e.eval(*this, baseEnv, v);
}
inline bool EvalState::evalBool(Env & env, Expr * e, const PosIdx pos, std::string_view errorCtx)
inline bool EvalState::evalBool(Env & env, Expr & e, const PosIdx pos, std::string_view errorCtx)
{
try {
Value v;
e->eval(*this, env, v);
e.eval(*this, env, v);
if (v.type() != nBool)
error<TypeError>(
"expected a Boolean but found %1%: %2%",
showType(v),
ValuePrinter(*this, v, errorPrintOptions)
).atPos(pos).withFrame(env, *e).debugThrow();
).atPos(pos).withFrame(env, e).debugThrow();
return v.boolean;
} catch (Error & e) {
e.addTrace(positions[pos], errorCtx);
@ -1171,16 +1169,16 @@ inline bool EvalState::evalBool(Env & env, Expr * e, const PosIdx pos, std::stri
}
inline void EvalState::evalAttrs(Env & env, Expr * e, Value & v, const PosIdx pos, std::string_view errorCtx)
inline void EvalState::evalAttrs(Env & env, Expr & e, Value & v, const PosIdx pos, std::string_view errorCtx)
{
try {
e->eval(*this, env, v);
e.eval(*this, env, v);
if (v.type() != nAttrs)
error<TypeError>(
"expected a set but found %1%: %2%",
showType(v),
ValuePrinter(*this, v, errorPrintOptions)
).withFrame(env, *e).debugThrow();
).withFrame(env, e).debugThrow();
} catch (Error & e) {
e.addTrace(positions[pos], errorCtx);
throw;
@ -1223,7 +1221,7 @@ Env * ExprAttrs::buildInheritFromEnv(EvalState & state, Env & up)
inheritEnv.up = &up;
Displacement displ = 0;
for (auto from : *inheritFromExprs)
for (auto & from : *inheritFromExprs)
inheritEnv.values[displ++] = from->maybeThunk(state, up);
return &inheritEnv;
@ -1253,7 +1251,7 @@ void ExprAttrs::eval(EvalState & state, Env & env, Value & v)
Value * vAttr;
if (hasOverrides && i.second.kind != AttrDef::Kind::Inherited) {
vAttr = state.allocValue();
mkThunk(*vAttr, *i.second.chooseByKind(&env2, &env, inheritEnv), i.second.e);
mkThunk(*vAttr, *i.second.chooseByKind(&env2, &env, inheritEnv), *i.second.e);
} else
vAttr = i.second.e->maybeThunk(state, *i.second.chooseByKind(&env2, &env, inheritEnv));
env2.values[displ++] = vAttr;
@ -1848,13 +1846,13 @@ void ExprWith::eval(EvalState & state, Env & env, Value & v)
void ExprIf::eval(EvalState & state, Env & env, Value & v)
{
// We cheat in the parser, and pass the position of the condition as the position of the if itself.
(state.evalBool(env, cond, pos, "while evaluating a branch condition") ? then : else_)->eval(state, env, v);
(state.evalBool(env, *cond, pos, "while evaluating a branch condition") ? *then : *else_).eval(state, env, v);
}
void ExprAssert::eval(EvalState & state, Env & env, Value & v)
{
if (!state.evalBool(env, cond, pos, "in the condition of the assert statement")) {
if (!state.evalBool(env, *cond, pos, "in the condition of the assert statement")) {
std::ostringstream out;
cond->show(state.symbols, out);
state.error<AssertionError>("assertion '%1%' failed", out.str()).atPos(pos).withFrame(env, *this).debugThrow();
@ -1865,7 +1863,7 @@ void ExprAssert::eval(EvalState & state, Env & env, Value & v)
void ExprOpNot::eval(EvalState & state, Env & env, Value & v)
{
v.mkBool(!state.evalBool(env, e, getPos(), "in the argument of the not operator")); // XXX: FIXME: !
v.mkBool(!state.evalBool(env, *e, getPos(), "in the argument of the not operator")); // XXX: FIXME: !
}
@ -1887,27 +1885,27 @@ void ExprOpNEq::eval(EvalState & state, Env & env, Value & v)
void ExprOpAnd::eval(EvalState & state, Env & env, Value & v)
{
v.mkBool(state.evalBool(env, e1, pos, "in the left operand of the AND (&&) operator") && state.evalBool(env, e2, pos, "in the right operand of the AND (&&) operator"));
v.mkBool(state.evalBool(env, *e1, pos, "in the left operand of the AND (&&) operator") && state.evalBool(env, *e2, pos, "in the right operand of the AND (&&) operator"));
}
void ExprOpOr::eval(EvalState & state, Env & env, Value & v)
{
v.mkBool(state.evalBool(env, e1, pos, "in the left operand of the OR (||) operator") || state.evalBool(env, e2, pos, "in the right operand of the OR (||) operator"));
v.mkBool(state.evalBool(env, *e1, pos, "in the left operand of the OR (||) operator") || state.evalBool(env, *e2, pos, "in the right operand of the OR (||) operator"));
}
void ExprOpImpl::eval(EvalState & state, Env & env, Value & v)
{
v.mkBool(!state.evalBool(env, e1, pos, "in the left operand of the IMPL (->) operator") || state.evalBool(env, e2, pos, "in the right operand of the IMPL (->) operator"));
v.mkBool(!state.evalBool(env, *e1, pos, "in the left operand of the IMPL (->) operator") || state.evalBool(env, *e2, pos, "in the right operand of the IMPL (->) operator"));
}
void ExprOpUpdate::eval(EvalState & state, Env & env, Value & v)
{
Value v1, v2;
state.evalAttrs(env, e1, v1, pos, "in the left operand of the update (//) operator");
state.evalAttrs(env, e2, v2, pos, "in the right operand of the update (//) operator");
state.evalAttrs(env, *e1, v1, pos, "in the left operand of the update (//) operator");
state.evalAttrs(env, *e2, v2, pos, "in the right operand of the update (//) operator");
state.nrOpUpdates++;
@ -2011,10 +2009,10 @@ void ExprConcatStrings::eval(EvalState & state, Env & env, Value & v)
};
// List of returned strings. References to these Values must NOT be persisted.
SmallTemporaryValueVector<conservativeStackReservation> values(es->size());
SmallTemporaryValueVector<conservativeStackReservation> values(es.size());
Value * vTmpP = values.data();
for (auto & [i_pos, i] : *es) {
for (auto & [i_pos, i] : es) {
Value & vTmp = *vTmpP++;
i->eval(state, env, vTmp);
@ -2044,7 +2042,7 @@ void ExprConcatStrings::eval(EvalState & state, Env & env, Value & v)
} else
state.error<EvalError>("cannot add %1% to a float", showType(vTmp)).atPos(i_pos).withFrame(env, *this).debugThrow();
} else {
if (s.empty()) s.reserve(es->size());
if (s.empty()) s.reserve(es.size());
/* skip canonization of first path, which would only be not
canonized in the first place if it's coming from a ./${foo} type
path */
@ -2728,43 +2726,49 @@ SourcePath resolveExprPath(SourcePath path)
}
Expr * EvalState::parseExprFromFile(const SourcePath & path)
Expr & EvalState::parseExprFromFile(const SourcePath & path)
{
return parseExprFromFile(path, staticBaseEnv);
}
Expr * EvalState::parseExprFromFile(const SourcePath & path, std::shared_ptr<StaticEnv> & staticEnv)
Expr & EvalState::parseExprFromFile(const SourcePath & path, std::shared_ptr<StaticEnv> & staticEnv)
{
auto buffer = path.readFile();
// readFile has hopefully left some extra space for terminators
buffer.append("\0\0", 2);
return parse(buffer.data(), buffer.size(), Pos::Origin(path), path.parent(), staticEnv);
return *parse(buffer.data(), buffer.size(), Pos::Origin(path), path.parent(), staticEnv);
}
Expr * EvalState::parseExprFromString(std::string s_, const SourcePath & basePath, std::shared_ptr<StaticEnv> & staticEnv)
Expr & EvalState::parseExprFromString(std::string s_, const SourcePath & basePath, std::shared_ptr<StaticEnv> & staticEnv)
{
auto s = make_ref<std::string>(std::move(s_));
s->append("\0\0", 2);
return parse(s->data(), s->size(), Pos::String{.source = s}, basePath, staticEnv);
// NOTE this method (and parseStdin) must take care to *fully copy* their input
// into their respective Pos::Origin until the parser stops overwriting its input
// data.
auto s = make_ref<std::string>(s_);
s_.append("\0\0", 2);
return *parse(s_.data(), s_.size(), Pos::String{.source = s}, basePath, staticEnv);
}
Expr * EvalState::parseExprFromString(std::string s, const SourcePath & basePath)
Expr & EvalState::parseExprFromString(std::string s, const SourcePath & basePath)
{
return parseExprFromString(std::move(s), basePath, staticBaseEnv);
}
Expr * EvalState::parseStdin()
Expr & EvalState::parseStdin()
{
// NOTE this method (and parseExprFromString) must take care to *fully copy* their
// input into their respective Pos::Origin until the parser stops overwriting its
// input data.
//Activity act(*logger, lvlTalkative, "parsing standard input");
auto buffer = drainFD(0);
// drainFD should have left some extra space for terminators
auto s = make_ref<std::string>(buffer);
buffer.append("\0\0", 2);
auto s = make_ref<std::string>(std::move(buffer));
return parse(s->data(), s->size(), Pos::Stdin{.source = s}, rootPath(CanonPath::fromCwd()), staticBaseEnv);
return *parse(buffer.data(), buffer.size(), Pos::Stdin{.source = s}, rootPath(CanonPath::fromCwd()), staticBaseEnv);
}
@ -2853,21 +2857,6 @@ std::optional<std::string> EvalState::resolveSearchPathPath(const SearchPath::Pa
}
Expr * EvalState::parse(
char * text,
size_t length,
Pos::Origin origin,
const SourcePath & basePath,
std::shared_ptr<StaticEnv> & staticEnv)
{
auto result = parseExprFromBuf(text, length, origin, basePath, symbols, positions, exprSymbols);
result->bindVars(*this, staticEnv);
return result;
}
std::string ExternalValueBase::coerceToString(EvalState & state, const PosIdx & pos, NixStringContext & context, bool copyMore, bool copyToStore) const
{
state.error<TypeError>(

View file

@ -337,16 +337,16 @@ public:
/**
* Parse a Nix expression from the specified file.
*/
Expr * parseExprFromFile(const SourcePath & path);
Expr * parseExprFromFile(const SourcePath & path, std::shared_ptr<StaticEnv> & staticEnv);
Expr & parseExprFromFile(const SourcePath & path);
Expr & parseExprFromFile(const SourcePath & path, std::shared_ptr<StaticEnv> & staticEnv);
/**
* Parse a Nix expression from the specified string.
*/
Expr * parseExprFromString(std::string s, const SourcePath & basePath, std::shared_ptr<StaticEnv> & staticEnv);
Expr * parseExprFromString(std::string s, const SourcePath & basePath);
Expr & parseExprFromString(std::string s, const SourcePath & basePath, std::shared_ptr<StaticEnv> & staticEnv);
Expr & parseExprFromString(std::string s, const SourcePath & basePath);
Expr * parseStdin();
Expr & parseStdin();
/**
* Evaluate an expression read from the given file to normal
@ -387,15 +387,15 @@ public:
*
* @param [out] v The resulting is stored here.
*/
void eval(Expr * e, Value & v);
void eval(Expr & e, Value & v);
/**
* Evaluation the expression, then verify that it has the expected
* type.
*/
inline bool evalBool(Env & env, Expr * e);
inline bool evalBool(Env & env, Expr * e, const PosIdx pos, std::string_view errorCtx);
inline void evalAttrs(Env & env, Expr * e, Value & v, const PosIdx pos, std::string_view errorCtx);
inline bool evalBool(Env & env, Expr & e);
inline bool evalBool(Env & env, Expr & e, const PosIdx pos, std::string_view errorCtx);
inline void evalAttrs(Env & env, Expr & e, Value & v, const PosIdx pos, std::string_view errorCtx);
/**
* If `v` is a thunk, enter it and overwrite `v` with the result
@ -616,7 +616,7 @@ public:
}
void mkList(Value & v, size_t length);
void mkThunk_(Value & v, Expr * expr);
void mkThunk_(Value & v, Expr & expr);
void mkPos(Value & v, PosIdx pos);
/**

View file

@ -227,11 +227,10 @@ static Flake getFlake(
.sourceInfo = std::make_shared<fetchers::Tree>(std::move(sourceInfo))
};
// NOTE evalFile forces vInfo to be an attrset because mustBeTrivial is true.
Value vInfo;
state.evalFile(CanonPath(flakeFile), vInfo, true); // FIXME: symlink attack
expectType(state, nAttrs, vInfo, state.positions.add({CanonPath(flakeFile)}, 1, 1));
if (auto description = vInfo.attrs->get(state.sDescription)) {
expectType(state, nString, *description->value, description->pos);
flake.description = description->value->string.s;

View file

@ -1,313 +0,0 @@
%option reentrant bison-bridge bison-locations
%option align
%option noyywrap
%option never-interactive
%option stack
%option nodefault
%option nounput noyy_top_state
%s DEFAULT
%x STRING
%x IND_STRING
%x INPATH
%x INPATH_SLASH
%x PATH_START
%{
#ifdef __clang__
#pragma clang diagnostic ignored "-Wunneeded-internal-declaration"
#endif
// yacc generates code that uses unannotated fallthrough.
#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
#ifdef __clang__
#pragma clang diagnostic ignored "-Wimplicit-fallthrough"
#endif
#include <boost/lexical_cast.hpp>
#include "nixexpr.hh"
#include "parser-tab.hh"
using namespace nix;
namespace nix {
#define CUR_POS state->at(*yylloc)
static void initLoc(YYLTYPE * loc)
{
loc->first_line = loc->last_line = 1;
loc->first_column = loc->last_column = 1;
}
static void adjustLoc(YYLTYPE * loc, const char * s, size_t len)
{
loc->stash();
loc->first_line = loc->last_line;
loc->first_column = loc->last_column;
for (size_t i = 0; i < len; i++) {
switch (*s++) {
case '\r':
if (*s == '\n') { /* cr/lf */
i++;
s++;
}
/* fall through */
case '\n':
++loc->last_line;
loc->last_column = 1;
break;
default:
++loc->last_column;
}
}
}
// we make use of the fact that the parser receives a private copy of the input
// string and can munge around in it.
static StringToken unescapeStr(SymbolTable & symbols, char * s, size_t length)
{
char * result = s;
char * t = s;
char c;
// the input string is terminated with *two* NULs, so we can safely take
// *one* character after the one being checked against.
while ((c = *s++)) {
if (c == '\\') {
c = *s++;
if (c == 'n') *t = '\n';
else if (c == 'r') *t = '\r';
else if (c == 't') *t = '\t';
else *t = c;
}
else if (c == '\r') {
/* Normalise CR and CR/LF into LF. */
*t = '\n';
if (*s == '\n') s++; /* cr/lf */
}
else *t = c;
t++;
}
return {result, size_t(t - result)};
}
}
#define YY_USER_INIT initLoc(yylloc)
#define YY_USER_ACTION adjustLoc(yylloc, yytext, yyleng);
#define PUSH_STATE(state) yy_push_state(state, yyscanner)
#define POP_STATE() yy_pop_state(yyscanner)
%}
ANY .|\n
ID [a-zA-Z\_][a-zA-Z0-9\_\'\-]*
INT [0-9]+
FLOAT (([1-9][0-9]*\.[0-9]*)|(0?\.[0-9]+))([Ee][+-]?[0-9]+)?
PATH_CHAR [a-zA-Z0-9\.\_\-\+]
PATH {PATH_CHAR}*(\/{PATH_CHAR}+)+\/?
PATH_SEG {PATH_CHAR}*\/
HPATH \~(\/{PATH_CHAR}+)+\/?
HPATH_START \~\/
SPATH \<{PATH_CHAR}+(\/{PATH_CHAR}+)*\>
URI [a-zA-Z][a-zA-Z0-9\+\-\.]*\:[a-zA-Z0-9\%\/\?\:\@\&\=\+\$\,\-\_\.\!\~\*\']+
%%
if { return IF; }
then { return THEN; }
else { return ELSE; }
assert { return ASSERT; }
with { return WITH; }
let { return LET; }
in { return IN; }
rec { return REC; }
inherit { return INHERIT; }
or { return OR_KW; }
\.\.\. { return ELLIPSIS; }
\=\= { return EQ; }
\!\= { return NEQ; }
\<\= { return LEQ; }
\>\= { return GEQ; }
\&\& { return AND; }
\|\| { return OR; }
\-\> { return IMPL; }
\/\/ { return UPDATE; }
\+\+ { return CONCAT; }
{ID} { yylval->id = {yytext, (size_t) yyleng}; return ID; }
{INT} { errno = 0;
try {
yylval->n = boost::lexical_cast<int64_t>(yytext);
} catch (const boost::bad_lexical_cast &) {
throw ParseError(ErrorInfo{
.msg = HintFmt("invalid integer '%1%'", yytext),
.pos = state->positions[CUR_POS],
});
}
return INT;
}
{FLOAT} { errno = 0;
yylval->nf = strtod(yytext, 0);
if (errno != 0)
throw ParseError(ErrorInfo{
.msg = HintFmt("invalid float '%1%'", yytext),
.pos = state->positions[CUR_POS],
});
return FLOAT;
}
\$\{ { PUSH_STATE(DEFAULT); return DOLLAR_CURLY; }
\} { /* State INITIAL only exists at the bottom of the stack and is
used as a marker. DEFAULT replaces it everywhere else.
Popping when in INITIAL state causes an empty stack exception,
so don't */
if (YYSTATE != INITIAL)
POP_STATE();
return '}';
}
\{ { PUSH_STATE(DEFAULT); return '{'; }
\" { PUSH_STATE(STRING); return '"'; }
<STRING>([^\$\"\\]|\$[^\{\"\\]|\\{ANY}|\$\\{ANY})*\$/\" |
<STRING>([^\$\"\\]|\$[^\{\"\\]|\\{ANY}|\$\\{ANY})+ {
/* It is impossible to match strings ending with '$' with one
regex because trailing contexts are only valid at the end
of a rule. (A sane but undocumented limitation.) */
yylval->str = unescapeStr(state->symbols, yytext, yyleng);
return STR;
}
<STRING>\$\{ { PUSH_STATE(DEFAULT); return DOLLAR_CURLY; }
<STRING>\" { POP_STATE(); return '"'; }
<STRING>\$|\\|\$\\ {
/* This can only occur when we reach EOF, otherwise the above
(...|\$[^\{\"\\]|\\.|\$\\.)+ would have triggered.
This is technically invalid, but we leave the problem to the
parser who fails with exact location. */
return EOF;
}
\'\'(\ *\n)? { PUSH_STATE(IND_STRING); return IND_STRING_OPEN; }
<IND_STRING>([^\$\']|\$[^\{\']|\'[^\'\$])+ {
yylval->str = {yytext, (size_t) yyleng, true};
return IND_STR;
}
<IND_STRING>\'\'\$ |
<IND_STRING>\$ {
yylval->str = {"$", 1};
return IND_STR;
}
<IND_STRING>\'\'\' {
yylval->str = {"''", 2};
return IND_STR;
}
<IND_STRING>\'\'\\{ANY} {
yylval->str = unescapeStr(state->symbols, yytext + 2, yyleng - 2);
return IND_STR;
}
<IND_STRING>\$\{ { PUSH_STATE(DEFAULT); return DOLLAR_CURLY; }
<IND_STRING>\'\' { POP_STATE(); return IND_STRING_CLOSE; }
<IND_STRING>\' {
yylval->str = {"'", 1};
return IND_STR;
}
{PATH_SEG}\$\{ |
{HPATH_START}\$\{ {
PUSH_STATE(PATH_START);
yyless(0);
yylloc->unstash();
}
<PATH_START>{PATH_SEG} {
POP_STATE();
PUSH_STATE(INPATH_SLASH);
yylval->path = {yytext, (size_t) yyleng};
return PATH;
}
<PATH_START>{HPATH_START} {
POP_STATE();
PUSH_STATE(INPATH_SLASH);
yylval->path = {yytext, (size_t) yyleng};
return HPATH;
}
{PATH} {
if (yytext[yyleng-1] == '/')
PUSH_STATE(INPATH_SLASH);
else
PUSH_STATE(INPATH);
yylval->path = {yytext, (size_t) yyleng};
return PATH;
}
{HPATH} {
if (yytext[yyleng-1] == '/')
PUSH_STATE(INPATH_SLASH);
else
PUSH_STATE(INPATH);
yylval->path = {yytext, (size_t) yyleng};
return HPATH;
}
<INPATH,INPATH_SLASH>\$\{ {
POP_STATE();
PUSH_STATE(INPATH);
PUSH_STATE(DEFAULT);
return DOLLAR_CURLY;
}
<INPATH,INPATH_SLASH>{PATH}|{PATH_SEG}|{PATH_CHAR}+ {
POP_STATE();
if (yytext[yyleng-1] == '/')
PUSH_STATE(INPATH_SLASH);
else
PUSH_STATE(INPATH);
yylval->str = {yytext, (size_t) yyleng};
return STR;
}
<INPATH>{ANY} |
<INPATH><<EOF>> {
/* if we encounter a non-path character we inform the parser that the path has
ended with a PATH_END token and re-parse this character in the default
context (it may be ')', ';', or something of that sort) */
POP_STATE();
yyless(0);
yylloc->unstash();
return PATH_END;
}
<INPATH_SLASH>{ANY} |
<INPATH_SLASH><<EOF>> {
throw ParseError(ErrorInfo{
.msg = HintFmt("path has a trailing slash"),
.pos = state->positions[CUR_POS],
});
}
{SPATH} { yylval->path = {yytext, (size_t) yyleng}; return SPATH; }
{URI} { yylval->uri = {yytext, (size_t) yyleng}; return URI; }
[ \t\r\n]+ /* eat up whitespace */
\#[^\r\n]* /* single-line comments */
\/\*([^*]|\*+[^*/])*\*+\/ /* long comments */
{ANY} {
/* Don't return a negative number, as this will cause
Bison to stop parsing without an error. */
return (unsigned char) yytext[0];
}
%%


@ -7,12 +7,12 @@ libexpr_DIR := $(d)
libexpr_SOURCES := \
$(wildcard $(d)/*.cc) \
$(wildcard $(d)/value/*.cc) \
$(wildcard $(d)/parser/*.cc) \
$(wildcard $(d)/primops/*.cc) \
$(wildcard $(d)/flake/*.cc) \
$(d)/lexer-tab.cc \
$(d)/parser-tab.cc
$(wildcard $(d)/flake/*.cc)
libexpr_CXXFLAGS += -I src/libutil -I src/libstore -I src/libfetchers -I src/libmain -I src/libexpr
libexpr_CXXFLAGS += -I src/libutil -I src/libstore -I src/libfetchers -I src/libmain -I src/libexpr \
-I pegtl/include
libexpr_LIBS = libutil libstore libfetchers
@ -26,16 +26,6 @@ endif
# because inline functions in libexpr's header files call libgc.
libexpr_LDFLAGS_PROPAGATED = $(BDW_GC_LIBS)
libexpr_ORDER_AFTER := $(d)/parser-tab.cc $(d)/parser-tab.hh $(d)/lexer-tab.cc $(d)/lexer-tab.hh
$(d)/parser-tab.cc $(d)/parser-tab.hh: $(d)/parser.y
$(trace-gen) bison -v -o $(libexpr_DIR)/parser-tab.cc $< -d
$(d)/lexer-tab.cc $(d)/lexer-tab.hh: $(d)/lexer.l
$(trace-gen) flex --outfile $(libexpr_DIR)/lexer-tab.cc --header-file=$(libexpr_DIR)/lexer-tab.hh $<
clean-files += $(d)/parser-tab.cc $(d)/parser-tab.hh $(d)/lexer-tab.cc $(d)/lexer-tab.hh
$(eval $(call install-file-in, $(buildprefix)$(d)/nix-expr.pc, $(libdir)/pkgconfig, 0644))
$(foreach i, $(wildcard src/libexpr/value/*.hh), \


@ -78,7 +78,7 @@ void ExprAttrs::showBindings(const SymbolTable & symbols, std::ostream & str) co
return sa < sb;
});
std::vector<Symbol> inherits;
std::map<ExprInheritFrom *, std::vector<Symbol>> inheritsFrom;
std::map<Displacement, std::vector<Symbol>> inheritsFrom;
for (auto & i : sorted) {
switch (i->second.kind) {
case AttrDef::Kind::Plain:
@ -89,7 +89,7 @@ void ExprAttrs::showBindings(const SymbolTable & symbols, std::ostream & str) co
case AttrDef::Kind::InheritedFrom: {
auto & select = dynamic_cast<ExprSelect &>(*i->second.e);
auto & from = dynamic_cast<ExprInheritFrom &>(*select.e);
inheritsFrom[&from].push_back(i->first);
inheritsFrom[from.displ].push_back(i->first);
break;
}
}
@ -101,7 +101,7 @@ void ExprAttrs::showBindings(const SymbolTable & symbols, std::ostream & str) co
}
for (const auto & [from, syms] : inheritsFrom) {
str << "inherit (";
(*inheritFromExprs)[from->displ]->show(symbols, str);
(*inheritFromExprs)[from]->show(symbols, str);
str << ")";
for (auto sym : syms) str << " " << symbols[sym];
str << "; ";
@ -147,7 +147,10 @@ void ExprLambda::show(const SymbolTable & symbols, std::ostream & str) const
if (hasFormals()) {
str << "{ ";
bool first = true;
for (auto & i : formals->formals) {
// the natural Symbol ordering is by creation time, which can lead to the
// same expression being printed in two different ways depending on its
// context. always use lexicographic ordering to avoid this.
for (const Formal & i : formals->lexicographicOrder(symbols)) {
if (first) first = false; else str << ", ";
str << symbols[i.name];
if (i.def) {
@ -172,7 +175,7 @@ void ExprCall::show(const SymbolTable & symbols, std::ostream & str) const
{
str << '(';
fun->show(symbols, str);
for (auto e : args) {
for (auto & e : args) {
str << ' ';
e->show(symbols, str);
}
@ -227,7 +230,7 @@ void ExprConcatStrings::show(const SymbolTable & symbols, std::ostream & str) co
{
bool first = true;
str << "(";
for (auto & i : *es) {
for (auto & i : es) {
if (first) first = false; else str << " + ";
i.second->show(symbols, str);
}
@ -371,7 +374,7 @@ std::shared_ptr<const StaticEnv> ExprAttrs::bindInheritSources(
// not even *have* an expr that grabs anything from this env since it's fully
// invisible, but the evaluator does not allow for this yet.
auto inner = std::make_shared<StaticEnv>(nullptr, env.get(), 0);
for (auto from : *inheritFromExprs)
for (auto & from : *inheritFromExprs)
from->bindVars(es, env);
return inner;
@ -458,7 +461,7 @@ void ExprCall::bindVars(EvalState & es, const std::shared_ptr<const StaticEnv> &
es.exprEnvs.insert(std::make_pair(this, env));
fun->bindVars(es, env);
for (auto e : args)
for (auto & e : args)
e->bindVars(es, env);
}
@ -543,7 +546,7 @@ void ExprConcatStrings::bindVars(EvalState & es, const std::shared_ptr<const Sta
if (es.debugRepl)
es.exprEnvs.insert(std::make_pair(this, env));
for (auto & i : *this->es)
for (auto & i : this->es)
i.second->bindVars(es, env);
}
@ -578,6 +581,39 @@ std::string ExprLambda::showNamePos(const EvalState & state) const
/* Position table. */
Pos PosTable::operator[](PosIdx p) const
{
auto origin = resolve(p);
if (!origin)
return {};
const auto offset = origin->offsetOf(p);
Pos result{0, 0, origin->origin};
auto lines = this->lines.lock();
auto linesForInput = (*lines)[origin->offset];
if (linesForInput.empty()) {
auto source = result.getSource().value_or("");
const char * begin = source.data();
for (Pos::LinesIterator it(source), end; it != end; it++)
linesForInput.push_back(it->data() - begin);
if (linesForInput.empty())
linesForInput.push_back(0);
}
// as above: the first line starts at byte 0 and is always present
auto lineStartOffset = std::prev(
std::upper_bound(linesForInput.begin(), linesForInput.end(), offset));
result.line = 1 + (lineStartOffset - linesForInput.begin());
result.column = 1 + (offset - *lineStartOffset);
return result;
}
/* Symbol table. */
size_t SymbolTable::totalSize() const
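
The new `PosTable::operator[]` above resolves a byte offset to a line and column by binary-searching a cached list of line-start offsets. A minimal standalone sketch of that lookup follows. The helper names `lineStarts` and `lineAndColumn` are hypothetical and not part of the patch, and for brevity this sketch treats only `\n` as a line ending, whereas the patch relies on `Pos::LinesIterator`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: byte offsets at which each line of `source` starts.
// Assumption: only '\n' ends a line in this sketch.
std::vector<std::size_t> lineStarts(std::string_view source)
{
    std::vector<std::size_t> starts{0}; // the first line always starts at byte 0
    for (std::size_t i = 0; i < source.size(); i++)
        if (source[i] == '\n')
            starts.push_back(i + 1);
    return starts;
}

// Hypothetical helper: 1-based (line, column) for a byte offset.
std::pair<std::size_t, std::size_t> lineAndColumn(
    const std::vector<std::size_t> & starts, std::size_t offset)
{
    assert(!starts.empty() && starts.front() == 0);
    // upper_bound yields the first line start beyond `offset`;
    // the entry just before it is the line that contains `offset`.
    auto lineStart = std::prev(std::upper_bound(starts.begin(), starts.end(), offset));
    return {
        1 + static_cast<std::size_t>(lineStart - starts.begin()),
        1 + (offset - *lineStart),
    };
}
```

Because the first entry is always 0, `std::prev` is safe for every offset, which is the same invariant the comment in the patch relies on.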


@ -7,7 +7,6 @@
#include "value.hh"
#include "symbol-table.hh"
#include "error.hh"
#include "chunked-vector.hh"
#include "position.hh"
#include "eval-error.hh"
#include "pos-idx.hh"
@ -29,9 +28,9 @@ struct StaticEnv;
struct AttrName
{
Symbol symbol;
Expr * expr;
std::unique_ptr<Expr> expr;
AttrName(Symbol s) : symbol(s) {};
AttrName(Expr * e) : expr(e) {};
AttrName(std::unique_ptr<Expr> e) : expr(std::move(e)) {};
};
typedef std::vector<AttrName> AttrPath;
@ -43,12 +42,21 @@ std::string showAttrPath(const SymbolTable & symbols, const AttrPath & attrPath)
struct Expr
{
protected:
Expr(Expr &&) = default;
Expr & operator=(Expr &&) = default;
public:
struct AstSymbols {
Symbol sub, lessThan, mul, div, or_, findFile, nixPath, body;
};
Expr() = default;
Expr(const Expr &) = delete;
Expr & operator=(const Expr &) = delete;
virtual ~Expr() { };
virtual void show(const SymbolTable & symbols, std::ostream & str) const;
virtual void bindVars(EvalState & es, const std::shared_ptr<const StaticEnv> & env);
virtual void eval(EvalState & state, Env & env, Value & v);
@ -149,19 +157,19 @@ struct ExprInheritFrom : ExprVar
struct ExprSelect : Expr
{
PosIdx pos;
Expr * e, * def;
std::unique_ptr<Expr> e, def;
AttrPath attrPath;
ExprSelect(const PosIdx & pos, Expr * e, AttrPath attrPath, Expr * def) : pos(pos), e(e), def(def), attrPath(std::move(attrPath)) { };
ExprSelect(const PosIdx & pos, Expr * e, Symbol name) : pos(pos), e(e), def(0) { attrPath.push_back(AttrName(name)); };
ExprSelect(const PosIdx & pos, std::unique_ptr<Expr> e, AttrPath attrPath, std::unique_ptr<Expr> def) : pos(pos), e(std::move(e)), def(std::move(def)), attrPath(std::move(attrPath)) { };
ExprSelect(const PosIdx & pos, std::unique_ptr<Expr> e, Symbol name) : pos(pos), e(std::move(e)) { attrPath.push_back(AttrName(name)); };
PosIdx getPos() const override { return pos; }
COMMON_METHODS
};
struct ExprOpHasAttr : Expr
{
Expr * e;
std::unique_ptr<Expr> e;
AttrPath attrPath;
ExprOpHasAttr(Expr * e, AttrPath attrPath) : e(e), attrPath(std::move(attrPath)) { };
ExprOpHasAttr(std::unique_ptr<Expr> e, AttrPath attrPath) : e(std::move(e)), attrPath(std::move(attrPath)) { };
PosIdx getPos() const override { return e->getPos(); }
COMMON_METHODS
};
@ -181,11 +189,11 @@ struct ExprAttrs : Expr
};
Kind kind;
Expr * e;
std::unique_ptr<Expr> e;
PosIdx pos;
Displacement displ; // displacement
AttrDef(Expr * e, const PosIdx & pos, Kind kind = Kind::Plain)
: kind(kind), e(e), pos(pos) { };
AttrDef(std::unique_ptr<Expr> e, const PosIdx & pos, Kind kind = Kind::Plain)
: kind(kind), e(std::move(e)), pos(pos) { };
AttrDef() { };
template<typename T>
@ -204,12 +212,12 @@ struct ExprAttrs : Expr
};
typedef std::map<Symbol, AttrDef> AttrDefs;
AttrDefs attrs;
std::unique_ptr<std::vector<Expr *>> inheritFromExprs;
std::unique_ptr<std::vector<std::unique_ptr<Expr>>> inheritFromExprs;
struct DynamicAttrDef {
Expr * nameExpr, * valueExpr;
std::unique_ptr<Expr> nameExpr, valueExpr;
PosIdx pos;
DynamicAttrDef(Expr * nameExpr, Expr * valueExpr, const PosIdx & pos)
: nameExpr(nameExpr), valueExpr(valueExpr), pos(pos) { };
DynamicAttrDef(std::unique_ptr<Expr> nameExpr, std::unique_ptr<Expr> valueExpr, const PosIdx & pos)
: nameExpr(std::move(nameExpr)), valueExpr(std::move(valueExpr)), pos(pos) { };
};
typedef std::vector<DynamicAttrDef> DynamicAttrDefs;
DynamicAttrDefs dynamicAttrs;
@ -226,7 +234,7 @@ struct ExprAttrs : Expr
struct ExprList : Expr
{
std::vector<Expr *> elems;
std::vector<std::unique_ptr<Expr>> elems;
ExprList() { };
COMMON_METHODS
Value * maybeThunk(EvalState & state, Env & env) override;
@ -241,7 +249,7 @@ struct Formal
{
PosIdx pos;
Symbol name;
Expr * def;
std::unique_ptr<Expr> def;
};
struct Formals
@ -257,9 +265,9 @@ struct Formals
return it != formals.end() && it->name == arg;
}
std::vector<Formal> lexicographicOrder(const SymbolTable & symbols) const
std::vector<std::reference_wrapper<const Formal>> lexicographicOrder(const SymbolTable & symbols) const
{
std::vector<Formal> result(formals.begin(), formals.end());
std::vector<std::reference_wrapper<const Formal>> result(formals.begin(), formals.end());
std::sort(result.begin(), result.end(),
[&] (const Formal & a, const Formal & b) {
std::string_view sa = symbols[a.name], sb = symbols[b.name];
@ -274,14 +282,14 @@ struct ExprLambda : Expr
PosIdx pos;
Symbol name;
Symbol arg;
Formals * formals;
Expr * body;
ExprLambda(PosIdx pos, Symbol arg, Formals * formals, Expr * body)
: pos(pos), arg(arg), formals(formals), body(body)
std::unique_ptr<Formals> formals;
std::unique_ptr<Expr> body;
ExprLambda(PosIdx pos, Symbol arg, std::unique_ptr<Formals> formals, std::unique_ptr<Expr> body)
: pos(pos), arg(arg), formals(std::move(formals)), body(std::move(body))
{
};
ExprLambda(PosIdx pos, Formals * formals, Expr * body)
: pos(pos), formals(formals), body(body)
ExprLambda(PosIdx pos, std::unique_ptr<Formals> formals, std::unique_ptr<Expr> body)
: pos(pos), formals(std::move(formals)), body(std::move(body))
{
}
void setName(Symbol name) override;
@ -293,11 +301,11 @@ struct ExprLambda : Expr
struct ExprCall : Expr
{
Expr * fun;
std::vector<Expr *> args;
std::unique_ptr<Expr> fun;
std::vector<std::unique_ptr<Expr>> args;
PosIdx pos;
ExprCall(const PosIdx & pos, Expr * fun, std::vector<Expr *> && args)
: fun(fun), args(args), pos(pos)
ExprCall(const PosIdx & pos, std::unique_ptr<Expr> fun, std::vector<std::unique_ptr<Expr>> && args)
: fun(std::move(fun)), args(std::move(args)), pos(pos)
{ }
PosIdx getPos() const override { return pos; }
COMMON_METHODS
@ -305,19 +313,19 @@ struct ExprCall : Expr
struct ExprLet : Expr
{
ExprAttrs * attrs;
Expr * body;
ExprLet(ExprAttrs * attrs, Expr * body) : attrs(attrs), body(body) { };
std::unique_ptr<ExprAttrs> attrs;
std::unique_ptr<Expr> body;
ExprLet(std::unique_ptr<ExprAttrs> attrs, std::unique_ptr<Expr> body) : attrs(std::move(attrs)), body(std::move(body)) { };
COMMON_METHODS
};
struct ExprWith : Expr
{
PosIdx pos;
Expr * attrs, * body;
std::unique_ptr<Expr> attrs, body;
size_t prevWith;
ExprWith * parentWith;
ExprWith(const PosIdx & pos, Expr * attrs, Expr * body) : pos(pos), attrs(attrs), body(body) { };
ExprWith(const PosIdx & pos, std::unique_ptr<Expr> attrs, std::unique_ptr<Expr> body) : pos(pos), attrs(std::move(attrs)), body(std::move(body)) { };
PosIdx getPos() const override { return pos; }
COMMON_METHODS
};
@ -325,8 +333,8 @@ struct ExprWith : Expr
struct ExprIf : Expr
{
PosIdx pos;
Expr * cond, * then, * else_;
ExprIf(const PosIdx & pos, Expr * cond, Expr * then, Expr * else_) : pos(pos), cond(cond), then(then), else_(else_) { };
std::unique_ptr<Expr> cond, then, else_;
ExprIf(const PosIdx & pos, std::unique_ptr<Expr> cond, std::unique_ptr<Expr> then, std::unique_ptr<Expr> else_) : pos(pos), cond(std::move(cond)), then(std::move(then)), else_(std::move(else_)) { };
PosIdx getPos() const override { return pos; }
COMMON_METHODS
};
@ -334,16 +342,16 @@ struct ExprIf : Expr
struct ExprAssert : Expr
{
PosIdx pos;
Expr * cond, * body;
ExprAssert(const PosIdx & pos, Expr * cond, Expr * body) : pos(pos), cond(cond), body(body) { };
std::unique_ptr<Expr> cond, body;
ExprAssert(const PosIdx & pos, std::unique_ptr<Expr> cond, std::unique_ptr<Expr> body) : pos(pos), cond(std::move(cond)), body(std::move(body)) { };
PosIdx getPos() const override { return pos; }
COMMON_METHODS
};
struct ExprOpNot : Expr
{
Expr * e;
ExprOpNot(Expr * e) : e(e) { };
std::unique_ptr<Expr> e;
ExprOpNot(std::unique_ptr<Expr> e) : e(std::move(e)) { };
PosIdx getPos() const override { return e->getPos(); }
COMMON_METHODS
};
@ -352,9 +360,9 @@ struct ExprOpNot : Expr
struct name : Expr \
{ \
PosIdx pos; \
Expr * e1, * e2; \
name(Expr * e1, Expr * e2) : e1(e1), e2(e2) { }; \
name(const PosIdx & pos, Expr * e1, Expr * e2) : pos(pos), e1(e1), e2(e2) { }; \
std::unique_ptr<Expr> e1, e2; \
name(std::unique_ptr<Expr> e1, std::unique_ptr<Expr> e2) : e1(std::move(e1)), e2(std::move(e2)) { }; \
name(const PosIdx & pos, std::unique_ptr<Expr> e1, std::unique_ptr<Expr> e2) : pos(pos), e1(std::move(e1)), e2(std::move(e2)) { }; \
void show(const SymbolTable & symbols, std::ostream & str) const override \
{ \
str << "("; e1->show(symbols, str); str << " " s " "; e2->show(symbols, str); str << ")"; \
@ -379,9 +387,9 @@ struct ExprConcatStrings : Expr
{
PosIdx pos;
bool forceString;
std::vector<std::pair<PosIdx, Expr *>> * es;
ExprConcatStrings(const PosIdx & pos, bool forceString, std::vector<std::pair<PosIdx, Expr *>> * es)
: pos(pos), forceString(forceString), es(es) { };
std::vector<std::pair<PosIdx, std::unique_ptr<Expr>>> es;
ExprConcatStrings(const PosIdx & pos, bool forceString, std::vector<std::pair<PosIdx, std::unique_ptr<Expr>>> es)
: pos(pos), forceString(forceString), es(std::move(es)) { };
PosIdx getPos() const override { return pos; }
COMMON_METHODS
};
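
Once `Formal` owns its default expression through a `std::unique_ptr`, it can no longer be copied, which is presumably why `lexicographicOrder` in this file now returns `std::reference_wrapper<const Formal>` instead of copies. The sketch below illustrates the same pattern in isolation. `Item` and `sortedByName` are hypothetical stand-ins, not names from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for Formal: the unique_ptr member makes it move-only.
struct Item
{
    std::string name;
    std::unique_ptr<int> def;
};

// Returns a sorted view over `items` without copying or moving any Item.
// The view is only valid while `items` is alive and not reallocated.
std::vector<std::reference_wrapper<const Item>> sortedByName(const std::vector<Item> & items)
{
    std::vector<std::reference_wrapper<const Item>> result(items.begin(), items.end());
    // reference_wrapper converts implicitly to const Item &, so the
    // comparator can be written against the underlying type.
    std::sort(result.begin(), result.end(),
        [](const Item & a, const Item & b) { return a.name < b.name; });
    return result;
}
```

A caller iterating the result with `const Item & i` gets the same implicit conversion, which matches how the updated `ExprLambda::show` loop is written.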


@ -1,447 +0,0 @@
%glr-parser
%define api.pure
%locations
%define parse.error verbose
%defines
/* %no-lines */
%parse-param { void * scanner }
%parse-param { nix::ParserState * state }
%lex-param { void * scanner }
%lex-param { nix::ParserState * state }
%expect 1
%expect-rr 1
%code requires {
#ifndef BISON_HEADER
#define BISON_HEADER
#include <variant>
#include "finally.hh"
#include "util.hh"
#include "nixexpr.hh"
#include "eval.hh"
#include "eval-settings.hh"
#include "globals.hh"
#include "parser-state.hh"
#define YYLTYPE ::nix::ParserLocation
#define YY_DECL int yylex \
(YYSTYPE * yylval_param, YYLTYPE * yylloc_param, yyscan_t yyscanner, nix::ParserState * state)
namespace nix {
Expr * parseExprFromBuf(
char * text,
size_t length,
Pos::Origin origin,
const SourcePath & basePath,
SymbolTable & symbols,
PosTable & positions,
const Expr::AstSymbols & astSymbols);
}
#endif
}
%{
#include "parser-tab.hh"
#include "lexer-tab.hh"
YY_DECL;
using namespace nix;
#define CUR_POS state->at(*yylocp)
void yyerror(YYLTYPE * loc, yyscan_t scanner, ParserState * state, const char * error)
{
throw ParseError({
.msg = HintFmt(error),
.pos = state->positions[state->at(*loc)]
});
}
%}
%union {
// !!! We're probably leaking stuff here.
nix::Expr * e;
nix::ExprList * list;
nix::ExprAttrs * attrs;
nix::Formals * formals;
nix::Formal * formal;
nix::NixInt n;
nix::NixFloat nf;
nix::StringToken id; // !!! -> Symbol
nix::StringToken path;
nix::StringToken uri;
nix::StringToken str;
std::vector<nix::AttrName> * attrNames;
std::vector<std::pair<nix::PosIdx, nix::Expr *>> * string_parts;
std::vector<std::pair<nix::PosIdx, std::variant<nix::Expr *, nix::StringToken>>> * ind_string_parts;
}
%type <e> start expr expr_function expr_if expr_op
%type <e> expr_select expr_simple expr_app
%type <list> expr_list
%type <attrs> binds
%type <formals> formals
%type <formal> formal
%type <attrNames> attrs attrpath
%type <string_parts> string_parts_interpolated
%type <ind_string_parts> ind_string_parts
%type <e> path_start string_parts string_attr
%type <id> attr
%token <id> ID
%token <str> STR IND_STR
%token <n> INT
%token <nf> FLOAT
%token <path> PATH HPATH SPATH PATH_END
%token <uri> URI
%token IF THEN ELSE ASSERT WITH LET IN REC INHERIT EQ NEQ AND OR IMPL OR_KW
%token DOLLAR_CURLY /* == ${ */
%token IND_STRING_OPEN IND_STRING_CLOSE
%token ELLIPSIS
%right IMPL
%left OR
%left AND
%nonassoc EQ NEQ
%nonassoc '<' '>' LEQ GEQ
%right UPDATE
%left NOT
%left '+' '-'
%left '*' '/'
%right CONCAT
%nonassoc '?'
%nonassoc NEGATE
%%
start: expr { state->result = $1; };
expr: expr_function;
expr_function
: ID ':' expr_function
{ $$ = new ExprLambda(CUR_POS, state->symbols.create($1), 0, $3); }
| '{' formals '}' ':' expr_function
{ $$ = new ExprLambda(CUR_POS, state->validateFormals($2), $5); }
| '{' formals '}' '@' ID ':' expr_function
{
auto arg = state->symbols.create($5);
$$ = new ExprLambda(CUR_POS, arg, state->validateFormals($2, CUR_POS, arg), $7);
}
| ID '@' '{' formals '}' ':' expr_function
{
auto arg = state->symbols.create($1);
$$ = new ExprLambda(CUR_POS, arg, state->validateFormals($4, CUR_POS, arg), $7);
}
| ASSERT expr ';' expr_function
{ $$ = new ExprAssert(CUR_POS, $2, $4); }
| WITH expr ';' expr_function
{ $$ = new ExprWith(CUR_POS, $2, $4); }
| LET binds IN expr_function
{ if (!$2->dynamicAttrs.empty())
throw ParseError({
.msg = HintFmt("dynamic attributes not allowed in let"),
.pos = state->positions[CUR_POS]
});
$$ = new ExprLet($2, $4);
}
| expr_if
;
expr_if
: IF expr THEN expr ELSE expr { $$ = new ExprIf(CUR_POS, $2, $4, $6); }
| expr_op
;
expr_op
: '!' expr_op %prec NOT { $$ = new ExprOpNot($2); }
| '-' expr_op %prec NEGATE { $$ = new ExprCall(CUR_POS, new ExprVar(state->s.sub), {new ExprInt(0), $2}); }
| expr_op EQ expr_op { $$ = new ExprOpEq($1, $3); }
| expr_op NEQ expr_op { $$ = new ExprOpNEq($1, $3); }
| expr_op '<' expr_op { $$ = new ExprCall(state->at(@2), new ExprVar(state->s.lessThan), {$1, $3}); }
| expr_op LEQ expr_op { $$ = new ExprOpNot(new ExprCall(state->at(@2), new ExprVar(state->s.lessThan), {$3, $1})); }
| expr_op '>' expr_op { $$ = new ExprCall(state->at(@2), new ExprVar(state->s.lessThan), {$3, $1}); }
| expr_op GEQ expr_op { $$ = new ExprOpNot(new ExprCall(state->at(@2), new ExprVar(state->s.lessThan), {$1, $3})); }
| expr_op AND expr_op { $$ = new ExprOpAnd(state->at(@2), $1, $3); }
| expr_op OR expr_op { $$ = new ExprOpOr(state->at(@2), $1, $3); }
| expr_op IMPL expr_op { $$ = new ExprOpImpl(state->at(@2), $1, $3); }
| expr_op UPDATE expr_op { $$ = new ExprOpUpdate(state->at(@2), $1, $3); }
| expr_op '?' attrpath { $$ = new ExprOpHasAttr($1, std::move(*$3)); delete $3; }
| expr_op '+' expr_op
{ $$ = new ExprConcatStrings(state->at(@2), false, new std::vector<std::pair<PosIdx, Expr *> >({{state->at(@1), $1}, {state->at(@3), $3}})); }
| expr_op '-' expr_op { $$ = new ExprCall(state->at(@2), new ExprVar(state->s.sub), {$1, $3}); }
| expr_op '*' expr_op { $$ = new ExprCall(state->at(@2), new ExprVar(state->s.mul), {$1, $3}); }
| expr_op '/' expr_op { $$ = new ExprCall(state->at(@2), new ExprVar(state->s.div), {$1, $3}); }
| expr_op CONCAT expr_op { $$ = new ExprOpConcatLists(state->at(@2), $1, $3); }
| expr_app
;
expr_app
: expr_app expr_select {
if (auto e2 = dynamic_cast<ExprCall *>($1)) {
e2->args.push_back($2);
$$ = $1;
} else
$$ = new ExprCall(CUR_POS, $1, {$2});
}
| expr_select
;
expr_select
: expr_simple '.' attrpath
{ $$ = new ExprSelect(CUR_POS, $1, std::move(*$3), nullptr); delete $3; }
| expr_simple '.' attrpath OR_KW expr_select
{ $$ = new ExprSelect(CUR_POS, $1, std::move(*$3), $5); delete $3; }
| /* Backwards compatibility: because Nixpkgs has a rarely used
function named or, allow stuff like map or [...]. */
expr_simple OR_KW
{ $$ = new ExprCall(CUR_POS, $1, {new ExprVar(CUR_POS, state->s.or_)}); }
| expr_simple
;
expr_simple
: ID {
std::string_view s = "__curPos";
if ($1.l == s.size() && strncmp($1.p, s.data(), s.size()) == 0)
$$ = new ExprPos(CUR_POS);
else
$$ = new ExprVar(CUR_POS, state->symbols.create($1));
}
| INT { $$ = new ExprInt($1); }
| FLOAT { $$ = new ExprFloat($1); }
| '"' string_parts '"' { $$ = $2; }
| IND_STRING_OPEN ind_string_parts IND_STRING_CLOSE {
$$ = state->stripIndentation(CUR_POS, std::move(*$2));
delete $2;
}
| path_start PATH_END
| path_start string_parts_interpolated PATH_END {
$2->insert($2->begin(), {state->at(@1), $1});
$$ = new ExprConcatStrings(CUR_POS, false, $2);
}
| SPATH {
std::string path($1.p + 1, $1.l - 2);
$$ = new ExprCall(CUR_POS,
new ExprVar(state->s.findFile),
{new ExprVar(state->s.nixPath),
new ExprString(std::move(path))});
}
| URI {
static bool noURLLiterals = experimentalFeatureSettings.isEnabled(Xp::NoUrlLiterals);
if (noURLLiterals)
throw ParseError({
.msg = HintFmt("URL literals are disabled"),
.pos = state->positions[CUR_POS]
});
$$ = new ExprString(std::string($1));
}
| '(' expr ')' { $$ = $2; }
/* Let expressions `let {..., body = ...}' are just desugared
into `(rec {..., body = ...}).body'. */
| LET '{' binds '}'
{ $3->recursive = true; $$ = new ExprSelect(noPos, $3, state->s.body); }
| REC '{' binds '}'
{ $3->recursive = true; $$ = $3; }
| '{' binds '}'
{ $$ = $2; }
| '[' expr_list ']' { $$ = $2; }
;
string_parts
: STR { $$ = new ExprString(std::string($1)); }
| string_parts_interpolated { $$ = new ExprConcatStrings(CUR_POS, true, $1); }
| { $$ = new ExprString(""); }
;
string_parts_interpolated
: string_parts_interpolated STR
{ $$ = $1; $1->emplace_back(state->at(@2), new ExprString(std::string($2))); }
| string_parts_interpolated DOLLAR_CURLY expr '}' { $$ = $1; $1->emplace_back(state->at(@2), $3); }
| DOLLAR_CURLY expr '}' { $$ = new std::vector<std::pair<PosIdx, Expr *>>; $$->emplace_back(state->at(@1), $2); }
| STR DOLLAR_CURLY expr '}' {
$$ = new std::vector<std::pair<PosIdx, Expr *>>;
$$->emplace_back(state->at(@1), new ExprString(std::string($1)));
$$->emplace_back(state->at(@2), $3);
}
;
path_start
: PATH {
Path path(absPath({$1.p, $1.l}, state->basePath.path.abs()));
/* add back in the trailing '/' to the first segment */
if ($1.p[$1.l-1] == '/' && $1.l > 1)
path += "/";
$$ = new ExprPath(path);
}
| HPATH {
if (evalSettings.pureEval) {
throw Error(
"the path '%s' can not be resolved in pure mode",
std::string_view($1.p, $1.l)
);
}
Path path(getHome() + std::string($1.p + 1, $1.l - 1));
$$ = new ExprPath(path);
}
;
ind_string_parts
: ind_string_parts IND_STR { $$ = $1; $1->emplace_back(state->at(@2), $2); }
| ind_string_parts DOLLAR_CURLY expr '}' { $$ = $1; $1->emplace_back(state->at(@2), $3); }
| { $$ = new std::vector<std::pair<PosIdx, std::variant<Expr *, StringToken>>>; }
;
binds
: binds attrpath '=' expr ';' { $$ = $1; state->addAttr($$, std::move(*$2), $4, state->at(@2)); delete $2; }
| binds INHERIT attrs ';'
{ $$ = $1;
for (auto & i : *$3) {
if ($$->attrs.find(i.symbol) != $$->attrs.end())
state->dupAttr(i.symbol, state->at(@3), $$->attrs[i.symbol].pos);
auto pos = state->at(@3);
$$->attrs.emplace(
i.symbol,
ExprAttrs::AttrDef(new ExprVar(CUR_POS, i.symbol), pos, ExprAttrs::AttrDef::Kind::Inherited));
}
delete $3;
}
| binds INHERIT '(' expr ')' attrs ';'
{ $$ = $1;
if (!$$->inheritFromExprs)
$$->inheritFromExprs = std::make_unique<std::vector<Expr *>>();
$$->inheritFromExprs->push_back($4);
auto from = new nix::ExprInheritFrom(state->at(@4), $$->inheritFromExprs->size() - 1);
for (auto & i : *$6) {
if ($$->attrs.find(i.symbol) != $$->attrs.end())
state->dupAttr(i.symbol, state->at(@6), $$->attrs[i.symbol].pos);
$$->attrs.emplace(
i.symbol,
ExprAttrs::AttrDef(
new ExprSelect(CUR_POS, from, i.symbol),
state->at(@6),
ExprAttrs::AttrDef::Kind::InheritedFrom));
}
delete $6;
}
| { $$ = new ExprAttrs(state->at(@0)); }
;
attrs
: attrs attr { $$ = $1; $1->push_back(AttrName(state->symbols.create($2))); }
| attrs string_attr
{ $$ = $1;
ExprString * str = dynamic_cast<ExprString *>($2);
if (str) {
$$->push_back(AttrName(state->symbols.create(str->s)));
delete str;
} else
throw ParseError({
.msg = HintFmt("dynamic attributes not allowed in inherit"),
.pos = state->positions[state->at(@2)]
});
}
| { $$ = new AttrPath; }
;
attrpath
: attrpath '.' attr { $$ = $1; $1->push_back(AttrName(state->symbols.create($3))); }
| attrpath '.' string_attr
{ $$ = $1;
ExprString * str = dynamic_cast<ExprString *>($3);
if (str) {
$$->push_back(AttrName(state->symbols.create(str->s)));
delete str;
} else
$$->push_back(AttrName($3));
}
| attr { $$ = new std::vector<AttrName>; $$->push_back(AttrName(state->symbols.create($1))); }
| string_attr
{ $$ = new std::vector<AttrName>;
ExprString *str = dynamic_cast<ExprString *>($1);
if (str) {
$$->push_back(AttrName(state->symbols.create(str->s)));
delete str;
} else
$$->push_back(AttrName($1));
}
;
attr
: ID
| OR_KW { $$ = {"or", 2}; }
;
string_attr
: '"' string_parts '"' { $$ = $2; }
| DOLLAR_CURLY expr '}' { $$ = $2; }
;
expr_list
: expr_list expr_select { $$ = $1; $1->elems.push_back($2); /* !!! dangerous */ }
| { $$ = new ExprList; }
;
formals
: formal ',' formals
{ $$ = $3; $$->formals.emplace_back(*$1); delete $1; }
| formal
{ $$ = new Formals; $$->formals.emplace_back(*$1); $$->ellipsis = false; delete $1; }
|
{ $$ = new Formals; $$->ellipsis = false; }
| ELLIPSIS
{ $$ = new Formals; $$->ellipsis = true; }
;
formal
: ID { $$ = new Formal{CUR_POS, state->symbols.create($1), 0}; }
| ID '?' expr { $$ = new Formal{CUR_POS, state->symbols.create($1), $3}; }
;
%%
#include "eval.hh"
namespace nix {
Expr * parseExprFromBuf(
char * text,
size_t length,
Pos::Origin origin,
const SourcePath & basePath,
SymbolTable & symbols,
PosTable & positions,
const Expr::AstSymbols & astSymbols)
{
yyscan_t scanner;
ParserState state {
.symbols = symbols,
.positions = positions,
.basePath = basePath,
.origin = {origin},
.s = astSymbols,
};
yylex_init(&scanner);
Finally _destroy([&] { yylex_destroy(scanner); });
yy_scan_buffer(text, length, scanner);
yyparse(scanner, &state);
return state.result;
}
}


@ -0,0 +1,65 @@
#pragma once
#include <tao/pegtl.hpp>
namespace nix::parser {
// modified copy of change_state, as the manual suggests for more involved
// state manipulation. we want to change only the first state parameter,
// and we care about the *initial* position of a rule application (not the
// past-the-end position as pegtl change_state provides)
template<typename NewState>
struct change_head : tao::pegtl::maybe_nothing
{
template<
typename Rule,
tao::pegtl::apply_mode A,
tao::pegtl::rewind_mode M,
template<typename...> class Action,
template<typename...> class Control,
typename ParseInput,
typename State,
typename... States
>
[[nodiscard]] static bool match(ParseInput & in, State && st, States &&... sts)
{
const auto begin = in.iterator();
if constexpr (std::is_constructible_v<NewState, State, States...>) {
NewState s(st, sts...);
if (tao::pegtl::match<Rule, A, M, Action, Control>(in, s, sts...)) {
if constexpr (A == tao::pegtl::apply_mode::action) {
_success<Action<Rule>>(0, begin, in, s, st, sts...);
}
return true;
}
return false;
} else if constexpr (std::is_default_constructible_v<NewState>) {
NewState s;
if (tao::pegtl::match<Rule, A, M, Action, Control>(in, s, sts...)) {
if constexpr (A == tao::pegtl::apply_mode::action) {
_success<Action<Rule>>(0, begin, in, s, st, sts...);
}
return true;
}
return false;
} else {
static_assert(decltype(sizeof(NewState))(), "unable to instantiate new state");
}
}
template<typename Target, typename ParseInput, typename... S>
static void _success(void *, auto & begin, ParseInput & in, S & ... sts)
{
const typename ParseInput::action_t at(begin, in);
Target::success(at, sts...);
}
template<typename Target, typename... S>
static void _success(decltype(Target::success0(std::declval<S &>()...), 0), auto &, auto &, S & ... sts)
{
Target::success0(sts...);
}
};
}
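
The two `_success` overloads above pick between `Target::success` and `Target::success0` at compile time: the literal `0` is an exact match for an `int` parameter, but only a null pointer conversion for `void *`, and the `int` overload exists only when the `success0` call is well-formed. A minimal sketch of that dispatch without PEGTL follows. `WithSuccess0`, `WithSuccess`, `dispatch`, and `run` are hypothetical names, and the `int pos` argument merely stands in for the input range the real code passes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical action types: one provides only success0, the other only success.
struct WithSuccess0
{
    static void success0(std::string & log) { log += "success0"; }
};

struct WithSuccess
{
    static void success(int pos, std::string & log) { log += "success@" + std::to_string(pos); }
};

// Fallback overload: the literal 0 reaches void * via a null pointer conversion.
template<typename Target, typename... S>
void dispatch(void *, int pos, S &... sts)
{
    Target::success(pos, sts...);
}

// Preferred overload: when Target::success0(sts...) is well-formed, the first
// parameter has type int, an exact match for the literal 0. When it is not,
// substitution fails and this overload drops out of the candidate set.
template<typename Target, typename... S>
void dispatch(decltype(Target::success0(std::declval<S &>()...), 0), int, S &... sts)
{
    Target::success0(sts...);
}

// Runs the dispatch for one target and returns what the chosen hook logged.
template<typename Target>
std::string run(int pos)
{
    std::string log;
    dispatch<Target>(0, pos, log);
    return log;
}
```

Only the selected overload's body is instantiated, so a target that lacks `success` compiles as long as it provides `success0`, and vice versa.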


@ -0,0 +1,706 @@
#pragma once
#include "tao/pegtl.hpp"
#include <type_traits>
#include <variant>
#include <boost/container/small_vector.hpp>
// NOTE
// nix line endings are \n, \r\n, \r. the grammar does not use eol or
// eolf rules in favor of reproducing the old flex lexer as faithfully as
// possible, and deferring calculation of positions to downstream users.
namespace nix::parser::grammar {
using namespace tao::pegtl;
namespace p = tao::pegtl;
// character classes
namespace c {
struct path : sor<
ranges<'a', 'z', 'A', 'Z', '0', '9'>,
one<'.', '_', '-', '+'>
> {};
struct path_sep : one<'/'> {};
struct id_first : ranges<'a', 'z', 'A', 'Z', '_'> {};
struct id_rest : sor<
ranges<'a', 'z', 'A', 'Z', '0', '9'>,
one<'_', '\'', '-'>
> {};
struct uri_scheme_first : ranges<'a', 'z', 'A', 'Z'> {};
struct uri_scheme_rest : sor<
ranges<'a', 'z', 'A', 'Z', '0', '9'>,
one<'+', '-', '.'>
> {};
struct uri_sep : one<':'> {};
struct uri_rest : sor<
ranges<'a', 'z', 'A', 'Z', '0', '9'>,
one<'%', '/', '?', ':', '@', '&', '=', '+', '$', ',', '-', '_', '.', '!', '~', '*', '\''>
> {};
}
// "tokens". PEGs don't really care about tokens, we merely use them as a convenient
// way of writing down keywords and a couple complicated syntax rules.
namespace t {
struct _extend_as_path : seq<
star<c::path>,
not_at<TAO_PEGTL_STRING("/*")>,
not_at<TAO_PEGTL_STRING("//")>,
c::path_sep,
sor<c::path, TAO_PEGTL_STRING("${")>
> {};
struct _extend_as_uri : seq<
star<c::uri_scheme_rest>,
c::uri_sep,
c::uri_rest
> {};
// keywords might be extended to identifiers, paths, or uris.
// NOTE this assumes that keywords are a-zA-Z only, otherwise uri schemes would never
// match correctly.
// NOTE not a simple seq<...> because this would report incorrect positions for
// keywords used inside must<> if a prefix of the keyword matches.
template<typename S>
struct _keyword : sor<
seq<
S,
not_at<c::id_rest>,
not_at<_extend_as_path>,
not_at<_extend_as_uri>
>,
failure
> {};
struct kw_if : _keyword<TAO_PEGTL_STRING("if")> {};
struct kw_then : _keyword<TAO_PEGTL_STRING("then")> {};
struct kw_else : _keyword<TAO_PEGTL_STRING("else")> {};
struct kw_assert : _keyword<TAO_PEGTL_STRING("assert")> {};
struct kw_with : _keyword<TAO_PEGTL_STRING("with")> {};
struct kw_let : _keyword<TAO_PEGTL_STRING("let")> {};
struct kw_in : _keyword<TAO_PEGTL_STRING("in")> {};
struct kw_rec : _keyword<TAO_PEGTL_STRING("rec")> {};
struct kw_inherit : _keyword<TAO_PEGTL_STRING("inherit")> {};
struct kw_or : _keyword<TAO_PEGTL_STRING("or")> {};
// `-` can be a unary prefix op, a binary infix op, or the first character
// of a path or -> (ex 1->1--1)
// `/` can be a path leader or an operator (ex a?a /a)
struct op_minus : seq<one<'-'>, not_at<one<'>'>>, not_at<_extend_as_path>> {};
struct op_div : seq<one<'/'>, not_at<c::path>> {};
// match a rule, making sure we are not matching it where a keyword would match.
// using minus like this is a lot faster than flipping the order and using seq.
template<typename... Rules>
struct _not_at_any_keyword : minus<
seq<Rules...>,
sor<
TAO_PEGTL_STRING("inherit"),
TAO_PEGTL_STRING("assert"),
TAO_PEGTL_STRING("else"),
TAO_PEGTL_STRING("then"),
TAO_PEGTL_STRING("with"),
TAO_PEGTL_STRING("let"),
TAO_PEGTL_STRING("rec"),
TAO_PEGTL_STRING("if"),
TAO_PEGTL_STRING("in"),
TAO_PEGTL_STRING("or")
>
> {};
// identifiers are kind of horrid:
//
// - uri_scheme_first ⊂ id_first
// - uri_scheme_first ⊂ uri_scheme_rest ⊂ path
// - id_first ⊂ id_rest ∖ { ' } ⊂ path
// - id_first ∩ (path ∖ uri_scheme_first) = { _ }
// - uri_sep ∉ { id_first, id_rest, uri_scheme_first, uri_scheme_rest, path }
// - path_sep ∉ { id_first, id_rest, uri_scheme_first, uri_scheme_rest }
//
// and we want, without reading the input more than once, a string that
// matches (id_first id_rest*) and is not followed by any number of
// characters such that the extended string matches path or uri rules.
//
// since the first character must be either _ or a uri scheme character
// we can ignore path-like bits at the beginning. uri_sep cannot appear anywhere
// in an identifier, so it's only needed in lookahead checks at the uri-like
// prefix. likewise path_sep cannot appear anywhere in the identifier, so it's
// only needed in lookahead checks in the path-like prefix.
//
// in total that gives us a decomposition of
//
// (uri-scheme-like? (?! continues-as-uri) | _)
// (path-segment-like? (?! continues-as-path))
// id_rest*
struct identifier : _not_at_any_keyword<
// we don't use (at<id_rest>, ...) matches here because identifiers are
// a really hot path and rewinding as needed by at<> isn't entirely free.
sor<
seq<
c::uri_scheme_first,
star<ranges<'a', 'z', 'A', 'Z', '0', '9', '-'>>,
not_at<_extend_as_uri>
>,
one<'_'>
>,
star<sor<ranges<'a', 'z', 'A', 'Z', '0', '9'>, one<'_', '-'>>>,
not_at<_extend_as_path>,
star<c::id_rest>
> {};
// floats may extend ints, thus these rules are very similar.
struct integer : seq<
sor<
seq<range<'1', '9'>, star<digit>, not_at<one<'.'>>>,
seq<one<'0'>, not_at<one<'.'>, digit>, star<digit>>
>,
not_at<_extend_as_path>
> {};
struct floating : seq<
sor<
seq<range<'1', '9'>, star<digit>, one<'.'>, star<digit>>,
seq<opt<one<'0'>>, one<'.'>, plus<digit>>
>,
opt<one<'E', 'e'>, opt<one<'+', '-'>>, plus<digit>>,
not_at<_extend_as_path>
> {};
struct uri : seq<
c::uri_scheme_first,
star<c::uri_scheme_rest>,
c::uri_sep,
plus<c::uri_rest>
> {};
struct sep : sor<
plus<one<' ', '\t', '\r', '\n'>>,
seq<one<'#'>, star<not_one<'\r', '\n'>>>,
seq<string<'/', '*'>, until<string<'*', '/'>>>
> {};
}
using seps = star<t::sep>;
// marker for semantic rules. not handling one of these in an action that cares about
// semantics is probably an error.
struct semantic {};
struct expr;
struct _string {
template<typename... Inner>
struct literal : semantic, seq<Inner...> {};
struct cr_lf : semantic, seq<one<'\r'>, opt<one<'\n'>>> {};
struct interpolation : semantic, seq<
p::string<'$', '{'>, seps,
must<expr>, seps,
must<one<'}'>>
> {};
struct escape : semantic, must<any> {};
};
struct string : _string, seq<
one<'"'>,
star<
sor<
_string::literal<plus<not_one<'$', '"', '\\', '\r'>>>,
_string::cr_lf,
_string::interpolation,
_string::literal<one<'$'>, opt<one<'$'>>>,
seq<one<'\\'>, _string::escape>
>
>,
must<one<'"'>>
> {};
struct _ind_string {
template<bool Indented, typename... Inner>
struct literal : semantic, seq<Inner...> {};
struct interpolation : semantic, seq<
p::string<'$', '{'>, seps,
must<expr>, seps,
must<one<'}'>>
> {};
struct escape : semantic, must<any> {};
};
struct ind_string : _ind_string, seq<
TAO_PEGTL_STRING("''"),
opt<star<one<' '>>, one<'\n'>>,
star<
sor<
_ind_string::literal<
true,
plus<
sor<
not_one<'$', '\''>,
seq<one<'$'>, not_one<'{', '\''>>,
seq<one<'\''>, not_one<'\'', '$'>>
>
>
>,
_ind_string::interpolation,
_ind_string::literal<false, one<'$'>>,
_ind_string::literal<false, one<'\''>, not_at<one<'\''>>>,
seq<one<'\''>, _ind_string::literal<false, p::string<'\'', '\''>>>,
seq<
p::string<'\'', '\''>,
sor<
_ind_string::literal<false, one<'$'>>,
seq<one<'\\'>, _ind_string::escape>
>
>
>
>,
must<TAO_PEGTL_STRING("''")>
> {};
struct _path {
// legacy lexer rules. extra l_ to avoid reserved c++ identifiers.
struct _l_PATH : seq<star<c::path>, plus<c::path_sep, plus<c::path>>, opt<c::path_sep>> {};
struct _l_PATH_SEG : seq<star<c::path>, c::path_sep> {};
struct _l_HPATH : seq<one<'~'>, plus<c::path_sep, plus<c::path>>, opt<c::path_sep>> {};
struct _l_HPATH_START : TAO_PEGTL_STRING("~/") {};
struct _path_str : sor<_l_PATH, _l_PATH_SEG, plus<c::path>> {};
// modern rules
template<typename... Inner>
struct literal : semantic, seq<Inner...> {};
struct interpolation : semantic, seq<
p::string<'$', '{'>, seps,
must<expr>, seps,
must<one<'}'>>
> {};
struct anchor : semantic, sor<
_l_PATH,
seq<_l_PATH_SEG, at<TAO_PEGTL_STRING("${")>>
> {};
struct home_anchor : semantic, sor<
_l_HPATH,
seq<_l_HPATH_START, at<TAO_PEGTL_STRING("${")>>
> {};
struct searched_path : semantic, list<plus<c::path>, c::path_sep> {};
struct forbid_prefix_triple_slash : sor<not_at<c::path_sep>, failure> {};
struct forbid_prefix_double_slash_no_interp : sor<
not_at<c::path_sep, star<c::path>, not_at<TAO_PEGTL_STRING("${")>>,
failure
> {};
// legacy parser rules
struct _str_rest : seq<
must<forbid_prefix_double_slash_no_interp>,
opt<literal<_path_str>>,
must<forbid_prefix_triple_slash>,
star<
sor<
literal<_path_str>,
interpolation
>
>
> {};
};
struct path : _path, sor<
seq<
sor<_path::anchor, _path::home_anchor>,
_path::_str_rest
>,
seq<one<'<'>, _path::searched_path, one<'>'>>
> {};
struct _formal {
struct name : semantic, t::identifier {};
struct default_value : semantic, must<expr> {};
};
struct formal : semantic, _formal, seq<
_formal::name,
opt<seps, one<'?'>, seps, _formal::default_value>
> {};
struct _formals {
struct ellipsis : semantic, p::ellipsis {};
};
struct formals : semantic, _formals, seq<
one<'{'>, seps,
// formals and attrsets share a two-token head sequence ('{' <id>).
// this rule unrolls the formals list a bit to provide better error messages than
// "expected '='" at the first ',' if formals are incorrect.
sor<
one<'}'>,
seq<_formals::ellipsis, seps, must<one<'}'>>>,
seq<
formal, seps,
if_then_else<
at<one<','>>,
seq<
star<one<','>, seps, formal, seps>,
opt<one<','>, seps, opt<_formals::ellipsis, seps>>,
must<one<'}'>>
>,
one<'}'>
>
>
>
> {};
struct _attr {
struct simple : semantic, sor<t::identifier, t::kw_or> {};
struct string : semantic, seq<grammar::string> {};
struct expr : semantic, seq<
TAO_PEGTL_STRING("${"), seps,
must<grammar::expr>, seps,
must<one<'}'>>
> {};
};
struct attr : _attr, sor<
_attr::simple,
_attr::string,
_attr::expr
> {};
struct attrpath : list<attr, one<'.'>, t::sep> {};
struct _inherit {
struct from : semantic, must<expr> {};
struct attrs : list<attr, seps> {};
};
struct inherit : _inherit, seq<
t::kw_inherit, seps,
opt<one<'('>, seps, _inherit::from, seps, must<one<')'>>, seps>,
opt<_inherit::attrs, seps>,
must<one<';'>>
> {};
struct _binding {
struct path : semantic, attrpath {};
struct equal : one<'='> {};
struct value : semantic, must<expr> {};
};
struct binding : _binding, seq<
_binding::path, seps,
must<_binding::equal>, seps,
_binding::value, seps,
must<one<';'>>
> {};
struct bindings : opt<list<sor<inherit, binding>, seps>> {};
struct op {
enum class kind {
// NOTE non-associativity is *NOT* handled in the grammar structure.
// handling it in the grammar itself instead of in semantic actions
// slows down the parser significantly and makes the rules *much*
// harder to read. maybe this will be different at some point when
// ! does not sit between two binary precedence levels.
nonAssoc,
leftAssoc,
rightAssoc,
unary,
};
template<typename Rule, unsigned Precedence, kind Kind = kind::leftAssoc>
struct _op : Rule {
static constexpr unsigned precedence = Precedence;
static constexpr op::kind kind = Kind;
};
struct unary_minus : _op<t::op_minus, 3, kind::unary> {};
// treating this like a unary postfix operator is sketchy, but that's
// the most reasonable way to implement the operator precedence set forth
// by the language way back. it'd be much better if `.` and `?` had the same
// precedence, but alas.
struct has_attr : _op<seq<one<'?'>, seps, must<attrpath>>, 4> {};
struct concat : _op<TAO_PEGTL_STRING("++"), 5, kind::rightAssoc> {};
struct mul : _op<one<'*'>, 6> {};
struct div : _op<t::op_div, 6> {};
struct plus : _op<one<'+'>, 7> {};
struct minus : _op<t::op_minus, 7> {};
struct not_ : _op<one<'!'>, 8, kind::unary> {};
struct update : _op<TAO_PEGTL_STRING("//"), 9, kind::rightAssoc> {};
struct less_eq : _op<TAO_PEGTL_STRING("<="), 10, kind::nonAssoc> {};
struct greater_eq : _op<TAO_PEGTL_STRING(">="), 10, kind::nonAssoc> {};
struct less : _op<one<'<'>, 10, kind::nonAssoc> {};
struct greater : _op<one<'>'>, 10, kind::nonAssoc> {};
struct equals : _op<TAO_PEGTL_STRING("=="), 11, kind::nonAssoc> {};
struct not_equals : _op<TAO_PEGTL_STRING("!="), 11, kind::nonAssoc> {};
struct and_ : _op<TAO_PEGTL_STRING("&&"), 12> {};
struct or_ : _op<TAO_PEGTL_STRING("||"), 13> {};
struct implies : _op<TAO_PEGTL_STRING("->"), 14, kind::rightAssoc> {};
};
struct _expr {
template<template<typename...> class OpenMod = seq, typename... Init>
struct _attrset : seq<
Init...,
OpenMod<one<'{'>>, seps,
bindings, seps,
must<one<'}'>>
> {};
struct select;
struct id : semantic, t::identifier {};
struct int_ : semantic, t::integer {};
struct float_ : semantic, t::floating {};
struct string : semantic, seq<grammar::string> {};
struct ind_string : semantic, seq<grammar::ind_string> {};
struct path : semantic, seq<grammar::path> {};
struct uri : semantic, t::uri {};
struct ancient_let : semantic, _attrset<must, t::kw_let, seps> {};
struct rec_set : semantic, _attrset<must, t::kw_rec, seps> {};
struct set : semantic, _attrset<> {};
struct _list {
struct entry : semantic, seq<select> {};
};
struct list : semantic, _list, seq<
one<'['>, seps,
opt<p::list<_list::entry, seps>, seps>,
must<one<']'>>
> {};
struct _simple : sor<
id,
int_,
float_,
string,
ind_string,
path,
uri,
seq<one<'('>, seps, must<expr>, seps, must<one<')'>>>,
ancient_let,
rec_set,
set,
list
> {};
struct _select {
struct head : _simple {};
struct attr : semantic, seq<attrpath> {};
struct attr_or : semantic, must<select> {};
struct as_app_or : semantic, t::kw_or {};
};
struct _app {
struct first_arg : semantic, seq<select> {};
struct another_arg : semantic, seq<select> {};
// can be used to stash a position of the application head node
struct select_or_fn : seq<select> {};
};
struct select : _select, seq<
_select::head, seps,
opt<
sor<
seq<
one<'.'>, seps, _select::attr,
opt<seps, t::kw_or, seps, _select::attr_or>
>,
_select::as_app_or
>
>
> {};
struct app : _app, seq<
_app::select_or_fn,
opt<seps, _app::first_arg, star<seps, _app::another_arg>>
> {};
template<typename Op>
struct operator_ : semantic, Op {};
struct unary : seq<
star<sor<operator_<op::not_>, operator_<op::unary_minus>>, seps>,
app
> {};
struct _binary_operator : sor<
operator_<op::implies>,
operator_<op::update>,
operator_<op::concat>,
operator_<op::plus>,
operator_<op::minus>,
operator_<op::mul>,
operator_<op::div>,
operator_<op::less_eq>,
operator_<op::greater_eq>,
operator_<op::less>,
operator_<op::greater>,
operator_<op::equals>,
operator_<op::not_equals>,
operator_<op::or_>,
operator_<op::and_>
> {};
struct _binop : seq<
unary,
star<
seps,
sor<
seq<_binary_operator, seps, must<unary>>,
operator_<op::has_attr>
>
>
> {};
struct _lambda {
struct arg : semantic, t::identifier {};
};
struct lambda : semantic, _lambda, sor<
seq<
_lambda::arg, seps,
sor<
seq<one<':'>, seps, must<expr>>,
seq<one<'@'>, seps, must<formals, seps, one<':'>, seps, expr>>
>
>,
seq<
formals, seps,
sor<
seq<one<':'>, seps, must<expr>>,
seq<one<'@'>, seps, must<_lambda::arg, seps, one<':'>, seps, expr>>
>
>
> {};
struct assert_ : semantic, seq<
t::kw_assert, seps,
must<expr>, seps,
must<one<';'>>, seps,
must<expr>
> {};
struct with : semantic, seq<
t::kw_with, seps,
must<expr>, seps,
must<one<';'>>, seps,
must<expr>
> {};
struct let : seq<
t::kw_let, seps,
not_at<one<'{'>>, // exclude ancient_let so we can must<kw_in>
bindings, seps,
must<t::kw_in>, seps,
must<expr>
> {};
struct if_ : semantic, seq<
t::kw_if, seps,
must<expr>, seps,
must<t::kw_then>, seps,
must<expr>, seps,
must<t::kw_else>, seps,
must<expr>
> {};
};
struct expr : semantic, _expr, sor<
_expr::lambda,
_expr::assert_,
_expr::with,
_expr::let,
_expr::if_,
_expr::_binop
> {};
// legacy support: \0 terminates input if passed from flex to bison as a token
struct eof : sor<p::eof, one<0>> {};
struct root : must<seps, expr, seps, eof> {};
template<typename Rule>
struct nothing : p::nothing<Rule> {
static_assert(!std::is_base_of_v<semantic, Rule>);
};
template<typename Self, typename OpCtx, typename AttrPathT, typename ExprT>
struct operator_semantics {
struct has_attr : grammar::op::has_attr {
AttrPathT path;
};
struct OpEntry {
OpCtx ctx;
uint8_t prec;
grammar::op::kind assoc;
std::variant<
grammar::op::not_,
grammar::op::unary_minus,
grammar::op::implies,
grammar::op::or_,
grammar::op::and_,
grammar::op::equals,
grammar::op::not_equals,
grammar::op::less_eq,
grammar::op::greater_eq,
grammar::op::update,
grammar::op::concat,
grammar::op::less,
grammar::op::greater,
grammar::op::plus,
grammar::op::minus,
grammar::op::mul,
grammar::op::div,
has_attr
> op;
};
// statistics here are taken from nixpkgs commit de502c4d0ba96261e5de803e4d1d1925afd3e22f.
// over 99.9% of contexts in nixpkgs need at most 4 slots, ~85% need only 1
boost::container::small_vector<ExprT, 4> exprs;
// over 99.9% of contexts in nixpkgs need at most 2 slots, ~85% need only 1
boost::container::small_vector<OpEntry, 2> ops;
// derived class is expected to define members:
//
// ExprT applyOp(OpCtx & pos, auto & op, auto &... args);
// [[noreturn]] static void badOperator(OpCtx & pos, auto &... args);
void reduce(uint8_t toPrecedence, auto &... args) {
while (!ops.empty()) {
auto & [ctx, precedence, kind, op] = ops.back();
// NOTE this relies on associativity not being mixed within a precedence level.
if ((precedence > toPrecedence)
|| (kind != grammar::op::kind::leftAssoc && precedence == toPrecedence))
break;
std::visit([&, ctx=std::move(ctx)] (auto & op) {
exprs.push_back(static_cast<Self &>(*this).applyOp(ctx, op, args...));
}, op);
ops.pop_back();
}
}
ExprT popExpr()
{
auto r = std::move(exprs.back());
exprs.pop_back();
return r;
}
void pushOp(OpCtx ctx, auto o, auto &... args)
{
if (o.kind != grammar::op::kind::unary)
reduce(o.precedence, args...);
if (!ops.empty() && o.kind == grammar::op::kind::nonAssoc) {
auto & [_pos, _prec, _kind, _o] = ops.back();
if (_kind == o.kind && _prec == o.precedence)
Self::badOperator(ctx, args...);
}
ops.emplace_back(ctx, o.precedence, o.kind, std::move(o));
}
ExprT finish(auto &... args)
{
reduce(255, args...);
return popExpr();
}
};
}
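
The reduce/pushOp/finish logic in operator_semantics above can be hard to follow through the templates, so here is a minimal standalone sketch of the same two-stack precedence scheme applied to plain integers. It is not part of the patch. MiniExprState, Op, evaluate, ipow and the '^' operator are hypothetical names invented for illustration; only the break condition in reduce and the precedence values 6 and 7 for `*`, `+` and `-` mirror the code above. Unary and non-associative operators are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature of the operand/operator stacks in operator_semantics.
// As in the grammar's op table, a LOWER precedence number binds tighter.
enum class Kind { leftAssoc, rightAssoc };

struct Op {
    char symbol;
    std::uint8_t precedence;
    Kind kind;
};

// `*`, `+`, `-` reuse the precedence levels from the patch; `^` is invented
// here purely to demonstrate right associativity.
constexpr Op kMul{'*', 6, Kind::leftAssoc};
constexpr Op kPlus{'+', 7, Kind::leftAssoc};
constexpr Op kMinus{'-', 7, Kind::leftAssoc};
constexpr Op kPow{'^', 5, Kind::rightAssoc};

struct MiniExprState {
    std::vector<long> values;
    std::vector<Op> ops;

    long pop() {
        assert(!values.empty());
        long v = values.back();
        values.pop_back();
        return v;
    }

    // Integer power for non-negative exponents.
    static long ipow(long base, long exp) {
        long r = 1;
        for (long i = 0; i < exp; i++) r *= base;
        return r;
    }

    // Pop the top operator and its two operands, push the result.
    void applyTop() {
        Op op = ops.back();
        ops.pop_back();
        long right = pop(), left = pop();
        switch (op.symbol) {
        case '+': values.push_back(left + right); break;
        case '-': values.push_back(left - right); break;
        case '*': values.push_back(left * right); break;
        case '^': values.push_back(ipow(left, right)); break;
        default: assert(false && "unknown operator");
        }
    }

    // Apply stacked operators until the top one binds looser than
    // toPrecedence, or binds equally but is not left-associative.
    void reduce(std::uint8_t toPrecedence) {
        while (!ops.empty()) {
            const Op & top = ops.back();
            if (top.precedence > toPrecedence
                || (top.kind != Kind::leftAssoc && top.precedence == toPrecedence))
                break;
            applyTop();
        }
    }

    void pushValue(long v) { values.push_back(v); }
    void pushOp(Op op) { reduce(op.precedence); ops.push_back(op); }
    long finish() { reduce(255); return pop(); }
};

// Convenience driver: operands and operators alternate, so
// evaluate({1, 2, 3}, {kMinus, kMinus}) stands for `1 - 2 - 3`.
long evaluate(const std::vector<long> & operands, const std::vector<Op> & operators) {
    assert(operands.size() == operators.size() + 1);
    MiniExprState s;
    s.pushValue(operands[0]);
    for (std::size_t i = 0; i < operators.size(); i++) {
        s.pushOp(operators[i]);
        s.pushValue(operands[i + 1]);
    }
    return s.finish();
}
```

With these definitions, `evaluate({1, 2, 3}, {kMinus, kMinus})` groups as `(1 - 2) - 3`, while `evaluate({2, 3, 2}, {kPow, kPow})` groups as `2 ^ (3 ^ 2)`, because reduce stops at an equal-precedence operator that is not left-associative.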

@ -0,0 +1,852 @@
#include "attr-set.hh"
#include "error.hh"
#include "eval.hh"
#include "eval-settings.hh"
#include "finally.hh"
#include "nixexpr.hh"
#include "symbol-table.hh"
#include "change_head.hh"
#include "grammar.hh"
#include "state.hh"
#include <charconv>
#include <clocale>
#include <memory>
// flip this define when doing parser development to enable some grammar checks.
#if 0
#include <tao/pegtl/contrib/analyze.hpp>
#define ANALYZE_GRAMMAR \
([] { \
const std::size_t issues = tao::pegtl::analyze<grammar::root>(); \
assert(issues == 0); \
})()
#else
#define ANALYZE_GRAMMAR ((void) 0)
#endif
namespace p = tao::pegtl;
namespace nix::parser {
namespace {
template<typename>
inline constexpr const char * error_message = nullptr;
#define error_message_for(...) \
template<> inline constexpr auto error_message<__VA_ARGS__>
error_message_for(p::one<'{'>) = "expecting '{'";
error_message_for(p::one<'}'>) = "expecting '}'";
error_message_for(p::one<'"'>) = "expecting '\"'";
error_message_for(p::one<';'>) = "expecting ';'";
error_message_for(p::one<')'>) = "expecting ')'";
error_message_for(p::one<'='>) = "expecting '='";
error_message_for(p::one<']'>) = "expecting ']'";
error_message_for(p::one<':'>) = "expecting ':'";
error_message_for(p::string<'\'', '\''>) = "expecting \"''\"";
error_message_for(p::any) = "expecting any character";
error_message_for(grammar::eof) = "expecting end of file";
error_message_for(grammar::seps) = "expecting separators";
error_message_for(grammar::path::forbid_prefix_triple_slash) = "too many slashes in path";
error_message_for(grammar::path::forbid_prefix_double_slash_no_interp) = "path has a trailing slash";
error_message_for(grammar::expr) = "expecting expression";
error_message_for(grammar::expr::unary) = "expecting expression";
error_message_for(grammar::binding::equal) = "expecting '='";
error_message_for(grammar::expr::lambda::arg) = "expecting identifier";
error_message_for(grammar::formals) = "expecting formals";
error_message_for(grammar::attrpath) = "expecting attribute path";
error_message_for(grammar::expr::select) = "expecting selection expression";
error_message_for(grammar::t::kw_then) = "expecting 'then'";
error_message_for(grammar::t::kw_else) = "expecting 'else'";
error_message_for(grammar::t::kw_in) = "expecting 'in'";
struct SyntaxErrors
{
template<typename Rule>
static constexpr auto message = error_message<Rule>;
template<typename Rule>
static constexpr bool raise_on_failure = false;
};
template<typename Rule>
struct Control : p::must_if<SyntaxErrors>::control<Rule>
{
template<typename ParseInput, typename... States>
[[noreturn]] static void raise(const ParseInput & in, States &&... st)
{
if (in.empty()) {
std::string expected;
if constexpr (constexpr auto msg = error_message<Rule>)
expected = fmt(", %s", msg);
throw p::parse_error("unexpected end of file" + expected, in);
}
p::must_if<SyntaxErrors>::control<Rule>::raise(in, st...);
}
};
struct ExprState : grammar::operator_semantics<ExprState, PosIdx, AttrPath, std::unique_ptr<Expr>>
{
template<typename Op, typename... Args>
std::unique_ptr<Expr> applyUnary(Args &&... args) {
return std::make_unique<Op>(popExpr(), std::forward<Args>(args)...);
}
template<typename Op>
std::unique_ptr<Expr> applyBinary(PosIdx pos) {
auto right = popExpr(), left = popExpr();
return std::make_unique<Op>(pos, std::move(left), std::move(right));
}
std::unique_ptr<Expr> call(PosIdx pos, Symbol fn, bool flip = false)
{
std::vector<std::unique_ptr<Expr>> args(2);
args[flip ? 0 : 1] = popExpr();
args[flip ? 1 : 0] = popExpr();
return std::make_unique<ExprCall>(pos, std::make_unique<ExprVar>(fn), std::move(args));
}
std::unique_ptr<Expr> order(PosIdx pos, bool less, State & state)
{
return call(pos, state.s.lessThan, !less);
}
std::unique_ptr<Expr> concatStrings(PosIdx pos)
{
std::vector<std::pair<PosIdx, std::unique_ptr<Expr>>> args(2);
args[1].second = popExpr();
args[0].second = popExpr();
return std::make_unique<ExprConcatStrings>(pos, false, std::move(args));
}
std::unique_ptr<Expr> negate(PosIdx pos, State & state)
{
std::vector<std::unique_ptr<Expr>> args(2);
args[0] = std::make_unique<ExprInt>(0);
args[1] = popExpr();
return std::make_unique<ExprCall>(pos, std::make_unique<ExprVar>(state.s.sub), std::move(args));
}
std::unique_ptr<Expr> applyOp(PosIdx pos, auto & op, State & state) {
using Op = grammar::op;
auto not_ = [] (auto e) {
return std::make_unique<ExprOpNot>(std::move(e));
};
return (overloaded {
[&] (Op::implies) { return applyBinary<ExprOpImpl>(pos); },
[&] (Op::or_) { return applyBinary<ExprOpOr>(pos); },
[&] (Op::and_) { return applyBinary<ExprOpAnd>(pos); },
[&] (Op::equals) { return applyBinary<ExprOpEq>(pos); },
[&] (Op::not_equals) { return applyBinary<ExprOpNEq>(pos); },
[&] (Op::less) { return order(pos, true, state); },
[&] (Op::greater_eq) { return not_(order(pos, true, state)); },
[&] (Op::greater) { return order(pos, false, state); },
[&] (Op::less_eq) { return not_(order(pos, false, state)); },
[&] (Op::update) { return applyBinary<ExprOpUpdate>(pos); },
[&] (Op::not_) { return applyUnary<ExprOpNot>(); },
[&] (Op::plus) { return concatStrings(pos); },
[&] (Op::minus) { return call(pos, state.s.sub); },
[&] (Op::mul) { return call(pos, state.s.mul); },
[&] (Op::div) { return call(pos, state.s.div); },
[&] (Op::concat) { return applyBinary<ExprOpConcatLists>(pos); },
[&] (has_attr & a) { return applyUnary<ExprOpHasAttr>(std::move(a.path)); },
[&] (Op::unary_minus) { return negate(pos, state); },
})(op);
}
// always_inline is needed, otherwise pushOp slows down considerably
[[noreturn, gnu::always_inline]]
static void badOperator(PosIdx pos, State & state)
{
throw ParseError({
.msg = HintFmt("syntax error, unexpected operator"),
.pos = state.positions[pos]
});
}
template<typename Expr, typename... Args>
Expr & pushExpr(Args && ... args)
{
auto p = std::make_unique<Expr>(std::forward<Args>(args)...);
auto & result = *p;
exprs.emplace_back(std::move(p));
return result;
}
};
struct SubexprState {
private:
ExprState * up;
public:
explicit SubexprState(ExprState & up, auto &...) : up(&up) {}
operator ExprState &() { return *up; }
ExprState * operator->() { return up; }
};
template<typename Rule>
struct BuildAST : grammar::nothing<Rule> {};
struct LambdaState : SubexprState {
using SubexprState::SubexprState;
Symbol arg;
std::unique_ptr<Formals> formals;
};
struct FormalsState : SubexprState {
using SubexprState::SubexprState;
Formals formals{};
Formal formal{};
};
template<> struct BuildAST<grammar::formal::name> {
static void apply(const auto & in, FormalsState & s, State & ps) {
s.formal = {
.pos = ps.at(in),
.name = ps.symbols.create(in.string_view()),
};
}
};
template<> struct BuildAST<grammar::formal> {
static void apply0(FormalsState & s, State &) {
s.formals.formals.emplace_back(std::move(s.formal));
}
};
template<> struct BuildAST<grammar::formal::default_value> {
static void apply0(FormalsState & s, State & ps) {
s.formal.def = s->popExpr();
}
};
template<> struct BuildAST<grammar::formals::ellipsis> {
static void apply0(FormalsState & s, State &) {
s.formals.ellipsis = true;
}
};
template<> struct BuildAST<grammar::formals> : change_head<FormalsState> {
static void success0(FormalsState & f, LambdaState & s, State &) {
s.formals = std::make_unique<Formals>(std::move(f.formals));
}
};
struct AttrState : SubexprState {
using SubexprState::SubexprState;
std::vector<AttrName> attrs;
void pushAttr(auto && attr, PosIdx) { attrs.emplace_back(std::move(attr)); }
};
template<> struct BuildAST<grammar::attr::simple> {
static void apply(const auto & in, auto & s, State & ps) {
s.pushAttr(ps.symbols.create(in.string_view()), ps.at(in));
}
};
template<> struct BuildAST<grammar::attr::string> {
static void apply(const auto & in, auto & s, State & ps) {
auto e = s->popExpr();
if (auto str = dynamic_cast<ExprString *>(e.get()))
s.pushAttr(ps.symbols.create(str->s), ps.at(in));
else
s.pushAttr(std::move(e), ps.at(in));
}
};
template<> struct BuildAST<grammar::attr::expr> : BuildAST<grammar::attr::string> {};
struct BindingsState : SubexprState {
using SubexprState::SubexprState;
ExprAttrs attrs;
AttrPath path;
std::unique_ptr<Expr> value;
};
struct InheritState : SubexprState {
using SubexprState::SubexprState;
std::vector<std::pair<AttrName, PosIdx>> attrs;
std::unique_ptr<Expr> from;
PosIdx fromPos;
void pushAttr(auto && attr, PosIdx pos) { attrs.emplace_back(std::move(attr), pos); }
};
template<> struct BuildAST<grammar::inherit::from> {
static void apply(const auto & in, InheritState & s, State & ps) {
s.from = s->popExpr();
s.fromPos = ps.at(in);
}
};
template<> struct BuildAST<grammar::inherit> : change_head<InheritState> {
static void success0(InheritState & s, BindingsState & b, State & ps) {
auto & attrs = b.attrs.attrs;
// TODO this should not reuse generic attrpath rules.
for (auto & [i, iPos] : s.attrs) {
if (i.symbol)
continue;
if (auto str = dynamic_cast<ExprString *>(i.expr.get()))
i = AttrName(ps.symbols.create(str->s));
else {
throw ParseError({
.msg = HintFmt("dynamic attributes not allowed in inherit"),
.pos = ps.positions[iPos]
});
}
}
if (auto fromE = std::move(s.from)) {
if (!b.attrs.inheritFromExprs)
b.attrs.inheritFromExprs = std::make_unique<std::vector<std::unique_ptr<Expr>>>();
b.attrs.inheritFromExprs->push_back(std::move(fromE));
for (auto & [i, iPos] : s.attrs) {
if (attrs.find(i.symbol) != attrs.end())
ps.dupAttr(i.symbol, iPos, attrs[i.symbol].pos);
auto from = std::make_unique<ExprInheritFrom>(s.fromPos, b.attrs.inheritFromExprs->size() - 1);
attrs.emplace(
i.symbol,
ExprAttrs::AttrDef(
std::make_unique<ExprSelect>(iPos, std::move(from), i.symbol),
iPos,
ExprAttrs::AttrDef::Kind::InheritedFrom));
}
} else {
for (auto & [i, iPos] : s.attrs) {
if (attrs.find(i.symbol) != attrs.end())
ps.dupAttr(i.symbol, iPos, attrs[i.symbol].pos);
attrs.emplace(
i.symbol,
ExprAttrs::AttrDef(
std::make_unique<ExprVar>(iPos, i.symbol),
iPos,
ExprAttrs::AttrDef::Kind::Inherited));
}
}
}
};
template<> struct BuildAST<grammar::binding::path> : change_head<AttrState> {
static void success0(AttrState & a, BindingsState & s, State & ps) {
s.path = std::move(a.attrs);
}
};
template<> struct BuildAST<grammar::binding::value> {
static void apply0(BindingsState & s, State & ps) {
s.value = s->popExpr();
}
};
template<> struct BuildAST<grammar::binding> {
static void apply(const auto & in, BindingsState & s, State & ps) {
ps.addAttr(&s.attrs, std::move(s.path), std::move(s.value), ps.at(in));
}
};
template<> struct BuildAST<grammar::expr::id> {
static void apply(const auto & in, ExprState & s, State & ps) {
if (in.string_view() == "__curPos")
s.pushExpr<ExprPos>(ps.at(in));
else
s.pushExpr<ExprVar>(ps.at(in), ps.symbols.create(in.string_view()));
}
};
template<> struct BuildAST<grammar::expr::int_> {
static void apply(const auto & in, ExprState & s, State & ps) {
int64_t v;
if (std::from_chars(in.begin(), in.end(), v).ec != std::errc{}) {
throw ParseError({
.msg = HintFmt("invalid integer '%1%'", in.string_view()),
.pos = ps.positions[ps.at(in)],
});
}
s.pushExpr<ExprInt>(v);
}
};
template<> struct BuildAST<grammar::expr::float_> {
static void apply(const auto & in, ExprState & s, State & ps) {
// copy the input into a temporary string so we can call stod.
// can't use from_chars because libc++ (thus darwin) does not have it,
// and floats are not performance-sensitive anyway. if they were you'd
// be in much bigger trouble than this.
//
// we also get to do a locale-save dance because stod is locale-aware and
// something (a plugin?) may have called setlocale or uselocale.
static struct locale_hack {
locale_t posix;
locale_hack(): posix(newlocale(LC_ALL_MASK, "POSIX", 0))
{
if (posix == 0)
throw SysError("could not get POSIX locale");
}
} locale;
auto tmp = in.string();
double v = [&] {
auto oldLocale = uselocale(locale.posix);
Finally resetLocale([=] { uselocale(oldLocale); });
try {
return std::stod(tmp);
} catch (...) {
throw ParseError({
.msg = HintFmt("invalid float '%1%'", in.string_view()),
.pos = ps.positions[ps.at(in)],
});
}
}();
s.pushExpr<ExprFloat>(v);
}
};
struct StringState : SubexprState {
using SubexprState::SubexprState;
std::string currentLiteral;
PosIdx currentPos;
std::vector<std::pair<nix::PosIdx, std::unique_ptr<Expr>>> parts;
void append(PosIdx pos, std::string_view s)
{
if (currentLiteral.empty())
currentPos = pos;
currentLiteral += s;
}
// FIXME this truncates strings on NUL for compat with the old parser. ideally
// we should use the decomposition the grammar gives us instead of iterating over
// the entire string again.
static void unescapeStr(std::string & str)
{
char * s = str.data();
char * t = s;
char c;
while ((c = *s++)) {
if (c == '\\') {
c = *s++;
if (c == 'n') *t = '\n';
else if (c == 'r') *t = '\r';
else if (c == 't') *t = '\t';
else *t = c;
}
else if (c == '\r') {
/* Normalise CR and CR/LF into LF. */
*t = '\n';
if (*s == '\n') s++; /* cr/lf */
}
else *t = c;
t++;
}
str.resize(t - str.data());
}
void endLiteral()
{
if (!currentLiteral.empty()) {
unescapeStr(currentLiteral);
parts.emplace_back(currentPos, std::make_unique<ExprString>(std::move(currentLiteral)));
}
}
std::unique_ptr<Expr> finish()
{
if (parts.empty()) {
unescapeStr(currentLiteral);
return std::make_unique<ExprString>(std::move(currentLiteral));
} else {
endLiteral();
auto pos = parts[0].first;
return std::make_unique<ExprConcatStrings>(pos, true, std::move(parts));
}
}
};
template<typename... Content> struct BuildAST<grammar::string::literal<Content...>> {
static void apply(const auto & in, StringState & s, State & ps) {
s.append(ps.at(in), in.string_view());
}
};
template<> struct BuildAST<grammar::string::cr_lf> {
static void apply(const auto & in, StringState & s, State & ps) {
s.append(ps.at(in), in.string_view()); // FIXME compat with old parser
}
};
template<> struct BuildAST<grammar::string::interpolation> {
static void apply(const auto & in, StringState & s, State & ps) {
s.endLiteral();
s.parts.emplace_back(ps.at(in), s->popExpr());
}
};
template<> struct BuildAST<grammar::string::escape> {
static void apply(const auto & in, StringState & s, State & ps) {
s.append(ps.at(in), "\\"); // FIXME compat with old parser
s.append(ps.at(in), in.string_view());
}
};
template<> struct BuildAST<grammar::string> : change_head<StringState> {
static void success0(StringState & s, ExprState & e, State &) {
e.exprs.push_back(s.finish());
}
};
struct IndStringState : SubexprState {
using SubexprState::SubexprState;
std::vector<std::pair<PosIdx, std::variant<std::unique_ptr<Expr>, StringToken>>> parts;
};
template<bool Indented, typename... Content>
struct BuildAST<grammar::ind_string::literal<Indented, Content...>> {
static void apply(const auto & in, IndStringState & s, State & ps) {
s.parts.emplace_back(ps.at(in), StringToken{in.string_view(), Indented});
}
};
template<> struct BuildAST<grammar::ind_string::interpolation> {
static void apply(const auto & in, IndStringState & s, State & ps) {
s.parts.emplace_back(ps.at(in), s->popExpr());
}
};
template<> struct BuildAST<grammar::ind_string::escape> {
static void apply(const auto & in, IndStringState & s, State & ps) {
switch (*in.begin()) {
case 'n': s.parts.emplace_back(ps.at(in), StringToken{"\n"}); break;
case 'r': s.parts.emplace_back(ps.at(in), StringToken{"\r"}); break;
case 't': s.parts.emplace_back(ps.at(in), StringToken{"\t"}); break;
default: s.parts.emplace_back(ps.at(in), StringToken{in.string_view()}); break;
}
}
};
template<> struct BuildAST<grammar::ind_string> : change_head<IndStringState> {
static void success(const auto & in, IndStringState & s, ExprState & e, State & ps) {
e.exprs.emplace_back(ps.stripIndentation(ps.at(in), std::move(s.parts)));
}
};
template<typename... Content> struct BuildAST<grammar::path::literal<Content...>> {
static void apply(const auto & in, StringState & s, State & ps) {
s.append(ps.at(in), in.string_view());
s.endLiteral();
}
};
template<> struct BuildAST<grammar::path::interpolation> : BuildAST<grammar::string::interpolation> {};
template<> struct BuildAST<grammar::path::anchor> {
static void apply(const auto & in, StringState & s, State & ps) {
Path path(absPath(in.string(), ps.basePath.path.abs()));
/* add back in the trailing '/' to the first segment */
if (in.string_view().ends_with('/') && in.size() > 1)
path += "/";
s.parts.emplace_back(ps.at(in), new ExprPath(std::move(path)));
}
};
template<> struct BuildAST<grammar::path::home_anchor> {
static void apply(const auto & in, StringState & s, State & ps) {
if (evalSettings.pureEval)
throw Error("the path '%s' can not be resolved in pure mode", in.string_view());
Path path(getHome() + in.string_view().substr(1));
s.parts.emplace_back(ps.at(in), new ExprPath(std::move(path)));
}
};
template<> struct BuildAST<grammar::path::searched_path> {
static void apply(const auto & in, StringState & s, State & ps) {
std::vector<std::unique_ptr<Expr>> args{2};
args[0] = std::make_unique<ExprVar>(ps.s.nixPath);
args[1] = std::make_unique<ExprString>(in.string());
s.parts.emplace_back(
ps.at(in),
std::make_unique<ExprCall>(
ps.at(in),
std::make_unique<ExprVar>(ps.s.findFile),
std::move(args)));
}
};
template<> struct BuildAST<grammar::path> : change_head<StringState> {
template<typename E>
static void check_slash(PosIdx end, StringState & s, State & ps) {
auto e = dynamic_cast<E *>(s.parts.back().second.get());
if (!e || !e->s.ends_with('/'))
return;
if (s.parts.size() > 1 || e->s != "/")
throw ParseError({
.msg = HintFmt("path has a trailing slash"),
.pos = ps.positions[end],
});
}
static void success(const auto & in, StringState & s, ExprState & e, State & ps) {
s.endLiteral();
check_slash<ExprPath>(ps.atEnd(in), s, ps);
check_slash<ExprString>(ps.atEnd(in), s, ps);
if (s.parts.size() == 1) {
e.exprs.emplace_back(std::move(s.parts.back().second));
} else {
e.pushExpr<ExprConcatStrings>(ps.at(in), false, std::move(s.parts));
}
}
};
// strings and paths are handled fully by the grammar-level rule for now
template<> struct BuildAST<grammar::expr::string> : p::maybe_nothing {};
template<> struct BuildAST<grammar::expr::ind_string> : p::maybe_nothing {};
template<> struct BuildAST<grammar::expr::path> : p::maybe_nothing {};
template<> struct BuildAST<grammar::expr::uri> {
static void apply(const auto & in, ExprState & s, State & ps) {
static bool noURLLiterals = experimentalFeatureSettings.isEnabled(Xp::NoUrlLiterals);
if (noURLLiterals)
throw ParseError({
.msg = HintFmt("URL literals are disabled"),
.pos = ps.positions[ps.at(in)]
});
s.pushExpr<ExprString>(in.string());
}
};
template<> struct BuildAST<grammar::expr::ancient_let> : change_head<BindingsState> {
static void success(const auto & in, BindingsState & b, ExprState & s, State & ps) {
b.attrs.pos = ps.at(in);
b.attrs.recursive = true;
s.pushExpr<ExprSelect>(b.attrs.pos, std::make_unique<ExprAttrs>(std::move(b.attrs)), ps.s.body);
}
};
template<> struct BuildAST<grammar::expr::rec_set> : change_head<BindingsState> {
static void success(const auto & in, BindingsState & b, ExprState & s, State & ps) {
b.attrs.pos = ps.at(in);
b.attrs.recursive = true;
s.pushExpr<ExprAttrs>(std::move(b.attrs));
}
};
template<> struct BuildAST<grammar::expr::set> : change_head<BindingsState> {
static void success(const auto & in, BindingsState & b, ExprState & s, State & ps) {
b.attrs.pos = ps.at(in);
s.pushExpr<ExprAttrs>(std::move(b.attrs));
}
};
using ListState = std::vector<std::unique_ptr<Expr>>;
template<> struct BuildAST<grammar::expr::list> : change_head<ListState> {
static void success0(ListState & ls, ExprState & s, State &) {
auto e = std::make_unique<ExprList>();
e->elems = std::move(ls);
s.exprs.push_back(std::move(e));
}
};
template<> struct BuildAST<grammar::expr::list::entry> : change_head<ExprState> {
static void success0(ExprState & e, ListState & s, State & ps) {
s.emplace_back(e.finish(ps));
}
};
struct SelectState : SubexprState {
using SubexprState::SubexprState;
PosIdx pos;
ExprSelect * e = nullptr;
};
template<> struct BuildAST<grammar::expr::select::head> {
static void apply(const auto & in, SelectState & s, State & ps) {
s.pos = ps.at(in);
}
};
template<> struct BuildAST<grammar::expr::select::attr> : change_head<AttrState> {
static void success0(AttrState & a, SelectState & s, State &) {
s.e = &s->pushExpr<ExprSelect>(s.pos, s->popExpr(), std::move(a.attrs), nullptr);
}
};
template<> struct BuildAST<grammar::expr::select::attr_or> {
static void apply0(SelectState & s, State &) {
s.e->def = s->popExpr();
}
};
template<> struct BuildAST<grammar::expr::select::as_app_or> {
static void apply(const auto & in, SelectState & s, State & ps) {
std::vector<std::unique_ptr<Expr>> args(1);
args[0] = std::make_unique<ExprVar>(ps.at(in), ps.s.or_);
s->pushExpr<ExprCall>(s.pos, s->popExpr(), std::move(args));
}
};
template<> struct BuildAST<grammar::expr::select> : change_head<SelectState> {
static void success0(const auto &...) {}
};
struct AppState : SubexprState {
using SubexprState::SubexprState;
PosIdx pos;
ExprCall * e = nullptr;
};
template<> struct BuildAST<grammar::expr::app::select_or_fn> {
static void apply(const auto & in, AppState & s, State & ps) {
s.pos = ps.at(in);
}
};
template<> struct BuildAST<grammar::expr::app::first_arg> {
static void apply(auto & in, AppState & s, State & ps) {
auto arg = s->popExpr(), fn = s->popExpr();
if ((s.e = dynamic_cast<ExprCall *>(fn.get()))) {
// TODO remove.
// AST compat with old parser, semantics are the same.
// this can happen on occasions such as `<p> <p>` or `a or b or`,
// neither of which are super worth optimizing.
s.e->args.push_back(std::move(arg));
s->exprs.emplace_back(std::move(fn));
} else {
std::vector<std::unique_ptr<Expr>> args{1};
args[0] = std::move(arg);
s.e = &s->pushExpr<ExprCall>(s.pos, std::move(fn), std::move(args));
}
}
};
template<> struct BuildAST<grammar::expr::app::another_arg> {
static void apply0(AppState & s, State & ps) {
s.e->args.push_back(s->popExpr());
}
};
template<> struct BuildAST<grammar::expr::app> : change_head<AppState> {
static void success0(const auto &...) {}
};
template<typename Op> struct BuildAST<grammar::expr::operator_<Op>> {
static void apply(const auto & in, ExprState & s, State & ps) {
s.pushOp(ps.at(in), Op{}, ps);
}
};
template<> struct BuildAST<grammar::expr::operator_<grammar::op::has_attr>> : change_head<AttrState> {
static void success(const auto & in, AttrState & a, ExprState & s, State & ps) {
s.pushOp(ps.at(in), ExprState::has_attr{{}, std::move(a.attrs)}, ps);
}
};
template<> struct BuildAST<grammar::expr::lambda::arg> {
static void apply(const auto & in, LambdaState & s, State & ps) {
s.arg = ps.symbols.create(in.string_view());
}
};
template<> struct BuildAST<grammar::expr::lambda> : change_head<LambdaState> {
static void success(const auto & in, LambdaState & l, ExprState & s, State & ps) {
if (l.formals)
l.formals = ps.validateFormals(std::move(l.formals), ps.at(in), l.arg);
s.pushExpr<ExprLambda>(ps.at(in), l.arg, std::move(l.formals), l->popExpr());
}
};
template<> struct BuildAST<grammar::expr::assert_> {
static void apply(const auto & in, ExprState & s, State & ps) {
auto body = s.popExpr(), cond = s.popExpr();
s.pushExpr<ExprAssert>(ps.at(in), std::move(cond), std::move(body));
}
};
template<> struct BuildAST<grammar::expr::with> {
static void apply(const auto & in, ExprState & s, State & ps) {
auto body = s.popExpr(), scope = s.popExpr();
s.pushExpr<ExprWith>(ps.at(in), std::move(scope), std::move(body));
}
};
template<> struct BuildAST<grammar::expr::let> : change_head<BindingsState> {
static void success(const auto & in, BindingsState & b, ExprState & s, State & ps) {
if (!b.attrs.dynamicAttrs.empty())
throw ParseError({
.msg = HintFmt("dynamic attributes not allowed in let"),
.pos = ps.positions[ps.at(in)]
});
s.pushExpr<ExprLet>(std::make_unique<ExprAttrs>(std::move(b.attrs)), b->popExpr());
}
};
template<> struct BuildAST<grammar::expr::if_> {
static void apply(const auto & in, ExprState & s, State & ps) {
auto else_ = s.popExpr(), then = s.popExpr(), cond = s.popExpr();
s.pushExpr<ExprIf>(ps.at(in), std::move(cond), std::move(then), std::move(else_));
}
};
template<> struct BuildAST<grammar::expr> : change_head<ExprState> {
static void success0(ExprState & inner, ExprState & outer, State & ps) {
outer.exprs.push_back(inner.finish(ps));
}
};
}
}
namespace nix {
Expr * EvalState::parse(
char * text,
size_t length,
Pos::Origin origin,
const SourcePath & basePath,
std::shared_ptr<StaticEnv> & staticEnv)
{
parser::State s = {
symbols,
positions,
basePath,
positions.addOrigin(origin, length),
exprSymbols,
};
parser::ExprState x;
assert(length >= 2);
assert(text[length - 1] == 0);
assert(text[length - 2] == 0);
length -= 2;
p::string_input<p::tracking_mode::lazy> inp{std::string_view{text, length}, "input"};
try {
p::parse<parser::grammar::root, parser::BuildAST, parser::Control>(inp, x, s);
} catch (p::parse_error & e) {
auto pos = e.positions().back();
throw ParseError({
.msg = HintFmt("syntax error, %s", e.message()),
.pos = positions[s.positions.add(s.origin, pos.byte)]
});
}
auto result = x.finish(s);
result->bindVars(*this, staticEnv);
return result.release();
}
}


@ -3,64 +3,44 @@
#include "eval.hh"
namespace nix {
namespace nix::parser {
/**
* @note Storing a C-style `char *` and `size_t` allows us to avoid
* having to define the special members that using string_view here
* would implicitly delete.
*/
struct StringToken
{
const char * p;
size_t l;
std::string_view s;
bool hasIndentation;
operator std::string_view() const { return {p, l}; }
operator std::string_view() const { return s; }
};
struct ParserLocation
{
int first_line, first_column;
int last_line, last_column;
// backup to recover from yyless(0)
int stashed_first_line, stashed_first_column;
int stashed_last_line, stashed_last_column;
void stash() {
stashed_first_line = first_line;
stashed_first_column = first_column;
stashed_last_line = last_line;
stashed_last_column = last_column;
}
void unstash() {
first_line = stashed_first_line;
first_column = stashed_first_column;
last_line = stashed_last_line;
last_column = stashed_last_column;
}
};
struct ParserState
struct State
{
SymbolTable & symbols;
PosTable & positions;
Expr * result;
SourcePath basePath;
PosTable::Origin origin;
const Expr::AstSymbols & s;
void dupAttr(const AttrPath & attrPath, const PosIdx pos, const PosIdx prevPos);
void dupAttr(Symbol attr, const PosIdx pos, const PosIdx prevPos);
void addAttr(ExprAttrs * attrs, AttrPath && attrPath, Expr * e, const PosIdx pos);
Formals * validateFormals(Formals * formals, PosIdx pos = noPos, Symbol arg = {});
Expr * stripIndentation(const PosIdx pos,
std::vector<std::pair<PosIdx, std::variant<Expr *, StringToken>>> && es);
PosIdx at(const ParserLocation & loc);
void addAttr(ExprAttrs * attrs, AttrPath && attrPath, std::unique_ptr<Expr> e, const PosIdx pos);
std::unique_ptr<Formals> validateFormals(std::unique_ptr<Formals> formals, PosIdx pos = noPos, Symbol arg = {});
std::unique_ptr<Expr> stripIndentation(const PosIdx pos,
std::vector<std::pair<PosIdx, std::variant<std::unique_ptr<Expr>, StringToken>>> && es);
// lazy positioning means we don't get byte offsets directly; in.position() would work
// but also requires line and column (which is expensive)
PosIdx at(const auto & in)
{
return positions.add(origin, in.begin() - in.input().begin());
}
PosIdx atEnd(const auto & in)
{
return positions.add(origin, in.end() - in.input().begin());
}
};
inline void ParserState::dupAttr(const AttrPath & attrPath, const PosIdx pos, const PosIdx prevPos)
inline void State::dupAttr(const AttrPath & attrPath, const PosIdx pos, const PosIdx prevPos)
{
throw ParseError({
.msg = HintFmt("attribute '%1%' already defined at %2%",
@ -69,7 +49,7 @@ inline void ParserState::dupAttr(const AttrPath & attrPath, const PosIdx pos, co
});
}
inline void ParserState::dupAttr(Symbol attr, const PosIdx pos, const PosIdx prevPos)
inline void State::dupAttr(Symbol attr, const PosIdx pos, const PosIdx prevPos)
{
throw ParseError({
.msg = HintFmt("attribute '%1%' already defined at %2%", symbols[attr], positions[prevPos]),
@ -77,7 +57,7 @@ inline void ParserState::dupAttr(Symbol attr, const PosIdx pos, const PosIdx pre
});
}
inline void ParserState::addAttr(ExprAttrs * attrs, AttrPath && attrPath, Expr * e, const PosIdx pos)
inline void State::addAttr(ExprAttrs * attrs, AttrPath && attrPath, std::unique_ptr<Expr> e, const PosIdx pos)
{
AttrPath::iterator i;
// All attrpaths have at least one attr
@ -89,20 +69,20 @@ inline void ParserState::addAttr(ExprAttrs * attrs, AttrPath && attrPath, Expr *
ExprAttrs::AttrDefs::iterator j = attrs->attrs.find(i->symbol);
if (j != attrs->attrs.end()) {
if (j->second.kind != ExprAttrs::AttrDef::Kind::Inherited) {
ExprAttrs * attrs2 = dynamic_cast<ExprAttrs *>(j->second.e);
ExprAttrs * attrs2 = dynamic_cast<ExprAttrs *>(j->second.e.get());
if (!attrs2) dupAttr(attrPath, pos, j->second.pos);
attrs = attrs2;
} else
dupAttr(attrPath, pos, j->second.pos);
} else {
ExprAttrs * nested = new ExprAttrs;
attrs->attrs[i->symbol] = ExprAttrs::AttrDef(nested, pos);
attrs = nested;
auto next = attrs->attrs.emplace(std::piecewise_construct,
std::tuple(i->symbol),
std::tuple(std::make_unique<ExprAttrs>(), pos));
attrs = static_cast<ExprAttrs *>(next.first->second.e.get());
}
} else {
ExprAttrs *nested = new ExprAttrs;
attrs->dynamicAttrs.push_back(ExprAttrs::DynamicAttrDef(i->expr, nested, pos));
attrs = nested;
auto & next = attrs->dynamicAttrs.emplace_back(std::move(i->expr), std::make_unique<ExprAttrs>(), pos);
attrs = static_cast<ExprAttrs *>(next.valueExpr.get());
}
}
// Expr insertion.
@ -114,41 +94,41 @@ inline void ParserState::addAttr(ExprAttrs * attrs, AttrPath && attrPath, Expr *
// e and the expr pointed by the attr path are two attribute sets,
// we want to merge them.
// Otherwise, throw an error.
auto ae = dynamic_cast<ExprAttrs *>(e);
auto jAttrs = dynamic_cast<ExprAttrs *>(j->second.e);
auto ae = dynamic_cast<ExprAttrs *>(e.get());
auto jAttrs = dynamic_cast<ExprAttrs *>(j->second.e.get());
if (jAttrs && ae) {
if (ae->inheritFromExprs && !jAttrs->inheritFromExprs)
jAttrs->inheritFromExprs = std::make_unique<std::vector<Expr *>>();
jAttrs->inheritFromExprs = std::make_unique<std::vector<std::unique_ptr<Expr>>>();
for (auto & ad : ae->attrs) {
auto j2 = jAttrs->attrs.find(ad.first);
if (j2 != jAttrs->attrs.end()) // Attr already defined in jAttrs, error.
dupAttr(ad.first, j2->second.pos, ad.second.pos);
jAttrs->attrs.emplace(ad.first, ad.second);
return dupAttr(ad.first, j2->second.pos, ad.second.pos);
if (ad.second.kind == ExprAttrs::AttrDef::Kind::InheritedFrom) {
auto & sel = dynamic_cast<ExprSelect &>(*ad.second.e);
auto & from = dynamic_cast<ExprInheritFrom &>(*sel.e);
from.displ += jAttrs->inheritFromExprs->size();
}
jAttrs->attrs.emplace(ad.first, std::move(ad.second));
}
jAttrs->dynamicAttrs.insert(jAttrs->dynamicAttrs.end(), ae->dynamicAttrs.begin(), ae->dynamicAttrs.end());
if (ae->inheritFromExprs) {
jAttrs->inheritFromExprs->insert(jAttrs->inheritFromExprs->end(),
ae->inheritFromExprs->begin(), ae->inheritFromExprs->end());
}
std::ranges::move(ae->dynamicAttrs, std::back_inserter(jAttrs->dynamicAttrs));
if (ae->inheritFromExprs)
std::ranges::move(*ae->inheritFromExprs, std::back_inserter(*jAttrs->inheritFromExprs));
} else {
dupAttr(attrPath, pos, j->second.pos);
}
} else {
// This attr path is not defined. Let's create it.
attrs->attrs.emplace(i->symbol, ExprAttrs::AttrDef(e, pos));
e->setName(i->symbol);
attrs->attrs.emplace(std::piecewise_construct,
std::tuple(i->symbol),
std::tuple(std::move(e), pos));
}
} else {
attrs->dynamicAttrs.push_back(ExprAttrs::DynamicAttrDef(i->expr, e, pos));
attrs->dynamicAttrs.emplace_back(std::move(i->expr), std::move(e), pos);
}
}
inline Formals * ParserState::validateFormals(Formals * formals, PosIdx pos, Symbol arg)
inline std::unique_ptr<Formals> State::validateFormals(std::unique_ptr<Formals> formals, PosIdx pos, Symbol arg)
{
std::sort(formals->formals.begin(), formals->formals.end(),
[] (const auto & a, const auto & b) {
@ -177,10 +157,10 @@ inline Formals * ParserState::validateFormals(Formals * formals, PosIdx pos, Sym
return formals;
}
inline Expr * ParserState::stripIndentation(const PosIdx pos,
std::vector<std::pair<PosIdx, std::variant<Expr *, StringToken>>> && es)
inline std::unique_ptr<Expr> State::stripIndentation(const PosIdx pos,
std::vector<std::pair<PosIdx, std::variant<std::unique_ptr<Expr>, StringToken>>> && es)
{
if (es.empty()) return new ExprString("");
if (es.empty()) return std::make_unique<ExprString>("");
/* Figure out the minimum indentation. Note that by design
whitespace-only final lines are not taken into account. (So
@ -198,11 +178,11 @@ inline Expr * ParserState::stripIndentation(const PosIdx pos,
}
continue;
}
for (size_t j = 0; j < str->l; ++j) {
for (size_t j = 0; j < str->s.size(); ++j) {
if (atStartOfLine) {
if (str->p[j] == ' ')
if (str->s[j] == ' ')
curIndent++;
else if (str->p[j] == '\n') {
else if (str->s[j] == '\n') {
/* Empty line, doesn't influence minimum
indentation. */
curIndent = 0;
@ -210,7 +190,7 @@ inline Expr * ParserState::stripIndentation(const PosIdx pos,
atStartOfLine = false;
if (curIndent < minIndent) minIndent = curIndent;
}
} else if (str->p[j] == '\n') {
} else if (str->s[j] == '\n') {
atStartOfLine = true;
curIndent = 0;
}
@ -218,35 +198,35 @@ inline Expr * ParserState::stripIndentation(const PosIdx pos,
}
/* Strip spaces from each line. */
auto * es2 = new std::vector<std::pair<PosIdx, Expr *>>;
std::vector<std::pair<PosIdx, std::unique_ptr<Expr>>> es2;
atStartOfLine = true;
size_t curDropped = 0;
size_t n = es.size();
auto i = es.begin();
const auto trimExpr = [&] (Expr * e) {
const auto trimExpr = [&] (std::unique_ptr<Expr> & e) {
atStartOfLine = false;
curDropped = 0;
es2->emplace_back(i->first, e);
es2.emplace_back(i->first, std::move(e));
};
const auto trimString = [&] (const StringToken & t) {
std::string s2;
for (size_t j = 0; j < t.l; ++j) {
for (size_t j = 0; j < t.s.size(); ++j) {
if (atStartOfLine) {
if (t.p[j] == ' ') {
if (t.s[j] == ' ') {
if (curDropped++ >= minIndent)
s2 += t.p[j];
s2 += t.s[j];
}
else if (t.p[j] == '\n') {
else if (t.s[j] == '\n') {
curDropped = 0;
s2 += t.p[j];
s2 += t.s[j];
} else {
atStartOfLine = false;
curDropped = 0;
s2 += t.p[j];
s2 += t.s[j];
}
} else {
s2 += t.p[j];
if (t.p[j] == '\n') atStartOfLine = true;
s2 += t.s[j];
if (t.s[j] == '\n') atStartOfLine = true;
}
}
@ -258,24 +238,17 @@ inline Expr * ParserState::stripIndentation(const PosIdx pos,
s2 = std::string(s2, 0, p + 1);
}
es2->emplace_back(i->first, new ExprString(std::move(s2)));
es2.emplace_back(i->first, new ExprString(std::move(s2)));
};
for (; i != es.end(); ++i, --n) {
std::visit(overloaded { trimExpr, trimString }, i->second);
}
/* If this is a single string, then don't do a concatenation. */
if (es2->size() == 1 && dynamic_cast<ExprString *>((*es2)[0].second)) {
auto *const result = (*es2)[0].second;
delete es2;
return result;
if (es2.size() == 1 && dynamic_cast<ExprString *>(es2[0].second.get())) {
return std::move(es2[0].second);
}
return new ExprConcatStrings(pos, true, es2);
}
inline PosIdx ParserState::at(const ParserLocation & loc)
{
return positions.add(origin, loc.first_line, loc.first_column);
return std::make_unique<ExprConcatStrings>(pos, true, std::move(es2));
}
}


@ -6,6 +6,7 @@ namespace nix {
class PosIdx
{
friend struct LazyPosAcessors;
friend class PosTable;
private:


@ -7,6 +7,7 @@
#include "chunked-vector.hh"
#include "pos-idx.hh"
#include "position.hh"
#include "sync.hh"
namespace nix {
@ -17,66 +18,69 @@ public:
{
friend PosTable;
private:
// must always be invalid by default, add() replaces this with the actual value.
// subsequent add() calls use this index as a token to quickly check whether the
// current origins.back() can be reused or not.
mutable uint32_t idx = std::numeric_limits<uint32_t>::max();
uint32_t offset;
// Used for searching in PosTable::[].
explicit Origin(uint32_t idx)
: idx(idx)
, origin{std::monostate()}
{
}
Origin(Pos::Origin origin, uint32_t offset, size_t size):
offset(offset), origin(origin), size(size)
{}
public:
const Pos::Origin origin;
const size_t size;
Origin(Pos::Origin origin)
: origin(origin)
uint32_t offsetOf(PosIdx p) const
{
return p.id - 1 - offset;
}
};
struct Offset
{
uint32_t line, column;
};
private:
std::vector<Origin> origins;
ChunkedVector<Offset, 8192> offsets;
using Lines = std::vector<uint32_t>;
public:
PosTable()
: offsets(1024)
{
origins.reserve(1024);
}
std::map<uint32_t, Origin> origins;
mutable Sync<std::map<uint32_t, Lines>> lines;
PosIdx add(const Origin & origin, uint32_t line, uint32_t column)
const Origin * resolve(PosIdx p) const
{
const auto idx = offsets.add({line, column}).second;
if (origins.empty() || origins.back().idx != origin.idx) {
origin.idx = idx;
origins.push_back(origin);
}
return PosIdx(idx + 1);
}
if (p.id == 0)
return nullptr;
Pos operator[](PosIdx p) const
{
if (p.id == 0 || p.id > offsets.size())
return {};
const auto idx = p.id - 1;
/* we want the last key <= idx, so we'll take prev(first key > idx).
this is guaranteed to never rewind past origins.begin() because the first
key is always 0. */
const auto pastOrigin = std::upper_bound(
origins.begin(), origins.end(), Origin(idx), [](const auto & a, const auto & b) { return a.idx < b.idx; });
const auto origin = *std::prev(pastOrigin);
const auto offset = offsets[idx];
return {offset.line, offset.column, origin.origin};
const auto pastOrigin = origins.upper_bound(idx);
return &std::prev(pastOrigin)->second;
}
public:
Origin addOrigin(Pos::Origin origin, size_t size)
{
uint32_t offset = 0;
if (auto it = origins.rbegin(); it != origins.rend())
offset = it->first + it->second.size;
// +1 because all PosIdx are offset by 1 to begin with, and
// another +1 to ensure that all origins can point to EOF, eg
// on (invalid) empty inputs.
if (2 + offset + size < offset)
return Origin{origin, offset, 0};
return origins.emplace(offset, Origin{origin, offset, size}).first->second;
}
PosIdx add(const Origin & origin, size_t offset)
{
if (offset > origin.size)
return PosIdx();
return PosIdx(1 + origin.offset + offset);
}
Pos operator[](PosIdx p) const;
Pos::Origin originOf(PosIdx p) const
{
if (auto o = resolve(p))
return o->origin;
return std::monostate{};
}
};
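
The index scheme in the PosTable hunk above can be illustrated with a small standalone sketch. It assumes a simplified table in which each origin reserves a contiguous range of indices, index 0 means "no position", and the owning origin is found with `upper_bound` on a map keyed by each origin's starting offset. `MiniPosTable` and its members are hypothetical names, overflow handling is omitted, and reserving one extra slot per origin for end-of-file is an assumption of this sketch rather than something the diff is known to do.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for PosTable; all names are illustrative.
struct MiniPosTable {
    struct Origin {
        std::string name; // stand-in for Pos::Origin
        uint32_t offset;  // first index reserved for this origin
        uint32_t size;    // length of the input in bytes
    };

    // Keyed by Origin::offset, so upper_bound can find the owning origin.
    std::map<uint32_t, Origin> origins;

    // Assumption of this sketch: reserve size + 1 indices per origin so that
    // the end-of-file position never overlaps the next origin.
    const Origin & addOrigin(std::string name, uint32_t size) {
        uint32_t offset = 0;
        if (!origins.empty()) {
            const Origin & last = origins.rbegin()->second;
            offset = last.offset + last.size + 1;
        }
        return origins.emplace(offset, Origin{std::move(name), offset, size}).first->second;
    }

    // Index 0 is reserved for "no position", hence the leading 1.
    uint32_t add(const Origin & origin, uint32_t byteOffset) const {
        if (byteOffset > origin.size)
            return 0;
        return 1 + origin.offset + byteOffset;
    }

    // Find the origin with the last key <= idx, i.e. prev(first key > idx).
    const Origin * resolve(uint32_t id) const {
        if (id == 0 || origins.empty())
            return nullptr;
        auto past = origins.upper_bound(id - 1);
        assert(past != origins.begin()); // the first key is always 0
        return &std::prev(past)->second;
    }

    // Recover the byte offset within the origin from a position index.
    uint32_t offsetOf(const Origin & origin, uint32_t id) const {
        return id - 1 - origin.offset;
    }
};
```

With this layout a position is a single integer, and no per-position line or column data has to be stored at parse time.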


@ -243,9 +243,9 @@ static void import(EvalState & state, const PosIdx pos, Value & vPath, Value * v
// args[0]->attrs is already sorted.
printTalkative("evaluating file '%1%'", path);
Expr * e = state.parseExprFromFile(resolveExprPath(path), staticEnv);
Expr & e = state.parseExprFromFile(resolveExprPath(path), staticEnv);
e->eval(state, *env, v);
e.eval(state, *env, v);
}
}
}
@ -388,13 +388,13 @@ void prim_exec(EvalState & state, const PosIdx pos, Value * * args, Value & v)
auto output = runProgram(program, true, commandArgs);
Expr * parsed;
try {
parsed = state.parseExprFromString(std::move(output), state.rootPath(CanonPath::root));
parsed = &state.parseExprFromString(std::move(output), state.rootPath(CanonPath::root));
} catch (Error & e) {
e.addTrace(state.positions[pos], "while parsing the output from '%1%'", program);
throw;
}
try {
state.eval(parsed, v);
state.eval(*parsed, v);
} catch (Error & e) {
e.addTrace(state.positions[pos], "while evaluating the output from '%1%'", program);
throw;
@ -2492,6 +2492,54 @@ static RegisterPrimOp primop_unsafeGetAttrPos(PrimOp {
.fun = prim_unsafeGetAttrPos,
});
// access to exact position information (ie, line and column numbers) is deferred
// due to the cost associated with calculating that information and how rarely
// it is used in practice. this is achieved by creating thunks to otherwise
// inaccessible primops that are not exposed as __op or under builtins to turn
// the internal PosIdx back into a line and column number, respectively. exposing
// these primops in any way would at best be not useful and at worst create wildly
// nondeterministic eval results depending on parse order of files.
//
// in a simpler world this would instead be implemented as another kind of thunk,
// but each type of thunk has an associated runtime cost in the current evaluator.
// as with black holes this cost is too high to justify another thunk type to check
// for in the very hot path that is forceValue.
static struct LazyPosAcessors {
PrimOp primop_lineOfPos{
.arity = 1,
.fun = [] (EvalState & state, PosIdx pos, Value * * args, Value & v) {
v.mkInt(state.positions[PosIdx(args[0]->integer)].line);
}
};
PrimOp primop_columnOfPos{
.arity = 1,
.fun = [] (EvalState & state, PosIdx pos, Value * * args, Value & v) {
v.mkInt(state.positions[PosIdx(args[0]->integer)].column);
}
};
Value lineOfPos, columnOfPos;
LazyPosAcessors()
{
lineOfPos.mkPrimOp(&primop_lineOfPos);
columnOfPos.mkPrimOp(&primop_columnOfPos);
}
void operator()(EvalState & state, const PosIdx pos, Value & line, Value & column)
{
Value * posV = state.allocValue();
posV->mkInt(pos.id);
line.mkApp(&lineOfPos, posV);
column.mkApp(&columnOfPos, posV);
}
} makeLazyPosAccessors;
void makePositionThunks(EvalState & state, const PosIdx pos, Value & line, Value & column)
{
makeLazyPosAccessors(state, pos, line, column);
}
/* Dynamic version of the `?' operator. */
static void prim_hasAttr(EvalState & state, const PosIdx pos, Value * * args, Value & v)
{
@ -2765,7 +2813,7 @@ static void prim_functionArgs(EvalState & state, const PosIdx pos, Value * * arg
auto attrs = state.buildBindings(args[0]->lambda.fun->formals->formals.size());
for (auto & i : args[0]->lambda.fun->formals->formals)
// !!! should optimise booleans (allocate only once)
attrs.alloc(i.name, i.pos).mkBool(i.def);
attrs.alloc(i.name, i.pos).mkBool(i.def.get());
v.mkAttrs(attrs);
}
@ -4437,7 +4485,7 @@ void EvalState::createBaseEnv()
// the parser needs two NUL bytes as terminators; one of them
// is implied by being a C string.
"\0";
eval(parse(code, sizeof(code), derivationInternal, {CanonPath::root}, staticBaseEnv), *vDerivation);
eval(*parse(code, sizeof(code), derivationInternal, {CanonPath::root}, staticBaseEnv), *vDerivation);
}
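
The comment on `LazyPosAcessors` in this file explains that line and column numbers are computed only on demand because they are expensive and rarely needed. The general idea of deferring that work can be sketched outside the evaluator. This is not how the diff implements it (the diff uses thunks to hidden primops). `LazyLineIndex` is a hypothetical helper that keeps only byte offsets and builds a table of line starts the first time a line or column is requested. It assumes `\n`, `\r\n`, and a lone `\r` all end a line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the diff.
class LazyLineIndex {
    std::string_view source;
    // Byte offset at which each line starts; empty until first needed.
    mutable std::optional<std::vector<uint32_t>> lineStarts;

    void build() const {
        std::vector<uint32_t> starts{0};
        for (std::size_t i = 0; i < source.size(); ++i) {
            if (source[i] == '\r') {
                // treat \r\n as a single line ending
                if (i + 1 < source.size() && source[i + 1] == '\n')
                    ++i;
                starts.push_back(static_cast<uint32_t>(i + 1));
            } else if (source[i] == '\n') {
                starts.push_back(static_cast<uint32_t>(i + 1));
            }
        }
        lineStarts = std::move(starts);
    }

public:
    explicit LazyLineIndex(std::string_view source) : source(source) {}

    // True once the line table has been computed.
    bool built() const { return lineStarts.has_value(); }

    // Returns a 1-based (line, column) pair for a byte offset into source.
    std::pair<uint32_t, uint32_t> lineAndColumn(uint32_t byteOffset) const {
        assert(byteOffset <= source.size());
        if (!lineStarts)
            build();
        // last line start <= byteOffset, i.e. prev(first start > byteOffset)
        auto past = std::upper_bound(lineStarts->begin(), lineStarts->end(), byteOffset);
        auto start = std::prev(past);
        auto line = static_cast<uint32_t>(start - lineStarts->begin()) + 1;
        return {line, byteOffset - *start + 1};
    }
};
```

Inputs that never produce an error or a position query never pay for the scan.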


@ -51,4 +51,6 @@ void prim_importNative(EvalState & state, const PosIdx pos, Value * * args, Valu
*/
void prim_exec(EvalState & state, const PosIdx pos, Value * * args, Value & v);
void makePositionThunks(EvalState & state, const PosIdx pos, Value & line, Value & column);
}


@ -144,7 +144,7 @@ static void printValueAsXML(EvalState & state, bool strict, bool location,
if (v.lambda.fun->arg) attrs["name"] = state.symbols[v.lambda.fun->arg];
if (v.lambda.fun->formals->ellipsis) attrs["ellipsis"] = "1";
XMLOpenElement _(doc, "attrspat", attrs);
for (auto & i : v.lambda.fun->formals->lexicographicOrder(state.symbols))
for (const Formal & i : v.lambda.fun->formals->lexicographicOrder(state.symbols))
doc.writeEmptyElement("attr", singletonAttrs("name", state.symbols[i.name]));
} else
doc.writeEmptyElement("varpat", singletonAttrs("name", state.symbols[v.lambda.fun->arg]));


@ -323,11 +323,11 @@ public:
}
}
inline void mkThunk(Env * e, Expr * ex)
inline void mkThunk(Env * e, Expr & ex)
{
internalType = tThunk;
thunk.env = e;
thunk.expr = ex;
thunk.expr = &ex;
}
inline void mkApp(Value * l, Value * r)


@ -29,32 +29,17 @@ std::optional<LinesOfCode> Pos::getCodeLines() const
return std::nullopt;
if (auto source = getSource()) {
std::istringstream iss(*source);
// count the newlines.
int count = 0;
std::string curLine;
int pl = line - 1;
LinesIterator lines(*source), end;
LinesOfCode loc;
do {
std::getline(iss, curLine);
++count;
if (count < pl)
;
else if (count == pl) {
loc.prevLineOfCode = curLine;
} else if (count == pl + 1) {
loc.errLineOfCode = curLine;
} else if (count == pl + 2) {
loc.nextLineOfCode = curLine;
break;
}
if (!iss.good())
break;
} while (true);
if (line > 1)
std::advance(lines, line - 2);
if (lines != end && line > 1)
loc.prevLineOfCode = *lines++;
if (lines != end)
loc.errLineOfCode = *lines++;
if (lines != end)
loc.nextLineOfCode = *lines++;
return loc;
}
@ -109,4 +94,26 @@ std::ostream & operator<<(std::ostream & str, const Pos & pos)
return str;
}
void Pos::LinesIterator::bump(bool atFirst)
{
if (!atFirst) {
pastEnd = input.empty();
if (!input.empty() && input[0] == '\r')
input.remove_prefix(1);
if (!input.empty() && input[0] == '\n')
input.remove_prefix(1);
}
// nix line endings are not only \n as eg std::getline assumes, but also
// \r\n **and \r alone**. not treating them all the same causes error
// reports to not match with line numbers as the parser expects them.
auto eol = input.find_first_of("\r\n");
if (eol > input.size())
eol = input.size();
curLine = input.substr(0, eol);
input.remove_prefix(eol);
}
}
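
The line-ending rule described in `LinesIterator::bump` can be shown in isolation. The following is a minimal sketch, assuming a hypothetical free function `splitLines` that materializes all lines at once instead of iterating lazily. It treats `\n`, `\r\n`, and a lone `\r` as one line ending each, and like the iterator above it yields a final empty line when the input ends with a line ending.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical helper, not part of the diff: split input into lines,
// accepting \n, \r\n and a lone \r as line endings.
std::vector<std::string_view> splitLines(std::string_view input) {
    std::vector<std::string_view> lines;
    if (input.empty())
        return lines;
    while (true) {
        auto eol = input.find_first_of("\r\n");
        if (eol == std::string_view::npos) {
            // last line, no terminator
            lines.push_back(input);
            break;
        }
        lines.push_back(input.substr(0, eol));
        input.remove_prefix(eol);
        // consume exactly one line ending: \r\n, \r, or \n
        if (input[0] == '\r')
            input.remove_prefix(1);
        if (!input.empty() && input[0] == '\n')
            input.remove_prefix(1);
    }
    return lines;
}
```

For example, `splitLines("# foo\ninvalid\r\n# bar\r# baz")` yields four lines, which is the behavior the `eval-fail-eol-*` fixtures in this change exercise with different terminators.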


@ -67,6 +67,48 @@ struct Pos
bool operator==(const Pos & rhs) const = default;
bool operator!=(const Pos & rhs) const = default;
bool operator<(const Pos & rhs) const;
struct LinesIterator {
using difference_type = size_t;
using value_type = std::string_view;
using reference = const std::string_view &;
using pointer = const std::string_view *;
using iterator_category = std::input_iterator_tag;
LinesIterator(): pastEnd(true) {}
explicit LinesIterator(std::string_view input): input(input), pastEnd(input.empty()) {
if (!pastEnd)
bump(true);
}
LinesIterator & operator++() {
bump(false);
return *this;
}
LinesIterator operator++(int) {
auto result = *this;
++*this;
return result;
}
reference operator*() const { return curLine; }
pointer operator->() const { return &curLine; }
bool operator!=(const LinesIterator & other) const {
return !(*this == other);
}
bool operator==(const LinesIterator & other) const {
return (pastEnd && other.pastEnd)
|| (std::forward_as_tuple(input.size(), input.data())
== std::forward_as_tuple(other.input.size(), other.input.data()));
}
private:
std::string_view input, curLine;
bool pastEnd = false;
void bump(bool atFirst);
};
};
std::ostream & operator<<(std::ostream & str, const Pos & pos);


@ -291,7 +291,7 @@ static void main_nix_build(int argc, char * * argv)
DrvInfos drvs;
/* Parse the expressions. */
std::vector<Expr *> exprs;
std::vector<std::reference_wrapper<Expr>> exprs;
if (readStdin)
exprs = {state->parseStdin()};
@ -394,7 +394,7 @@ static void main_nix_build(int argc, char * * argv)
if (!shell) {
try {
auto expr = state->parseExprFromString(
auto & expr = state->parseExprFromString(
"(import <nixpkgs> {}).bashInteractive",
state->rootPath(CanonPath::fromCwd()));


@ -413,7 +413,7 @@ static void queryInstSources(EvalState & state,
loadSourceExpr(state, *instSource.nixExprPath, vArg);
for (auto & i : args) {
Expr * eFun = state.parseExprFromString(i, state.rootPath(CanonPath::fromCwd()));
Expr & eFun = state.parseExprFromString(i, state.rootPath(CanonPath::fromCwd()));
Value vFun, vTmp;
state.eval(eFun, vFun);
vTmp.mkApp(&vFun, &vArg);


@ -28,10 +28,10 @@ enum OutputKind { okPlain, okXML, okJSON };
void processExpr(EvalState & state, const Strings & attrPaths,
bool parseOnly, bool strict, Bindings & autoArgs,
bool evalOnly, OutputKind output, bool location, Expr * e)
bool evalOnly, OutputKind output, bool location, Expr & e)
{
if (parseOnly) {
e->show(state.symbols, std::cout);
e.show(state.symbols, std::cout);
std::cout << "\n";
return;
}
@ -176,14 +176,14 @@ static int main_nix_instantiate(int argc, char * * argv)
}
if (readStdin) {
Expr * e = state->parseStdin();
Expr & e = state->parseStdin();
processExpr(*state, attrPaths, parseOnly, strict, autoArgs,
evalOnly, outputKind, xmlOutputSourceLocation, e);
} else if (files.empty() && !fromArgs)
files.push_back("./default.nix");
for (auto & i : files) {
Expr * e = fromArgs
Expr & e = fromArgs
? state->parseExprFromString(i, state->rootPath(CanonPath::fromCwd()))
: state->parseExprFromFile(resolveExprPath(state->checkSourcePath(lookupFileArg(*state, i))));
processExpr(*state, attrPaths, parseOnly, strict, autoArgs,


@ -405,7 +405,7 @@ struct CmdFlakeCheck : FlakeCommand
if (v.lambda.fun->hasFormals()
|| !argHasName(v.lambda.fun->arg, "final"))
throw Error("overlay does not take an argument named 'final'");
auto body = dynamic_cast<ExprLambda *>(v.lambda.fun->body);
auto body = dynamic_cast<ExprLambda *>(v.lambda.fun->body.get());
if (!body
|| body->hasFormals()
|| !argHasName(body->arg, "prev"))


@ -229,7 +229,7 @@ static void showHelp(std::vector<std::string> subcommand, NixArgs & toplevel)
auto vUtils = state.allocValue();
state.cacheFile(
CanonPath("/utils.nix"), CanonPath("/utils.nix"),
state.parseExprFromString(
&state.parseExprFromString(
#include "utils.nix.gen.hh"
, CanonPath::root),
*vUtils);


@ -0,0 +1,6 @@
error: undefined variable 'invalid'
at /pwd/lang/eval-fail-eol-1.nix:2:1:
1| # foo
2| invalid
| ^
3| # bar


@ -0,0 +1,3 @@
# foo
invalid
# bar


@ -0,0 +1,6 @@
error: undefined variable 'invalid'
at /pwd/lang/eval-fail-eol-2.nix:2:1:
1| # foo
2| invalid
| ^
3| # bar


@ -0,0 +1,2 @@
# foo invalid
# bar


@ -0,0 +1,6 @@
error: undefined variable 'invalid'
at /pwd/lang/eval-fail-eol-3.nix:2:1:
1| # foo
2| invalid
| ^
3| # bar


@ -0,0 +1,3 @@
# foo
invalid
# bar


@ -0,0 +1 @@
[ { column = 17; file = "/pwd/lang/eval-okay-inherit-attr-pos.nix"; line = 4; } { column = 19; file = "/pwd/lang/eval-okay-inherit-attr-pos.nix"; line = 4; } { column = 21; file = "/pwd/lang/eval-okay-inherit-attr-pos.nix"; line = 5; } { column = 23; file = "/pwd/lang/eval-okay-inherit-attr-pos.nix"; line = 5; } ]


@ -0,0 +1,12 @@
let
d = 0;
x = 1;
y = { inherit d x; };
z = { inherit (y) d x; };
in
[
(builtins.unsafeGetAttrPos "d" y)
(builtins.unsafeGetAttrPos "x" y)
(builtins.unsafeGetAttrPos "d" z)
(builtins.unsafeGetAttrPos "x" z)
]


@ -0,0 +1 @@
{ column = 4; file = "/pwd/lang/eval-okay-unsafeGetAttrPos.imported-nix"; line = 6; }


@ -0,0 +1,4 @@
(
{ y = "x"; })

@@ -0,0 +1,4 @@
let
pos = builtins.unsafeGetAttrPos "y" (import ./eval-okay-unsafeGetAttrPos.imported-nix);
in
pos

@@ -3,3 +3,4 @@ error: attribute 'x' already defined at «stdin»:1:3
2| y = 456;
3| x = 789;
| ^
4| }

@@ -1,5 +1,6 @@
error: attribute 'x' already defined at «stdin»:9:5
at «stdin»:10:17:
at «stdin»:10:18:
9| x = 789;
10| inherit (as) x;
| ^
11| };

@@ -1,5 +1,6 @@
error: attribute 'x' already defined at «stdin»:9:5
at «stdin»:10:17:
at «stdin»:10:18:
9| x = 789;
10| inherit (as) x;
| ^
11| };

@@ -3,3 +3,4 @@ error: attribute 'services.ssh.port' already defined at «stdin»:2:3
2| services.ssh.port = 22;
3| services.ssh.port = 23;
| ^
4| }

@@ -1,5 +1,6 @@
error: attribute 'x' already defined at «stdin»:6:12
at «stdin»:7:12:
error: attribute 'x' already defined at «stdin»:6:13
at «stdin»:7:13:
6| inherit x;
7| inherit x;
| ^
8| };

@@ -1,5 +1,5 @@
error: syntax error, unexpected end of file, expecting '"'
at «stdin»:3:5:
at «stdin»:3:6:
2| # Note that this file must not end with a newline.
3| a 1"$
| ^

@@ -0,0 +1,5 @@
error: syntax error, unexpected end of file, expecting expression
at «stdin»:3:1:
2| # no content
3|
| ^

@@ -0,0 +1,2 @@
(
# no content

@@ -1,6 +1,6 @@
error: undefined variable 'gcc'
at «stdin»:8:12:
7|
at «stdin»:9:13:
8| body = ({
| ^
9| inherit gcc;
| ^
10| }).gcc;

@@ -1,5 +1,6 @@
error: syntax error, unexpected ':', expecting '}'
error: syntax error, expecting '}'
at «stdin»:3:13:
2|
3| f = {x, y :
3| f = {x, y : ["baz" "bar" z "bat"]}: x + y;
| ^
4|

@@ -1,4 +1,5 @@
error: syntax error, unexpected invalid token, expecting end of file
error: syntax error, expecting end of file
at «stdin»:1:5:
1| 123 Ã
1| 123 é 4
| ^
2|

@@ -1 +1 @@
({ fetchurl, localServer ? false, httpServer ? false, sslSupport ? false, pythonBindings ? false, javaSwigBindings ? false, javahlBindings ? false, stdenv, openssl ? null, httpd ? null, db4 ? null, expat, swig ? null, j2sdk ? null }: assert (expat != null); assert (localServer -> (db4 != null)); assert (httpServer -> ((httpd != null) && ((httpd).expat == expat))); assert (sslSupport -> ((openssl != null) && (httpServer -> ((httpd).openssl == openssl)))); assert (pythonBindings -> ((swig != null) && (swig).pythonSupport)); assert (javaSwigBindings -> ((swig != null) && (swig).javaSupport)); assert (javahlBindings -> (j2sdk != null)); ((stdenv).mkDerivation { inherit expat httpServer javaSwigBindings javahlBindings localServer pythonBindings sslSupport; builder = /foo/bar; db4 = (if localServer then db4 else null); httpd = (if httpServer then httpd else null); j2sdk = (if javaSwigBindings then (swig).j2sdk else (if javahlBindings then j2sdk else null)); name = "subversion-1.1.1"; openssl = (if sslSupport then openssl else null); patches = (if javahlBindings then [ (/javahl.patch) ] else [ ]); python = (if pythonBindings then (swig).python else null); src = (fetchurl { md5 = "a180c3fe91680389c210c99def54d9e0"; url = "http://subversion.tigris.org/tarballs/subversion-1.1.1.tar.bz2"; }); swig = (if (pythonBindings || javaSwigBindings) then swig else null); }))
({ db4 ? null, expat, fetchurl, httpServer ? false, httpd ? null, j2sdk ? null, javaSwigBindings ? false, javahlBindings ? false, localServer ? false, openssl ? null, pythonBindings ? false, sslSupport ? false, stdenv, swig ? null }: assert (expat != null); assert (localServer -> (db4 != null)); assert (httpServer -> ((httpd != null) && ((httpd).expat == expat))); assert (sslSupport -> ((openssl != null) && (httpServer -> ((httpd).openssl == openssl)))); assert (pythonBindings -> ((swig != null) && (swig).pythonSupport)); assert (javaSwigBindings -> ((swig != null) && (swig).javaSupport)); assert (javahlBindings -> (j2sdk != null)); ((stdenv).mkDerivation { inherit expat httpServer javaSwigBindings javahlBindings localServer pythonBindings sslSupport; builder = /foo/bar; db4 = (if localServer then db4 else null); httpd = (if httpServer then httpd else null); j2sdk = (if javaSwigBindings then (swig).j2sdk else (if javahlBindings then j2sdk else null)); name = "subversion-1.1.1"; openssl = (if sslSupport then openssl else null); patches = (if javahlBindings then [ (/javahl.patch) ] else [ ]); python = (if pythonBindings then (swig).python else null); src = (fetchurl { md5 = "a180c3fe91680389c210c99def54d9e0"; url = "http://subversion.tigris.org/tarballs/subversion-1.1.1.tar.bz2"; }); swig = (if (pythonBindings || javaSwigBindings) then swig else null); }))

@@ -28,8 +28,7 @@ namespace nix {
}
Value eval(std::string input, bool forceValue = true) {
Value v;
Expr * e = state.parseExprFromString(input, state.rootPath(CanonPath::root));
assert(e);
Expr & e = state.parseExprFromString(input, state.rootPath(CanonPath::root));
state.eval(e, v);
if (forceValue)
state.forceValue(v, noPos);

@@ -91,7 +91,8 @@ TEST_F(ValuePrintingTests, tList)
TEST_F(ValuePrintingTests, vThunk)
{
Value vThunk;
vThunk.mkThunk(nullptr, nullptr);
ExprInt e(0);
vThunk.mkThunk(nullptr, e);
test(vThunk, "«thunk»");
}
@@ -110,12 +111,10 @@ TEST_F(ValuePrintingTests, vLambda)
.up = nullptr,
.values = { }
};
PosTable::Origin origin((std::monostate()));
auto posIdx = state.positions.add(origin, 1, 1);
auto body = ExprInt(0);
auto formals = Formals {};
PosTable::Origin origin = state.positions.addOrigin(std::monostate(), 1);
auto posIdx = state.positions.add(origin, 0);
ExprLambda eLambda(posIdx, createSymbol("a"), &formals, &body);
ExprLambda eLambda(posIdx, createSymbol("a"), std::make_unique<Formals>(), std::make_unique<ExprInt>(0));
Value vLambda;
vLambda.mkLambda(&env, &eLambda);
@@ -514,14 +513,13 @@ TEST_F(ValuePrintingTests, ansiColorsDerivationError)
TEST_F(ValuePrintingTests, ansiColorsAssert)
{
ExprVar eFalse(state.symbols.create("false"));
eFalse.bindVars(state, state.staticBaseEnv);
ExprInt eInt(1);
ExprAssert expr(noPos, &eFalse, &eInt);
ExprAssert expr(noPos,
std::make_unique<ExprVar>(state.symbols.create("false")),
std::make_unique<ExprInt>(1));
expr.bindVars(state, state.staticBaseEnv);
Value v;
state.mkThunk_(v, &expr);
state.mkThunk_(v, expr);
test(v,
ANSI_RED "«error: assertion 'false' failed»" ANSI_NORMAL,
@@ -558,12 +556,10 @@ TEST_F(ValuePrintingTests, ansiColorsLambda)
.up = nullptr,
.values = { }
};
PosTable::Origin origin((std::monostate()));
auto posIdx = state.positions.add(origin, 1, 1);
auto body = ExprInt(0);
auto formals = Formals {};
PosTable::Origin origin = state.positions.addOrigin(std::monostate(), 1);
auto posIdx = state.positions.add(origin, 0);
ExprLambda eLambda(posIdx, createSymbol("a"), &formals, &body);
ExprLambda eLambda(posIdx, createSymbol("a"), std::make_unique<Formals>(), std::make_unique<ExprInt>(0));
Value vLambda;
vLambda.mkLambda(&env, &eLambda);
@@ -621,7 +617,8 @@ TEST_F(ValuePrintingTests, ansiColorsPrimOpApp)
TEST_F(ValuePrintingTests, ansiColorsThunk)
{
Value v;
v.mkThunk(nullptr, nullptr);
ExprInt e(0);
v.mkThunk(nullptr, e);
test(v,
ANSI_MAGENTA "«thunk»" ANSI_NORMAL,