lix/src/libexpr/value.hh

477 lines
11 KiB
C++
Raw Normal View History

#pragma once
///@file
#include <cassert>
#include <climits>
#include "gc-alloc.hh"
#include "symbol-table.hh"
#include "value/context.hh"
#include "input-accessor.hh"
#include "source-path.hh"
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
#include "print-options.hh"
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. This change will be ported to CppNix as well, to maintain language consistency. Fixes: https://git.lix.systems/lix-project/lix/issues/423 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 14:22:34 +00:00
#include "checked-arithmetic.hh"
#include <nlohmann/json_fwd.hpp>
2016-08-30 11:12:12 +00:00
namespace nix {
class BindingsBuilder;
typedef enum {
tInt = 1,
tBool,
tString,
tPath,
tNull,
tAttrs,
tList1,
tList2,
tListN,
tThunk,
tApp,
tLambda,
tPrimOp,
tPrimOpApp,
tExternal,
tFloat
} InternalType;
/**
* This type abstracts over all actual value types in the language,
* grouping together implementation details like tList*, different function
* types, and types in non-normal form (so thunks and co.)
*/
typedef enum {
nThunk,
nInt,
nFloat,
nBool,
nString,
nPath,
nNull,
nAttrs,
nList,
nFunction,
nExternal
} ValueType;
2014-01-21 17:29:55 +00:00
class Bindings;
struct Env;
struct Expr;
struct ExprLambda;
struct ExprBlackHole;
struct PrimOp;
2014-01-21 17:29:55 +00:00
class Symbol;
class PosIdx;
struct Pos;
class StorePath;
class Store;
class EvalState;
class XMLWriter;
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
class Printer;
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. This change will be ported to CppNix as well, to maintain language consistency. Fixes: https://git.lix.systems/lix-project/lix/issues/423 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 14:22:34 +00:00
using NixInt = checked::Checked<int64_t>;
using NixFloat = double;
/**
* External values must descend from ExternalValueBase, so that
* type-agnostic nix functions (e.g. showType) can be implemented
*/
class ExternalValueBase
{
friend std::ostream & operator << (std::ostream & str, const ExternalValueBase & v);
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
friend class Printer;
protected:
/**
* Print out the value
*/
virtual std::ostream & print(std::ostream & str) const = 0;
public:
/**
* Return a simple string describing the type
*/
2022-02-21 15:37:25 +00:00
virtual std::string showType() const = 0;
/**
* Return a string to be used in builtins.typeOf
*/
2022-02-21 15:37:25 +00:00
virtual std::string typeOf() const = 0;
/**
* Coerce the value to a string. Defaults to uncoercable, i.e. throws an
2022-02-21 15:37:25 +00:00
* error.
*/
virtual std::string coerceToString(EvalState & state, const PosIdx & pos, NixStringContext & context, bool copyMore, bool copyToStore) const;
/**
* Compare to another value of the same type. Defaults to uncomparable,
* i.e. always false.
*/
2022-02-21 15:37:25 +00:00
virtual bool operator ==(const ExternalValueBase & b) const;
/**
* Print the value as JSON. Defaults to unconvertable, i.e. throws an error
*/
virtual nlohmann::json printValueAsJSON(EvalState & state, bool strict,
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 01:31:10 +00:00
NixStringContext & context, bool copyToStore = true) const;
/**
* Print the value as XML. Defaults to unevaluated
*/
virtual void printValueAsXML(EvalState & state, bool strict, bool location,
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 01:31:10 +00:00
XMLWriter & doc, NixStringContext & context, PathSet & drvsSeen,
const PosIdx pos) const;
virtual ~ExternalValueBase()
{
};
};
std::ostream & operator << (std::ostream & str, const ExternalValueBase & v);
struct Value
{
private:
InternalType internalType;
friend std::string showType(const Value & v);
public:
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
void print(EvalState &state, std::ostream &str, PrintOptions options = PrintOptions {});
// Functions needed to distinguish the type
// These should be removed eventually, by putting the functionality that's
// needed by callers into methods of this type
// type() == nThunk
inline bool isThunk() const { return internalType == tThunk; };
inline bool isApp() const { return internalType == tApp; };
inline bool isBlackhole() const;
// type() == nFunction
inline bool isLambda() const { return internalType == tLambda; };
inline bool isPrimOp() const { return internalType == tPrimOp; };
inline bool isPrimOpApp() const { return internalType == tPrimOpApp; };
union
{
NixInt integer;
bool boolean;
/**
* Strings in the evaluator carry a so-called `context` which
* is a list of strings representing store paths. This is to
* allow users to write things like
* "--with-freetype2-library=" + freetype + "/lib"
* where `freetype` is a derivation (or a source to be copied
* to the store). If we just concatenated the strings without
* keeping track of the referenced store paths, then if the
* string is used as a derivation attribute, the derivation
* will not have the correct dependencies in its inputDrvs and
* inputSrcs.
* The semantics of the context is as follows: when a string
* with context C is used as a derivation attribute, then the
* derivations in C will be added to the inputDrvs of the
* derivation, and the other store paths in C will be added to
* the inputSrcs of the derivations.
* For canonicity, the store paths should be in sorted order.
*/
struct {
const char * s;
const char * * context; // must be in sorted order
} string;
const char * _path;
Bindings * attrs;
struct {
2018-05-02 11:56:34 +00:00
size_t size;
Value * * elems;
} bigList;
Value * smallList[2];
struct {
Env * env;
Expr * expr;
} thunk;
struct {
Value * left, * right;
} app;
struct {
Env * env;
ExprLambda * fun;
} lambda;
PrimOp * primOp;
struct {
Value * left, * right;
} primOpApp;
ExternalValueBase * external;
NixFloat fpoint;
};
/**
* Returns the normal type of a Value. This only returns nThunk if
* the Value hasn't been forceValue'd
*
* @param invalidIsThunk Instead of aborting an an invalid (probably
* 0, so uninitialized) internal type, return `nThunk`.
*/
inline ValueType type(bool invalidIsThunk = false) const
{
switch (internalType) {
case tInt: return nInt;
case tBool: return nBool;
case tString: return nString;
case tPath: return nPath;
case tNull: return nNull;
case tAttrs: return nAttrs;
case tList1: case tList2: case tListN: return nList;
case tLambda: case tPrimOp: case tPrimOpApp: return nFunction;
case tExternal: return nExternal;
case tFloat: return nFloat;
case tThunk: case tApp: return nThunk;
}
if (invalidIsThunk)
return nThunk;
else
abort();
}
/**
* After overwriting an app node, be sure to clear pointers in the
* Value to ensure that the target isn't kept alive unnecessarily.
*/
inline void clearValue()
{
app.left = app.right = 0;
}
language: cleanly ban integer overflows This also bans various sneaking of negative numbers from the language into unsuspecting builtins as was exposed while auditing the consequences of changing the Nix language integer type to a newtype. It's unlikely that this change comprehensively ensures correctness when passing integers out of the Nix language and we should probably add a checked-narrowing function or something similar, but that's out of scope for the immediate change. During the development of this I found a few fun facts about the language: - You could overflow integers by converting from unsigned JSON values. - You could overflow unsigned integers by converting negative numbers into them when going into Nix config, into fetchTree, and into flake inputs. The flake inputs and Nix config cannot actually be tested properly since they both ban thunks, however, we put in checks anyway because it's possible these could somehow be used to do such shenanigans some other way. Note that Lix has banned Nix language integer overflows since the very first public beta, but threw a SIGILL about them because we run with -fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in production builds. Since the Nix language uses signed integers, overflow was simply undefined behaviour, and since we defined that to trap, it did. Trapping on it was a bad UX, but we didn't even entirely notice that we had done this at all until it was reported as a bug a couple of months later (which is, to be fair, that flag working as intended), and it's got enough production time that, aside from code that is IMHO buggy (and which is, in any case, not in nixpkgs) such as https://git.lix.systems/lix-project/lix/issues/445, we don't think anyone doing anything reasonable actually depends on wrapping overflow. Even for weird use cases such as doing funny bit crimes, it doesn't make sense IMO to have wrapping behaviour, since two's complement arithmetic overflow behaviour is so *aggressively* not what you want for *any* kind of mathematics/algorithms. The Nix language exists for package management, a domain where bit crimes are already only dubiously in scope to begin with, and it makes a lot more sense for that domain for the integers to never lose precision, either by throwing errors if they would, or by being arbitrary-precision. This change will be ported to CppNix as well, to maintain language consistency. Fixes: https://git.lix.systems/lix-project/lix/issues/423 Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
2024-07-12 14:22:34 +00:00
inline void mkInt(NixInt::Inner n)
{
mkInt(NixInt{n});
}
inline void mkInt(NixInt n)
{
clearValue();
internalType = tInt;
integer = n;
}
inline void mkBool(bool b)
{
clearValue();
internalType = tBool;
boolean = b;
}
inline void mkString(const char * s, const char * * context = 0)
{
internalType = tString;
string.s = s;
string.context = context;
}
void mkString(std::string_view s);
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 01:31:10 +00:00
void mkString(std::string_view s, const NixStringContext & context);
Use `std::set<StringContextElem>` not `PathSet` for string contexts Motivation `PathSet` is not correct because string contexts have other forms (`Built` and `DrvDeep`) that are not rendered as plain store paths. Instead of wrongly using `PathSet`, or "stringly typed" using `StringSet`, use `std::std<StringContextElem>`. ----- In support of this change, `NixStringContext` is now defined as `std::std<StringContextElem>` not `std:vector<StringContextElem>`. The old definition was just used by a `getContext` method which was only used by the eval cache. It can be deleted altogether since the types are now unified and the preexisting `copyContext` function already suffices. Summarizing the previous paragraph: Old: - `value/context.hh`: `NixStringContext = std::vector<StringContextElem>` - `value.hh`: `NixStringContext Value::getContext(...)` - `value.hh`: `copyContext(...)` New: - `value/context.hh`: `NixStringContext = std::set<StringContextElem>` - `value.hh`: `copyContext(...)` ---- The string representation of string context elements no longer contains the store dir. The diff of `src/libexpr/tests/value/context.cc` should make clear what the new representation is, so we recommend reviewing that file first. This was done for two reasons: Less API churn: `Value::mkString` and friends did not take a `Store` before. But if `NixStringContextElem::{parse, to_string}` *do* take a store (as they did before), then we cannot have the `Value` functions use them (in order to work with the fully-structured `NixStringContext`) without adding that argument. That would have been a lot of churn of threading the store, and this diff is already large enough, so the easier and less invasive thing to do was simply make the element `parse` and `to_string` functions not take the `Store` reference, and the easiest way to do that was to simply drop the store dir. Space usage: Dropping the `/nix/store/` (or similar) from the internal representation will safe space in the heap of the Nix programming being interpreted. If the heap contains many strings with non-trivial contexts, the saving could add up to something significant. ---- The eval cache version is bumped. The eval cache serialization uses `NixStringContextElem::{parse, to_string}`, and since those functions are changed per the above, that means the on-disk representation is also changed. This is simply done by changing the name of the used for the eval cache from `eval-cache-v4` to eval-cache-v5`. ---- To avoid some duplication `EvalCache::mkPathString` is added to abstract over the simple case of turning a store path to a string with just that string in the context. Context This PR picks up where #7543 left off. That one introduced the fully structured `NixStringContextElem` data type, but kept `PathSet context` as an awkward middle ground between internal `char[][]` interpreter heap string contexts and `NixStringContext` fully parsed string contexts. The infelicity of `PathSet context` was specifically called out during Nix team group review, but it was agreeing that fixing it could be left as future work. This is that future work. A possible follow-up step would be to get rid of the `char[][]` evaluator heap representation, too, but it is not yet clear how to do that. To use `NixStringContextElem` there we would need to get the STL containers to GC pointers in the GC build, and I am not sure how to do that. ---- PR #7543 effectively is writing the inverse of a `mkPathString`, `mkOutputString`, and one more such function for the `DrvDeep` case. I would like that PR to have property tests ensuring it is actually the inverse as expected. This PR sets things up nicely so that reworking that PR to be in that more elegant and better tested way is possible. Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-01-29 01:31:10 +00:00
void mkStringMove(const char * s, const NixStringContext & context);
inline void mkString(const Symbol & s)
{
mkString(((const std::string &) s).c_str());
}
void mkPath(const SourcePath & path);
inline void mkPath(const char * path)
{
clearValue();
internalType = tPath;
_path = path;
}
inline void mkNull()
{
clearValue();
internalType = tNull;
}
inline void mkAttrs(Bindings * a)
{
clearValue();
internalType = tAttrs;
attrs = a;
}
Value & mkAttrs(BindingsBuilder & bindings);
inline void mkList(size_t size)
{
clearValue();
if (size == 1)
internalType = tList1;
else if (size == 2)
internalType = tList2;
else {
internalType = tListN;
bigList.size = size;
}
}
inline void mkThunk(Env * e, Expr & ex)
{
internalType = tThunk;
thunk.env = e;
thunk.expr = &ex;
}
inline void mkApp(Value * l, Value * r)
{
internalType = tApp;
app.left = l;
app.right = r;
}
inline void mkLambda(Env * e, ExprLambda * f)
{
internalType = tLambda;
lambda.env = e;
lambda.fun = f;
}
inline void mkBlackhole();
void mkPrimOp(PrimOp * p);
inline void mkPrimOpApp(Value * l, Value * r)
{
internalType = tPrimOpApp;
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
primOpApp.left = l;
primOpApp.right = r;
}
Unify and refactor value printing Previously, there were two mostly-identical value printers -- one in `libexpr/eval.cc` (which didn't force values) and one in `libcmd/repl.cc` (which did force values and also printed ANSI color codes). This PR unifies both of these printers into `print.cc` and provides a `PrintOptions` struct for controlling the output, which allows for toggling whether values are forced, whether repeated values are tracked, and whether ANSI color codes are displayed. Additionally, `PrintOptions` allows tuning the maximum number of attributes, list items, and bytes in a string that will be displayed; this makes it ideal for contexts where printing too much output (e.g. all of Nixpkgs) is distracting. (As requested by @roberth in https://github.com/NixOS/nix/pull/9554#issuecomment-1845095735) Please read the tests for example output. Future work: - It would be nice to provide this function as a builtin, perhaps `builtins.toStringDebug` -- a printing function that never fails would be useful when debugging Nix code. - It would be nice to support customizing `PrintOptions` members on the command line, e.g. `--option to-string-max-attrs 1000`. (cherry picked from commit 0fa08b451682fb3311fe58112ff05c4fe5bee3a4, ) === Restore ambiguous value printer for `nix-instantiate` The Nix team has requested that this output format remain unchanged. I've added a warning to the man page explaining that `nix-instantiate --eval` output will not parse correctly in many situations. (cherry picked from commit df84dd4d8dd3fd6381ac2ca3064432ab31a16b79) Change-Id: I7cca6b4b53cd0642f2d49af657d5676a8554c9f8
2024-03-08 02:05:47 +00:00
/**
* For a `tPrimOpApp` value, get the original `PrimOp` value.
*/
PrimOp * primOpAppPrimOp() const;
inline void mkExternal(ExternalValueBase * e)
{
clearValue();
internalType = tExternal;
external = e;
}
inline void mkFloat(NixFloat n)
{
clearValue();
internalType = tFloat;
fpoint = n;
}
bool isList() const
{
return internalType == tList1 || internalType == tList2 || internalType == tListN;
}
Value * * listElems()
{
return internalType == tList1 || internalType == tList2 ? smallList : bigList.elems;
}
Value * const * listElems() const
{
return internalType == tList1 || internalType == tList2 ? smallList : bigList.elems;
}
2018-05-02 11:56:34 +00:00
size_t listSize() const
{
return internalType == tList1 ? 1 : internalType == tList2 ? 2 : bigList.size;
}
PosIdx determinePos(const PosIdx pos) const;
/**
* Check whether forcing this value requires a trivial amount of
* computation. In particular, function applications are
* non-trivial.
*/
bool isTrivial() const;
2020-06-29 17:08:37 +00:00
auto listItems()
{
struct ListIterable
{
typedef Value * const * iterator;
iterator _begin, _end;
iterator begin() const { return _begin; }
iterator end() const { return _end; }
};
assert(isList());
auto begin = listElems();
return ListIterable { begin, begin + listSize() };
}
auto listItems() const
{
struct ConstListIterable
{
typedef const Value * const * iterator;
iterator _begin, _end;
iterator begin() const { return _begin; }
iterator end() const { return _end; }
};
assert(isList());
auto begin = listElems();
return ConstListIterable { begin, begin + listSize() };
}
SourcePath path() const
{
assert(internalType == tPath);
return SourcePath{CanonPath(_path)};
}
std::string_view str() const
{
assert(internalType == tString);
return std::string_view(string.s);
}
};
extern ExprBlackHole eBlackHole;
bool Value::isBlackhole() const
{
return internalType == tThunk && thunk.expr == (Expr*) &eBlackHole;
}
void Value::mkBlackhole()
{
internalType = tThunk;
thunk.expr = (Expr*) &eBlackHole;
}
using ValueVector = GcVector<Value *>;
using ValueMap = GcMap<Symbol, Value *>;
using ValueVectorMap = std::map<Symbol, ValueVector>;
/**
* A value allocated in traceable memory.
*/
typedef std::shared_ptr<Value *> RootValue;
RootValue allocRootValue(Value * v);
}