Search in the manual is terrible #785

Open
opened 2025-04-02 03:13:45 +00:00 by jade · 2 comments
Owner

The manual's search is useless for about the same reason the docs.python.org search is useless (btw, use Zeal/Dash for that!): it doesn't prioritize results where the function name or command name directly matches your query. For example, if you want to be annoyed, look at https://docs.python.org/3/search.html?q=Path for `pathlib.Path` (though I wonder if they have made it better in the last few years), or for Lix, https://docs.lix.systems/manual/lix/stable/?search=nix%20path-info.

cc @patka

I found that setting the `boost-*` options in mdBook made basically no difference.

It appears that at least some of our issues with the search being heinous come from how it splits words for indexing. If my manual examination of the giant JSON file in Firefox is to be believed, a query for `path-info` is not treated as one word: it is indexed as `path` and `info`, and results are not prioritized when those two terms appear near each other. Also, I think the way we do page titles for nix3 subcommands is not the way it wants them done for prioritization, so I am not sure it ranks results properly even when it doesn't screw up its index usage.

I am also not thoroughly convinced that `use-boolean-and` is doing anything, but maybe it is working and just matching terms that are far apart in the document. If so, that's annoying.
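To illustrate the suspicion above, here is a toy sketch of boolean-AND matching over hyphen-split tokens with no proximity check. It is not elasticlunr's or mdBook's actual matching or scoring code; the function names and the two sample page texts are made up for illustration.

```typescript
// Toy model only: NOT elasticlunr's real implementation.
// Assumption: text is split on whitespace and hyphens, as described above.
function tokenize(text: string): string[] {
  return text
    .trim()
    .toLowerCase()
    .split(/[\s\-]+/)
    .filter((token) => token.length > 0);
}

// Boolean AND without proximity: every query token must appear somewhere
// in the document, no matter how far apart the tokens are.
function matchesAllTokens(query: string, doc: string): boolean {
  const docTokens = new Set(tokenize(doc));
  return tokenize(query).every((token) => docTokens.has(token));
}

// Hypothetical page texts, invented for this example.
const commandPage = "nix path-info shows information about store paths";
const unrelatedPage =
  "the store path is printed first and later sections give more info";

const commandPageMatches = matchesAllTokens("path-info", commandPage);
const unrelatedPageMatches = matchesAllTokens("path-info", unrelatedPage);

console.log(commandPageMatches);   // true
console.log(unrelatedPageMatches); // true: "path" and "info" both occur, far apart
```

Under these assumptions both pages satisfy the AND query, so AND matching alone cannot put the `path-info` page ahead of the unrelated one.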

Someone needs to go stare at this harder than me and figure out how to fix it.

However, while looking at this, I did find that we can put one-click edit links to Gerrit in our manual, so that is one thing I can fix.

Raised on #779

Author
Owner

Whoops, @patka-123


I'm on a phone and can't do much research, but want to note down what I found so far, in order to continue later.

mdBook uses the (unmaintained) [elasticlunr](http://elasticlunr.com) for its search. It uses `/[\s\-]+/` for [splitting terms into tokens](http://elasticlunr.com/docs/tokenizer.js.html#seperator). We could probably start by removing the hyphen from that pattern.
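A minimal sketch of what removing the hyphen would change. The `tokenize` function here is a simplified stand-in, not elasticlunr's own tokenizer, and `whitespaceOnly` is a hypothetical replacement pattern, not something mdBook currently exposes as an option.

```typescript
// Separator pattern quoted in the comment above.
const defaultSeparator: RegExp = /[\s\-]+/;
// Hypothetical alternative: split on whitespace only, so hyphenated
// command names stay in one piece.
const whitespaceOnly: RegExp = /\s+/;

// Simplified stand-in for a tokenizer: trim, lowercase, split on the separator.
function tokenize(text: string, separator: RegExp): string[] {
  return text
    .trim()
    .toLowerCase()
    .split(separator)
    .filter((token) => token.length > 0);
}

const withHyphenSplit = tokenize("nix path-info", defaultSeparator);
const withoutHyphenSplit = tokenize("nix path-info", whitespaceOnly);

console.log(withHyphenSplit);    // ["nix", "path", "info"]
console.log(withoutHyphenSplit); // ["nix", "path-info"]
```

With the whitespace-only pattern, `path-info` survives as a single token, so an exact match on the command name becomes possible in principle. Whether that is enough to fix the ranking is a separate question.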

Reference: lix-project/lix#785