Consider using BigQuery to avoid limitations of GitHub search #1

Open
opened 2026-01-25 20:07:12 +00:00 by rootile · 1 comment
Member

This dataset exists: https://console.cloud.google.com/marketplace/product/github/github-repos

It might have fewer silly limitations.

Member

it has an unfortunate limitation of being wildly out of date. it doesn’t seem to be updating anymore (see: https://mastodon.social/@deafbeef/115785227134492196)

I had the best luck with scraping secondary indexes, like sourcegraph.com and grep.app. they’re incomplete, but I was able to extract more results from them compared to native github search.

Reference
lix-project/flaker#1