the-distro/gerrit-monitoring

Author	SHA1	Message	Date
Thomas Draebing	893b0c4f36	Convert replication dashboard to grafonnet Change-Id: `Icffb8ffbec8541e5b956487e5ce9ec54b3c8b617`	2020-12-04 08:31:26 +01:00
Thomas Draebing	c7c17679e9	Divide latency dashboard There are a lot of latency metrics. This change splits up the existing dashboard for latencies. For REST API latencies, it also allows selecting the REST API calls to look at. This change also adds latency dashboards for NoteDB and UI Actions. Change-Id: `Idb9631cc1bc838d06e626d58f163e71fb78b30c5`	2020-12-04 08:31:26 +01:00
Thomas Draebing	0b4c16e881	Convert latency dashboard to grafonnet Change-Id: `Id97759996259eea802c80c2ef3261ba1883d92d3`	2020-12-04 08:31:25 +01:00
Thomas Draebing	3e811f272b	Convert git fetch/clone dashboard to grafonnet Change-Id: `I735f94599199ae2d0f304030fa023c55359e9a47`	2020-12-04 08:31:25 +01:00
Thomas Draebing	12aba901e4	Extract yAxis object Change-Id: `I98c0708e521c0122beb53869242a3a1df8db3f3d`	2020-12-04 08:31:24 +01:00
Thomas Draebing	82d9ead576	Convert caches dashboard to grafonnet Change-Id: `I42f10428bb5f85991cef2abbcdfab9424b8bb48d`	2020-12-04 08:31:23 +01:00
Thomas Draebing	72391ac5e5	Convert queues dashboard to grafonnet Change-Id: `Ia3307a923b99ecacaaa8c803aa2af0c9bf4eabcb`	2020-12-04 08:31:22 +01:00
Thomas Draebing	ce5b8300f1	Start using Grafonnet to create Grafana dashboards Versioning the pure JSON files representing the Grafana dashboards had some disadvantages. They were hard to review, very cluttered, and contained a lot of duplication. There are some tools that deal with that. One of them is Grafonnet, a library for Jsonnet, which is a tool to create JSON files using a domain-specific language. This change implements the Gerrit Process dashboard in Grafonnet. It also extends the installer to be able to install dashboards in the Jsonnet format. Change-Id: `I6235fb7d045bd71557678a4e3b0d4ad4515f4615`	2020-12-04 08:31:21 +01:00
Thomas Draebing	bec7bf7897	Adapt dashboards to be accepted by Grafana dashboard repository Grafana provides a repository for dashboards that can be used to easily import dashboards. Providing these dashboards there would make it easier for users not using the full setup provided here to still use the dashboards. To be able to upload, however, the datasource reference in the dashboards has to be a template. This, however, is not compatible with the way the dashboards are imported into the Grafana instance of the stack provided here. Thus, the variables are removed during the installation. Change-Id: `I99f127882a6f7594ca1c40fbe1e299378e89f4e9`	2020-11-27 10:40:09 +01:00
Thomas Draebing	65582f2deb	Also monitor parallel GC This change - adds metrics for parallel GC to the GC panel in the Gerrit Process dashboard - configures the GC panel to only show queries with values other than null - changes the interval to one minute, which fits the scrape interval - changes the default time frame to the last 24h, which is used for most other dashboards Change-Id: `I3b6587e769ae7486a02e26b8d7f2822319eb94e6`	2020-08-25 13:20:11 +02:00
Thomas Draebing	451882b7e9	Allow to monitor Gerrit on Kubernetes So far it was only possible to monitor single instance Gerrit servers. This was due to the fact that a URL had to be used that pointed to a dedicated instance, since if multiple replicas were behind the instance, the metrics of a random replica would be scraped and not those of all replicas. Prometheus has a service discovery functionality for deployments running in Kubernetes. This is now used when monitoring a Gerrit instance in Kubernetes. This allows a variable number of replicas to be running, which will be automatically discovered by Prometheus. The dashboards were adapted accordingly and now allow selecting the replica to be observed. For now, no summary of all replicas can be displayed in the dashboards, but that feature is planned to be added in the future. Change-Id: `I96efc63a192cd90f5e3e91a53dace8e1ae83132e`	2020-05-14 15:55:35 +02:00
Thomas Draebing	7663baf7be	Use gerrit_build_info metric to display Gerrit version This replaces the hacky graph showing the Gerrit version with a table showing the current Gerrit version information. Change-Id: `Idfbdc85e376953aead40fea06544e5c84fb777e7`	2020-05-14 15:33:14 +02:00
Matthias Sohn	e8b2651af2	Add latency dashboard Add graphs for the following latency metrics - receive-commit - query total - query changes - REST total - REST change list comments - REST change list robot comments - REST change post review - REST get change detail - REST get change diff - REST get change - REST get commit - REST get change revision actions Change-Id: `Id782e12335ae76820cac4e4e8c80450671bf8216`	2020-05-05 18:30:18 +02:00
Thomas Draebing	f960eb5eab	Add dashboard for Loki metrics Change-Id: `I220d90d33be3ed292402f3adb7386953cad7b0de`	2020-04-03 11:56:24 +02:00
Thomas Draebing	ff7fd22ca2	Add dashboard to monitor Prometheus data This is an adapted version of this dashboard: https://grafana.com/grafana/dashboards/3681 Change-Id: `I405f09f75698b940becd6994a7fc457853603756`	2020-04-03 11:56:24 +02:00
Thomas Draebing	442bf6fb98	Only show Gerrit instances in the instance dropdowns A variable was used to select the Gerrit instance to observe in the dashboards. Since the instance label is set for all targets that Prometheus scrapes, the variable would also contain e.g. the Prometheus instance. Now only Gerrit instances are displayed by further filtering for a metric specific to Gerrit. Change-Id: `I392b2ddf53a0ea49db25018dc5d37d269365812a`	2020-04-03 11:37:27 +02:00
Thomas Draebing	623332e4b3	Create a configmap per dashboard If the dashboard files got too large (>2Mb), Kubernetes was rejecting the configmap. Now each dashboard is installed with its own configmap. A sidecar container is used to register these dashboards with Grafana. Change-Id: `I84062d6e2ac7dc2669945b54575bf239a25900a4`	2020-03-26 09:55:39 +01:00
Matthias Sohn	14e7530aab	Process dashboard: add panel showing system load - Rearrange the other panels so that we show system load over cpu usage over threads in the left column. - Reduce height of memory panel a bit Change-Id: `Icaada525f87d0df503f67cf688b94d15a4119034`	2020-03-13 17:41:01 +01:00
Matthias Sohn	4a96ed4947	Process dashboard: show number of available CPUs Change-Id: `Ifbf13edb2dfa8f5cee64aea3f9dca006d419ef20`	2020-03-13 17:40:53 +01:00
Thomas Draebing	be862d863e	Move internal project to open source This change adds the current status of a project that aims to create a simple monitoring setup to monitor Gerrit servers, which was developed internally at SAP. The project provides an opinionated and basic configuration for helm charts that can be used to install Loki, Prometheus and Grafana on a Kubernetes cluster. Scripts to easily apply the configuration and install the whole setup are provided as well. The contributions so far were done by (with number of commits) 80 Thomas Draebing 11 Matthias Sohn 2 Saša Živkov Change-Id: `I8045780446edfb3c0dc8287b8f494505e338e066`	2020-03-11 15:23:19 +01:00

20 commits