I just finished watching the [recorded] live streams from PromCon 2018. There are some really exciting things under development in the Prometheus ecosystem. This is not an exhaustive review. . . but I wanted to highlight a few presentations:
The Grafana team is working on an interface designed for exploring Prometheus query results, with auto-complete and tabs! This looks really promising, and it will be amazing to have the Grafana team's UI expertise iterating and investing in this. I'm not knocking the Prometheus query UI . . . it's appropriately minimal, which lets that team focus on the core of Prometheus. However, it will be amazing to have tabs and auto-complete for PromQL functions and metric labels/values! This cannot happen soon enough. ( slides )
Better git integration
Improved synchronization with git repositories is 'in the works'. Grafana 5.0 brought better dashboard provisioning from files, which can be committed to source control, but I would still like to be able to modify a dashboard inside Grafana, and save it to git with a comment from within the Grafana UI itself. It would smooth out our current workflow for modifying dashboards, which involves a follow-up PR at the moment.
This project adds a global query capability across multiple Prometheus instances, plus long-term storage (in S3). What's really cool about this is that the team can build it completely outside the core of Prometheus. I'm hopeful a large part of what they learn or build will make it back into the core. It's a real testament to the engineering and API focus of the core Prometheus team that initiatives like this and Cortex successfully started outside the core team, and that the hard-earned experience of those projects is considered when deciding what the core team focuses on, as opposed to trying to do too many features at once to the detriment of stability and usability.
A couple of coworkers at ING built something really cool that they're calling model builder. Given a metric with historical values in the time series database, it looks at the history of that metric, generates a model of what future values of that metric should look like, and then feeds those back into Prometheus as a sort of 'expected value' metric. This enables them to write alerts that detect deviations from normal levels. I can see this being really handy when you have something whose usage or traffic spikes at certain times of day or days of the week, e.g. scheduled jobs or prime-time e-commerce traffic. ( slides )
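To make the 'expected value' idea concrete, here is a minimal Python sketch. It is not the ING team's model builder; I'm assuming a much simpler model (the mean and standard deviation per weekday-and-hour bucket), and the function names and the `tolerance` parameter are my own invention for illustration. Writing the expected values back into Prometheus is left out.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean, pstdev


def build_baseline(samples):
    """Group historical (unix_timestamp, value) samples by (weekday, hour) in UTC.

    Returns {(weekday, hour): (mean, population_stddev)}.
    This bucketed average is an assumed stand-in for a real forecasting model.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        t = datetime.fromtimestamp(ts, tz=timezone.utc)
        buckets[(t.weekday(), t.hour)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}


def is_anomalous(baseline, ts, value, tolerance=3.0):
    """True if value is more than `tolerance` standard deviations from the
    expected value for that weekday and hour. Unknown buckets never alert."""
    t = datetime.fromtimestamp(ts, tz=timezone.utc)
    entry = baseline.get((t.weekday(), t.hour))
    if entry is None:
        return False
    expected, stddev = entry
    # Guard against a zero stddev when the history is perfectly flat.
    return abs(value - expected) > tolerance * max(stddev, 1e-9)


WEEK = 7 * 24 * 3600
# Four weeks of history for the same weekday and hour.
history = [(week * WEEK, v) for week, v in enumerate([100, 102, 98, 100])]
baseline = build_baseline(history)
spike = is_anomalous(baseline, 4 * WEEK, 150)   # far above the usual ~100
normal = is_anomalous(baseline, 4 * WEEK, 101)  # within the usual range
```

In a real setup the baseline would be exported as its own metric so the comparison could live in an alerting rule rather than in application code.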
A team is working on 'breathing out' and evolving the Prometheus metrics exposition format into an open standard. ( slides )
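For anyone who hasn't looked at the format itself, here is a rough Python sketch that renders samples in the existing Prometheus text exposition style (`# HELP`, `# TYPE`, then one line per sample). The helper names are hypothetical, it skips escaping of the help text and optional timestamps, and it says nothing about what the new standard will add or change.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, metric_type, help_text, samples):
    """Render one metric family as text.

    samples is a list of (labels_dict, value) pairs. Labels are sorted only
    to make the output deterministic; the format does not require an order.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            body = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{body}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


text = render_metric(
    "http_requests_total",
    "counter",
    "Total HTTP requests.",
    [({"method": "get", "code": "200"}, 1027)],
)
```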
Automated Prometheus benchmarking
A couple of talented guys pitched this PromCon presentation and then went and implemented it after it got accepted. They created a way to run benchmarks against a given PR by spinning up Prometheus in Kubernetes to compare the performance characteristics before and after the code from the PR, automatically publishing the results in Grafana dashboards. Wow. What a great intersection of capabilities leveraged from Kubernetes, GitHub, and Prometheus/Grafana, all in the name of keeping the core of Prometheus performant. ( slides )
Lastly, I want to highlight this presentation, which gave an overview of the flexibility of relabeling. I think having seen or heard this a year ago would have saved me a ton of time learning about the capabilities of relabeling, which I felt like I stumbled through while struggling to get into the mindset where I could take full advantage of the power relabeling offers. ( slides )
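What finally made relabeling click for me was thinking of it as a small pipeline that rewrites a label set one rule at a time. The Python sketch below is my own simplified simulation of that idea, not Prometheus code: it only covers the `replace`, `keep` and `drop` actions, only understands `$1`-style references in the replacement, and the `relabel` function and the example labels are made up for illustration.

```python
import re


def relabel(labels, rules):
    """Apply simplified relabel rules to a label dict.

    Returns the new label dict, or None if a keep/drop rule discards the target.
    """
    labels = dict(labels)
    for rule in rules:
        separator = rule.get("separator", ";")
        # Concatenate the source label values, missing labels count as "".
        source = separator.join(
            labels.get(name, "") for name in rule.get("source_labels", [])
        )
        # The regex must match the whole concatenated value.
        match = re.fullmatch(rule.get("regex", "(.*)"), source)
        action = rule.get("action", "replace")
        if action == "keep":
            if match is None:
                return None
        elif action == "drop":
            if match is not None:
                return None
        elif action == "replace":
            if match is None:
                continue
            # Translate $1-style references into Python's \g<1> form.
            template = re.sub(r"\$(\d+)", r"\\g<\1>", rule.get("replacement", "$1"))
            value = match.expand(template)
            if value:
                labels[rule["target_label"]] = value
            else:
                # An empty result removes the target label.
                labels.pop(rule["target_label"], None)
        else:
            raise ValueError(f"unsupported action in this sketch: {action}")
    return labels


rules = [
    # Only keep targets from the prod environment.
    {"source_labels": ["env"], "regex": "prod", "action": "keep"},
    # Strip the port from the address and store the host as "instance".
    {"source_labels": ["__address__"], "regex": r"([^:]+):\d+",
     "target_label": "instance", "replacement": "$1"},
]

kept = relabel({"__address__": "10.0.0.5:9100", "env": "prod"}, rules)
dropped = relabel({"__address__": "10.0.0.6:9100", "env": "dev"}, rules)
```

Here `kept` gains an `instance` label without the port, while `dropped` is `None` because the first rule filters out anything that isn't prod.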