Observability 2.0: Breaking the Three-Pillar Silos for Good

Managing observability at scale has changed substantially with the rise of distributed systems, and the traditional three-pillar approach (metrics, logs, traces) has become one of the biggest bottlenecks for DevOps teams. As systems grow more complex, it becomes harder to keep an observability stack efficient, cost-effective, and genuinely useful for troubleshooting. Just as we’ve moved from monoliths to microservices, observability is undergoing its own transformation. ...

July 6, 2025 · 8 min · 1538 words · awsmorocco

Thanos Deep Dive: Addressing Prometheus Limitations at Scale

Open source, highly available Prometheus setup with long-term storage capabilities. Prometheus has established itself as the benchmark solution for metrics collection and alerting in cloud-native environments. Its pull-based architecture, powerful query language (PromQL), and extensive ecosystem have made it an essential tool for DevOps and SRE teams. However, as organizations scale their Kubernetes deployments across multiple clusters and regions, they often run into Prometheus' limits. That’s where Thanos comes in, offering a set of components that extend Prometheus’ capabilities and address its scalability challenges. ...

October 28, 2024 · 6 min · 1150 words · z4ck404

Low-Cost, Unlimited Metrics Storage with Thanos: Monitor All Your K8s Clusters Anywhere and More.

Monitoring large-scale, multi-cloud Kubernetes environments can be hard, especially when dealing with high-cardinality metrics and long-term data retention requirements. Traditional monitoring solutions often struggle to handle the sheer volume and complexity of metrics generated by distributed clusters across multiple cloud providers. This is where Thanos (unlike Marvel's Thanos, this one is an Avenger) comes into play, providing a powerful and cost-effective solution for unlimited metrics storage and querying. ...

May 3, 2024 · 8 min · 1558 words · awsmorocco

Monitoring Kubernetes with Prometheus and Alertmanager: Setting Up Alerts with Slack Integration

In this tutorial, we will learn how to set up Prometheus rules and configure Alertmanager to send alerts to a Slack channel. Prometheus is a popular monitoring and alerting solution in the Kubernetes ecosystem, while Alertmanager handles alert management and routing. By integrating with Slack, you can receive real-time notifications for any issues or anomalies in your Kubernetes cluster. Let’s get started 👨🏻‍💻! Table of Contents: Prerequisites, Setting Up Prometheus Rules, Configuring Alertmanager, Integrating with Slack, Testing the Setup, Conclusion. 🚦 Prerequisites: access to a Kubernetes cluster; Prometheus and Alertmanager installed in the cluster; basic knowledge of Kubernetes and YAML syntax. 1. Setting Up Prometheus Rules: Prometheus rules define conditions for generating alerts based on metrics collected by Prometheus. In this example, we will create a PrometheusRule resource named z4ck404-alerts in the monitoring namespace. ...

February 29, 2024 · 5 min · 975 words · z4ck404