Log Like a Pro: The Open-Source Stack That Scales from NUCs to Racks
Your homelab is only as reliable as your observability. Ever SSH'd into a box at 2 a.m. to chase a ghost container? This guide shows you how to build a lightweight, production‑grade logging and monitoring stack using open‑source tools that run great on modest hardware. The payoff: fewer mystery outages, faster troubleshooting, smarter capacity planning, and dashboards your friends will actually want to see.
We'll keep it zero‑fluff and practical. You'll get a clear mental model of how Prometheus, Alloy, Loki, cAdvisor, and Grafana fit together, why they're a sweet spot for homelabs, and when to consider alternatives. I'll point you to official docs, starter configs you can drop in with minimal edits, and ready‑made dashboards so you can go from "huh, what's pegging my CPU?" to "here's the exact container and why" in minutes.
Who this is for:
- Homelabbers running Docker, Podman, or Kubernetes on mini PCs, used rack gear, or low‑power nodes
- Builders who want "just enough" observability—no enterprise complexity, no surprise cloud bills
- Learners who prefer tooling that scales from a single node to small-business and production environments
The stack at a glance:
- Logs: Alloy feeding Loki for centralised, queryable logs—without the Elasticsearch bloat
- Metrics: Node Exporter and cAdvisor scraped by Prometheus for high‑signal, pull‑based metrics
- Visualisation and alerts: Grafana for dashboards, alerting, and easy sharing
By the end, you'll have a lean, self‑hosted observability stack that's easy to run, easy to maintain, and powerful enough to catch issues before they wake you up. Let's get your homelab out of the guesswork business.
Prometheus + Alloy + Loki: the core
At the core of any monitoring solution, you need a way to collect, store, and organise the metrics and logs from the different physical or virtual machines and applications running in your homelab. The Prometheus + Alloy + Loki combination gives you a cohesive, open-source stack that collects both system metrics and logs with minimal overhead and scales from a few hosts to entire clusters of applications.
By using the stack, you're solving the following key issues in your homelab:
- Centralised, searchable logs from all hosts and containers (Loki, via Alloy)
- Rich, time-series metrics with flexible scraping and retention (Prometheus)
- Fast troubleshooting: correlate spikes in CPU with container errors in one view
Because all the key components of the Prometheus monitoring stack run as lightweight binaries or containers, you get an efficient platform that grows by scaling components horizontally. When bottlenecks appear in your monitoring solution, you can move Loki to distributed mode or use Prometheus remote write, depending on where the strain shows up first.
Beyond growing the logging infrastructure, the stack's architecture also gives you a simple integration pattern for onboarding new hosts you want to collect metrics or logs from: run the Node exporter container on the host to expose its CPU, memory, disk, and network metrics. The diagram below shows where the node exporter sits in the solution.

If you want to read over the documentation, you can find the key links below:
- Prometheus official documentation.
- Node exporter official documentation here.
- Alloy + Loki official documentation.
With all these data collectors and agents, we still need a way to query, visualise, and generate useful alerts from all this metric and log data. Later in the post, we'll discuss how we can use Grafana as another open-source dashboard solution for our data.
Prometheus + Alloy + Loki: Stack setup
When running the stack, you have two primary ways to run its components: Docker, or systemd with the native application binaries. If you want the simplest setup with sandboxing between applications, Docker is the recommended route. If you need high throughput for log and metric data, systemd with the native binaries reduces overhead.
In the following guide, we will focus on setting up the application using Docker and Docker Compose. If you haven't set up Docker before, you can find a getting-started guide here.
Docker Compose stack
The core of our application will be defined in a single Docker Compose file that describes our Prometheus, Alloy, and Loki containers, which will collect all our application metrics and log data into a centralised location.
```yaml
version: "3.9"

services:
  loki:
    # ...

volumes:
  # ...
```
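The file above is abbreviated, so here's a minimal sketch of what it could look like in full. The image tags, ports, volume names, and config paths are assumptions to adjust for your setup; Alloy itself appears in the per-host exporter stack below, though you could also run a central instance alongside these services.

```yaml
version: "3.9"

services:
  loki:
    image: grafana/loki:latest
    command: -config.file=/etc/loki/local-config.yaml
    ports:
      - "3100:3100"                      # push/query API for agents and Grafana
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml:ro
      - loki-data:/loki
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"                      # Prometheus UI and API
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    restart: unless-stopped

volumes:
  loki-data:
  prometheus-data:
```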
We will also have a second Docker stack for the per-host agents, covering the metrics, log, and container exporters:
```yaml
version: "3.8"

services:
  cadvisor:
    # ...
  alloy:
    # ...

volumes:
  # ...
```
You'll generally want to deploy this exporter stack to every host you want to monitor, so each host ships its metrics and logs to the central collector services.
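Here's a rough sketch of that per-host stack with the node exporter and Alloy; the image tags, mounts, and command lines are assumptions, and cAdvisor (the third exporter) gets its own section later in this post.

```yaml
version: "3.8"

services:
  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"                      # endpoint Prometheus scrapes
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"       # point collectors at the host, not the container
      - "--path.sysfs=/host/sys"
      - "--path.rootfs=/rootfs"
    restart: unless-stopped

  alloy:
    image: grafana/alloy:latest
    command: run /etc/alloy/config.alloy
    volumes:
      - ./config.alloy:/etc/alloy/config.alloy:ro
      - /var/log:/var/log:ro             # host logs for Alloy to tail
    restart: unless-stopped
```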
Prometheus: Metrics collection and aggregator
Within the logging and monitoring stack, Prometheus provides the centralised store for metrics. Rather than having agents push data to it, Prometheus periodically scrapes HTTP endpoints exposed by your exporters and applications. Any exporter that serves metrics in the Prometheus exposition format will work, but in the following sections we will use the Node exporter as the concrete example.
The Prometheus config file links the application to our collectors and tunes other application parameters, such as the storage back end. The key section is `scrape_configs`:

```yaml
scrape_configs:
  # ...
```

You can find a complete reference for the config file here.
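As a working example, here's a minimal prometheus.yml sketch; the scrape interval and the self-scrape job are assumptions to adapt, and the node exporter and cAdvisor jobs are covered in their own sections below.

```yaml
global:
  scrape_interval: 30s          # how often Prometheus pulls metrics from targets
  evaluation_interval: 30s      # how often recording and alerting rules are evaluated

scrape_configs:
  # Prometheus scraping its own metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets:
          - localhost:9090
```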
When ingesting logs into your monitoring stack, Alloy provides the integration point for shipping host or application logs before they are consolidated into a queryable data source in Loki.
Node exporter: Metrics collection
For collecting metrics from nodes with the Prometheus stack, an exporter-based approach lets new nodes expose their metrics so that one (or more) Prometheus servers can scrape and aggregate them. The Node exporter controls which metric collectors are enabled via command-line flags on the exporter itself; how often the data is pulled is set by the scrape interval in your Prometheus configuration.
You can find a complete reference for the exporter's collectors and flags here.
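To make the flag-based configuration concrete, here's a hypothetical service definition that toggles a couple of collectors; the specific flags shown are illustrative rather than required.

```yaml
services:
  node-exporter:
    image: prom/node-exporter:latest
    command:
      - "--collector.systemd"            # opt in to an extra collector
      - "--no-collector.wifi"            # switch off one you don't need
      - "--web.listen-address=:9100"     # address and port Prometheus will scrape
```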
When running your exporter, make sure to update your Prometheus configuration so the new target is scraped:
```yaml
scrape_configs:
  - job_name: node-exporter
    static_configs:
      - targets:
          - node-exporter:9100
```
Loki: Log destination and data sources
Within the monitoring and logging stack, Loki is a horizontally scalable, highly available, multi-tenant log aggregation system that ingests logs via a push mechanism from agents on your nodes, rather than the pull model Prometheus uses for metrics. Loki stays lightweight at any log volume because it doesn't build a full-text index of log contents; instead, it indexes only a small set of metadata labels for each log stream, which you then use to query that stream.
The following Loki config file sets up an endpoint that your nodes can push logs to.
```yaml
auth_enabled: false

server:
  # ...

common:
  # ...

schema_config:
  # ...
```
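Filled in for a single-binary deployment, the config might look roughly like this; the port, storage paths, and schema date are assumptions, so check them against the Loki docs for the version you run.

```yaml
auth_enabled: false

server:
  http_listen_port: 3100            # endpoint agents push logs to

common:
  path_prefix: /loki
  replication_factor: 1
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
```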
Alloy: Log collector and pusher
For collecting logs from a node, Alloy provides an easy-to-use agent-based implementation. With Alloy, you run the application as a container or a binary to collect the desired logs from your system, which are then periodically shipped to your logging destination, such as Loki. With this agent-based setup, you get an easy-to-onboard logging solution for hosts, as no additional configuration in Loki is required to start receiving logs from a node.
In the configuration file below, you can see the simple syntax used to define the Loki endpoint to send log files to, as well as how you can configure what log files should be watched.
```alloy
logging {
  // ...
}

loki.process "alloy" {
  // ...
}

local.file_match "syslog" {
  // ...
}

loki.process "syslog" {
  stage.static_labels {
    // ...
  }
}

loki.source.file "syslog" {
  // ...
}
```
You can find reference documentation for the Alloy config file here.
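To show the overall shape of the file, here's a minimal sketch that tails the host syslog and pushes it to Loki; the file path, labels, and Loki URL are assumptions to adjust for your environment.

```alloy
// Find the log file(s) to tail.
local.file_match "syslog" {
  path_targets = [{ "__path__" = "/var/log/syslog" }]
}

// Tail the matched files and hand each line to the processing pipeline.
loki.source.file "syslog" {
  targets    = local.file_match.syslog.targets
  forward_to = [loki.process.syslog.receiver]
}

// Attach static labels so the stream is easy to query in Loki.
loki.process "syslog" {
  stage.static_labels {
    values = {
      job  = "syslog",
      host = "homelab-node-1",
    }
  }
  forward_to = [loki.write.default.receiver]
}

// Push the labelled stream to the Loki endpoint.
loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
}
```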
cAdvisor: container-aware metrics without the hassle
In many homelab environments, containers have become a great way to sandbox applications and run them in their own environments. However, a node exporter-based monitoring setup only sees the host: the very isolation that makes containers attractive hides what is happening inside each application. That's where cAdvisor comes in, a container-aware monitoring solution that makes it easy to inspect per-container CPU, memory, filesystem, and network stats, perfect for spotting noisy neighbours or memory leaks in your container applications.
To easily run the container in your logging and monitoring stack, add a new container definition to your Docker Compose file that exposes the container's UI and metrics endpoints on port 8080. The following definition also mounts multiple volumes into the container to allow cAdvisor to monitor the status of Docker volumes and the host file system, and to connect to the Docker socket to interact with the Docker service on the host.
Below is the cAdvisor container shown as its own stack; in practice it lives in the exporter Docker Compose stack we deployed earlier.
```yaml
version: '3.8'

services:
  cadvisor:
    image: google/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"                         # Expose cAdvisor UI and metrics
    volumes:
      - /:/rootfs:ro                        # Read-only root filesystem
      - /var/run:/var/run:ro                # Read-only Docker socket
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    restart: unless-stopped
```
After the cAdvisor container is running and exposing a metrics collection endpoint, your Prometheus scrape config needs to be updated to use the new address to pull metrics from the container.
```yaml
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
```
With Prometheus now pulling your container data, you can query and visualise it as needed to find out where your container issues might be hiding.
Prometheus + Alloy + Loki + cAdvisor: Pro tips
Now that we have gone through all the components and configurations in the stack, here are a few short tips to help when running each component.
- Keep retention sane on small disks (e.g., 15–30 days for Prometheus, plus Loki retention and chunk tuning).
- Label with intent: environment=lab, node role, and service names make queries much easier.
- Use service discovery (Docker/Kubernetes) to avoid brittle static targets; see the sketch after this list.
- On multi-node setups, run cAdvisor on each node with static or SD-based scrape configs.
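Building on the service-discovery tip, here's a rough Prometheus snippet that discovers containers through the Docker socket rather than hard-coded targets; the socket path and the relabelling rule are assumptions to adapt.

```yaml
scrape_configs:
  - job_name: docker-containers
    docker_sd_configs:
      - host: unix:///var/run/docker.sock   # discover targets from the local Docker daemon
    relabel_configs:
      # Use the container name as the instance label for readable dashboards
      - source_labels: [__meta_docker_container_name]
        regex: /(.*)
        target_label: instance
```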
Grafana: dashboards, alerts, and sharing
Now that you have data aggregated in Prometheus and Loki, you'll want a way to visualise and alert on it. This is where Grafana comes in, letting us build dashboards you'll actually check. Grafana also pulls data from multiple sources, giving you a holistic view of your environment.
Below, we will show you how easy it is to get Grafana up and running with Docker and how to start viewing data with prebuilt dashboards you can import from the community hub.
First, we will define a new Docker Compose file for our Grafana instance.
```yaml
version: "3.9"

services:
  # ...

volumes:
  # ...
```
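A minimal version of that file might look like this; the image tag, admin credentials, port, and provisioning path are assumptions you should change before relying on it.

```yaml
version: "3.9"

services:
  grafana:
    image: grafana/grafana-oss:latest
    ports:
      - "3000:3000"                           # Grafana web UI
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=changeme   # change this before exposing Grafana
    volumes:
      - grafana-data:/var/lib/grafana
      - ./provisioning/datasources:/etc/grafana/provisioning/datasources:ro
    restart: unless-stopped

volumes:
  grafana-data:
```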
One other file we will want to create is a datasources provisioning file (YAML), which tells Grafana which data sources to connect to. You can see this configuration below.
```yaml
apiVersion: 1

datasources:
  - name: Local Loki
    # ...
```
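Filled in for both back ends, the provisioning file might look like the sketch below; the names and URLs assume Grafana can reach the prometheus and loki containers over a shared Docker network.

```yaml
apiVersion: 1

datasources:
  - name: Local Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true

  - name: Local Loki
    type: loki
    access: proxy
    url: http://loki:3100
```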
From this configuration, you can see we are using the built-in integrations for both Prometheus and Loki, highlighting how easy it is to point Grafana at different data sources. Metrics from exporters such as cAdvisor, or probes from the Blackbox exporter (ping and HTTP checks), flow in through the same Prometheus data source, so they're available for dashboards and alerting too.
Now that we have a data source configured in Grafana, you can use the Grafana explorer to verify that you are pulling data from your connected data sources.

With Grafana, you can also take advantage of community-built dashboards and get up and running very quickly with common data sources like the Node exporter or Loki. We can use the following community dashboards.

Load these dashboard IDs when importing a dashboard in Grafana, and you'll be visualising your data sources in seconds before customising the panels to fit your needs.
Grafana: Pro tips
- Separate concerns: Run Grafana, data sources (Prometheus, Loki, etc.), and storage backends independently. Avoid co-locating persistent storage with Grafana.
- Use provisioning as code: Store dashboards, data sources, alert rules, and notifiers in version-controlled provisioning files. Immutable dashboards reduce drift and make rollbacks easy.
- Frontend load: Turn on panel lazy loading, reduce auto-refresh frequency (e.g., 15–30s), and turn off transform-heavy panels where not needed.
Performance and sizing for homelabs
Now that you have your monitoring and logging stack up and running, there are a few factors to consider when sizing and optimising your homelab setup. Start small: the stack scales well from small to large setups, and 1–2 vCPU with 2–4 GB RAM for Prometheus, Loki, and Grafana is plenty for most homelabs before you hit bottlenecks in metrics, log ingestion, or query performance.
When it comes to the storage of your logs and metrics data, you should consider the following:
- Put Prometheus and Loki data on SSDs for faster reads and writes.
- Cap retention and compress logs aggressively.
- Consider remote storage later (e.g., S3-compatible object storage such as MinIO) as data grows; see the sketch after this list.
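As a rough idea of what the remote-storage step looks like for Loki, the storage section can point at an S3-compatible endpoint; the MinIO hostname, credentials, and bucket name below are placeholders, and the exact keys depend on your Loki version.

```yaml
storage_config:
  aws:
    endpoint: minio:9000              # any S3-compatible endpoint (MinIO here)
    access_key_id: loki
    secret_access_key: supersecret    # placeholder credentials
    bucketnames: loki-chunks
    s3forcepathstyle: true            # path-style addressing, as MinIO expects
    insecure: true                    # plain HTTP inside the homelab network
```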
Make sure you also set up a robust backup regime for the stack's configs and dashboards. You can always collect more time-series data, but you can't easily recreate the tuning you've put into your alert rules and dashboards, so protect the work that surfaces the information you need in a format that works for you.
Common pitfalls to avoid
When running the logging and monitoring stack, avoid these common pitfalls to get the most out of your logs and metrics.
- Scraping everything at 5s intervals: this usually stems from "because it looks cool" and will cost you in the long run, with excess disk activity that wears through your disks faster than you'd expect.
- Letting logs go unstructured and unlabeled: this leaves you relying on inefficient fuzzy searches, turns your logs into unsearchable noise, and makes it hard to find what you're looking for.
- Single-node everything without exports: It can be easy to put everything on a single node and skip configuring exports or backups for your logging dashboards, alerts, and configurations. Plan for a simple restore and keep off-box backups of configurations.
Wrapping up
With Prometheus, Alloy, Loki, cAdvisor, and Grafana, you get fast, low-footprint observability that scales from a single-node homelab to a multi-node setup. Start simple, label smart, visualise what matters, and alert only on action-worthy signals.
If you're looking to read further on home lab hosting, check out some of our other posts: