Prometheus ZFS Exporter on Tailnet
When monitoring ZFS storage systems with Prometheus, protecting sensitive filesystem metrics is crucial. ZFS Exporter provides detailed pool health, dataset statistics, and performance metrics that should never be exposed to the public internet. This guide demonstrates how to configure ZFS Exporter to be accessible only through your Tailscale network (tailnet).
The Security Challenge
By default, ZFS Exporter listens on port 9134 and serves comprehensive ZFS metrics to any client that can reach that port. This creates significant security risks:
- Storage intelligence leak: ZFS metrics reveal pool layouts, capacity planning, and usage patterns
- Performance data exposure: I/O statistics and bottlenecks become visible to attackers
- Infrastructure mapping: Dataset names and structures expose organizational data
- Compliance violations: Storage metrics often contain sensitive operational information
Our Solution: Tailnet-Only Access
- All ZFS servers are on the same tailnet
- zfs_exporter should be accessible only via tailnet
- zfs_exporter should not be accessible from the public internet
This approach provides:
- Zero-trust networking: Only authenticated devices on your tailnet can access ZFS metrics
- Encrypted traffic: All communication goes through Tailscale’s encrypted tunnels
- Simplified firewall rules: Clear allow/deny rules based on Tailscale’s IP range
- Storage security: Critical filesystem data remains private
Prerequisites
Before proceeding, ensure:
- ZFS is installed and configured on your Ubuntu system
- Tailscale is installed and your server is connected to your tailnet
- You have root access for installation and firewall configuration
The Installation Script
Here’s our complete setup script that handles installation, service configuration, and firewall rules:
#!/bin/bash
set -e
# Define version
VERSION="2.3.8"
echo "[*] Installing zfs_exporter v$VERSION..."
# Download and install zfs_exporter
cd /tmp
wget -q https://github.com/pdf/zfs_exporter/releases/download/v$VERSION/zfs_exporter-$VERSION.linux-amd64.tar.gz
tar xfz zfs_exporter-$VERSION.linux-amd64.tar.gz
cp zfs_exporter-$VERSION.linux-amd64/zfs_exporter /usr/local/bin/
chmod +x /usr/local/bin/zfs_exporter
echo "[*] Creating systemd service..."
cat <<EOF | tee /etc/systemd/system/zfs_exporter.service
[Unit]
Description=ZFS Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/zfs_exporter
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reexec
systemctl daemon-reload
systemctl enable zfs_exporter
systemctl start zfs_exporter
echo "[*] Configuring UFW..."
# Allow SSH explicitly (precaution)
ufw allow OpenSSH comment "Allow SSH"
# Set default policies
ufw default allow incoming
ufw default allow outgoing
# Block zfs_exporter from everywhere EXCEPT Tailnet
ufw allow from 100.64.0.0/10 to any port 9134 comment "Allow zfs_exporter from Tailnet"
ufw deny 9134 comment "Block zfs_exporter from outside Tailnet"
# Enable UFW if inactive
if ufw status | grep -q "Status: inactive"; then
    echo "[*] Enabling UFW firewall with default allow policy..."
    ufw --force enable
else
    echo "[*] UFW already active."
fi
ufw reload
echo "[✓] Final UFW status:"
ufw status verbose
echo "[✓] zfs_exporter service status:"
systemctl status zfs_exporter --no-pager
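To use it, save the script to a file (the name below is arbitrary) and run it as root:
chmod +x setup-zfs-exporter.sh
sudo ./setup-zfs-exporter.sh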
Key Security Features
1. Tailscale IP Range Restriction
The script allows 100.64.0.0/10, the carrier-grade NAT range from which Tailscale assigns tailnet addresses (100.x.y.z). This ensures only devices authenticated to your tailnet can access the ZFS metrics.
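You can confirm the address your node was assigned with the Tailscale CLI; it should fall inside that range:
tailscale ip -4
The output is the node's 100.x.y.z tailnet address, which is what Prometheus will connect to.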
2. Explicit Deny Rule
The ufw deny 9134 rule provides defense-in-depth by explicitly blocking access to the ZFS Exporter port from anywhere else. Because UFW evaluates rules in order and the tailnet allow rule is added first, tailnet traffic is matched before the deny.
3. SSH Protection
The script preserves SSH access as a safety measure, preventing you from locking yourself out during firewall configuration.
4. Root Permissions
ZFS Exporter runs as root to access ZFS kernel modules and gather comprehensive pool statistics.
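If you want to tighten the unit while keeping it running as root, a few standard systemd hardening directives can be added to the [Service] section. This is an optional sketch, not part of the script above; verify that metrics still populate after applying it:
# Optional additions to [Service] in zfs_exporter.service
ProtectSystem=full      # mount /usr, /boot and /etc read-only for the service
ProtectHome=true        # hide /home from the service
PrivateTmp=true         # give the service its own private /tmp
NoNewPrivileges=true    # the exporter never needs to gain extra privileges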
Verification Steps
After running the script, verify your setup:
Check ZFS Exporter Status
systemctl status zfs_exporter
Verify Firewall Rules
ufw status verbose
You should see rules similar to:
9134 ALLOW 100.64.0.0/10 # Allow zfs_exporter from Tailnet
9134 DENY Anywhere # Block zfs_exporter from outside Tailnet
Test Access
From another machine on your tailnet:
curl http://[TAILSCALE_IP]:9134/metrics
From outside the tailnet, this should fail or timeout.
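To make the negative test quick, cap the connection time and target the server's public (non-Tailscale) address; PUBLIC_IP is a placeholder:
curl --connect-timeout 5 http://PUBLIC_IP:9134/metrics
This should time out rather than be refused, since ufw deny drops packets instead of rejecting them.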
Verify ZFS Metrics
Check that ZFS metrics are being exported correctly:
curl -s http://localhost:9134/metrics | grep zfs_pool
You should see metrics like:
zfs_pool_allocated_bytes{pool="tank"} 1.23456789e+12
zfs_pool_size_bytes{pool="tank"} 2.34567890e+12
zfs_pool_free_bytes{pool="tank"} 1.11111111e+12
Available ZFS Metrics
ZFS Exporter provides comprehensive metrics including:
- Pool Health: Status, errors, and resilience information
- Capacity Metrics: Used, free, and total space per pool and dataset
- Performance Stats: I/O operations, bandwidth, and latency
- Dataset Information: Compression ratios, deduplication stats
- Error Counters: Read, write, and checksum errors
- Scrub Status: Last scrub time and error counts
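The exact set of metric names depends on the exporter version and flags; you can enumerate what your instance actually exposes with a quick one-liner:
curl -s http://localhost:9134/metrics | grep '^zfs_' | sed 's/[{ ].*//' | sort -u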
Prometheus Configuration
Configure your Prometheus server to scrape ZFS metrics over the tailnet:
scrape_configs:
  - job_name: 'zfs_exporter_tailnet'
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets/zfs_exporter.json'
        refresh_interval: 30s
    scrape_interval: 30s
    scrape_timeout: 10s
Create the /etc/prometheus/targets/zfs_exporter.json file with your ZFS servers. The hostnames below are assumed to resolve to tailnet addresses (for example via Tailscale MagicDNS); you can also list the 100.x.y.z addresses directly:
[
{
"targets": ["storage1:9134"],
"labels": {
"instance": "storage1",
"datacenter": "home"
}
},
{
"targets": ["storage2:9134"],
"labels": {
"instance": "storage2",
"datacenter": "home"
}
},
{
"targets": ["backup-server:9134"],
"labels": {
"instance": "backup-server",
"datacenter": "home"
}
}
]
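Before reloading Prometheus, it's worth validating the configuration; promtool ships with Prometheus (adjust the config path to your installation). The HTTP reload endpoint only works if Prometheus was started with --web.enable-lifecycle; otherwise restart the service:
promtool check config /etc/prometheus/prometheus.yml
curl -X POST http://localhost:9090/-/reload   # requires --web.enable-lifecycle
# or: systemctl restart prometheus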
Useful Grafana Queries
Here are some essential PromQL queries for monitoring ZFS:
Pool Capacity
(zfs_pool_allocated_bytes / zfs_pool_size_bytes) * 100
Pool Health Status
zfs_pool_health
I/O Operations Rate
rate(zfs_pool_io_operations_total[5m])
Error Rate
rate(zfs_pool_errors_total[5m])
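If you also want a rough capacity forecast, predict_linear can extrapolate the free-space trend. This example flags pools expected to run out of space within 7 days based on the last 6 hours of samples; both windows are arbitrary starting points:
predict_linear(zfs_pool_free_bytes[6h], 7 * 24 * 3600) < 0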
Troubleshooting
Can’t Access Metrics
- Verify Tailscale is running on both machines
- Check firewall rules: ufw status
- Confirm ZFS Exporter is listening: ss -tlnp | grep 9134
- Ensure ZFS is properly loaded: lsmod | grep zfs
No ZFS Metrics
- Check if ZFS pools exist: zpool list
- Verify ZFS kernel modules: modprobe zfs
- Check ZFS Exporter logs: journalctl -u zfs_exporter -f
Permission Issues
ZFS Exporter needs root (or equivalent privileges) to query ZFS through the zfs and zpool commands, which is why the service above runs as root. Simply adding a non-root service user to the sudo group does not help, because the exporter does not invoke those commands through sudo; if you must run it unprivileged, you would need to delegate read access to that user and verify that metrics are still populated.
Firewall Issues
If you need to reset UFW rules:
ufw --force reset
Then re-run the configuration portion of the script.
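The relevant portion is the UFW block, which you can re-apply by hand:
ufw allow OpenSSH comment "Allow SSH"
ufw default allow incoming
ufw default allow outgoing
ufw allow from 100.64.0.0/10 to any port 9134 comment "Allow zfs_exporter from Tailnet"
ufw deny 9134 comment "Block zfs_exporter from outside Tailnet"
ufw --force enable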
Monitoring Best Practices
Alert Rules
Consider these Prometheus alert rules for ZFS monitoring:
groups:
  - name: zfs.rules
    rules:
      - alert: ZpoolUnhealthy
        expr: zfs_pool_health > 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "ZFS pool on {{ $labels.instance }} is not healthy ({{ $labels.pool }})"
          description: "Zpool {{ $labels.pool }} is in an unhealthy state. Check `zpool status`."
      - alert: ZpoolReadOnly
        expr: zfs_pool_read_only > 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "ZFS pool on {{ $labels.instance }} is in read-only mode ({{ $labels.pool }})"
          description: "Zpool {{ $labels.pool }} has gone read-only, which usually indicates a serious pool error. Check `zpool status`."
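A capacity alert pairs naturally with these health checks. The sketch below reuses the pool capacity expression from the Grafana section and would be appended under the same rules: list; the 80% threshold and 15m window are arbitrary starting points:
      - alert: ZpoolCapacityHigh
        expr: (zfs_pool_allocated_bytes / zfs_pool_size_bytes) * 100 > 80
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "ZFS pool {{ $labels.pool }} on {{ $labels.instance }} is over 80% full"
          description: "Zpool {{ $labels.pool }} is above 80% capacity. Consider freeing space or expanding the pool."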
Regular Monitoring
- Monitor pool capacity trends
- Set up alerts for degraded pools
- Track scrub completion and error rates
- Monitor I/O performance metrics
Conclusion
This setup provides a secure, comprehensive way to monitor your ZFS storage infrastructure with Prometheus while keeping sensitive filesystem metrics away from the public internet. The combination of Tailscale’s zero-trust networking and properly configured firewall rules ensures your storage monitoring data remains private and secure.
ZFS Exporter on a tailnet gives you peace of mind that your storage metrics are protected while still providing the detailed monitoring capabilities needed for proper storage management.
This post was enhanced with assistance from Claude.