Infrastructure Monitoring: Keeping Your Systems Healthy

Infrastructure monitoring provides continuous visibility into the health, performance, and availability of your systems. Without it, you typically learn about problems only when users report them. With it, you can detect and diagnose issues early and, in many cases, remediate them automatically before users are affected.

What We Monitor

Infrastructure metrics: CPU utilisation, memory usage, disk I/O, network traffic — across all servers and managed services
Application metrics: Request rate, error rate, latency (the RED method: Rate, Errors, Duration)
Business metrics: Order rates, user sign-ups, payment success rates — metrics that indicate the health of your business, not just your infrastructure
Uptime / availability: External synthetic monitoring that checks your endpoints from outside your infrastructure, confirming what users actually experience
Database metrics: Query performance, connection counts, replication lag, storage utilisation
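
To make the RED method concrete, the following minimal sketch derives rate, error ratio, and duration percentiles from a batch of request records. The `RequestRecord` shape, the choice to count only 5xx responses as errors, and the nearest-rank percentile are assumptions made for illustration; a monitoring platform computes these figures for you and may define them differently.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class RequestRecord:
    """One handled request (hypothetical shape, for illustration only)."""
    duration_ms: float
    status_code: int


def percentile(sorted_values: Sequence[float], fraction: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def red_summary(records: Sequence[RequestRecord], window_seconds: float) -> Dict[str, float]:
    """Summarise Rate, Errors, and Duration for one observation window."""
    if not records or window_seconds <= 0:
        raise ValueError("need at least one record and a positive window")
    durations = sorted(r.duration_ms for r in records)
    # Assumption: only server-side failures (HTTP 5xx) count as errors.
    errors = sum(1 for r in records if r.status_code >= 500)
    return {
        "rate_per_second": len(records) / window_seconds,
        "error_ratio": errors / len(records),
        "p50_ms": percentile(durations, 0.50),
        "p95_ms": percentile(durations, 0.95),
    }
```

Percentiles are used here rather than an average because a small number of very slow requests can hide behind a healthy-looking mean.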

Monitoring Levels

Infrastructure level: VM/container CPU, memory, disk — are the machines healthy?
Application level: Are API endpoints responding within acceptable latency? What is the error rate?
User experience level: Real User Monitoring (RUM) — how fast do pages load for real users on real devices and networks?
Synthetic monitoring: Automated browser sessions that simulate user journeys and alert if they fail
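
As a rough illustration of an application-level or uptime check, the sketch below requests an endpoint, measures how long the response takes, and reports whether it met a latency budget. It is a plain HTTP probe, not the automated browser session described above, and the names `run_check` and `CheckResult`, the 2000 ms budget, and the 5 second timeout are all hypothetical. The fetch function and clock are injectable so the logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CheckResult:
    healthy: bool
    status_code: Optional[int]
    latency_ms: float
    reason: str


def http_get_status(url: str, timeout_s: float) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError but still carry a status code.
        return exc.code


def run_check(
    url: str,
    max_latency_ms: float = 2000.0,  # assumed budget, tune per endpoint
    timeout_s: float = 5.0,          # assumed timeout
    fetch_status: Callable[[str, float], int] = http_get_status,
    clock: Callable[[], float] = time.monotonic,
) -> CheckResult:
    """Probe one endpoint and classify the outcome."""
    started = clock()
    try:
        status = fetch_status(url, timeout_s)
    except OSError as exc:
        # Covers connection failures and timeouts (URLError is an OSError).
        elapsed_ms = (clock() - started) * 1000
        return CheckResult(False, None, elapsed_ms, f"request failed: {exc}")
    elapsed_ms = (clock() - started) * 1000
    if status >= 400:
        return CheckResult(False, status, elapsed_ms, f"unexpected status {status}")
    if elapsed_ms > max_latency_ms:
        return CheckResult(False, status, elapsed_ms, "response slower than budget")
    return CheckResult(True, status, elapsed_ms, "ok")
```

For a check like this to reflect what users experience, it would need to run on a schedule from somewhere outside the infrastructure it is probing.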

Tools We Use

Datadog: Full-stack monitoring — infrastructure, APM, log management, synthetics. Our default for complex systems.
Grafana + Prometheus: Open-source monitoring stack — highly customisable, no licensing cost
AWS CloudWatch: Native AWS monitoring — included with AWS services
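
For readers curious what a metric looks like on the wire, the sketch below renders a gauge in the plain-text format that Prometheus scrapes from its targets. The function name `render_gauge` and the metric and label names in the usage note are hypothetical, and the sketch covers gauges only. In practice an official client library such as `prometheus_client` would generate this output rather than hand-written code.

```python
from typing import Dict, List, Tuple


def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples: List[Tuple[Dict[str, str], float]]) -> str:
    """Render one gauge metric as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort labels so the output is stable between runs.
        label_text = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        selector = f"{{{label_text}}}" if label_text else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"
```

Calling `render_gauge("disk_used_ratio", "Fraction of disk space in use.", [({"host": "web-1"}, 0.42)])` produces a HELP line, a TYPE line, and one sample line, `disk_used_ratio{host="web-1"} 0.42`.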
