Interviews · Reliability · 15 min read · April 8, 2026

Inside a Big Hosting Stack: Interview With a Senior SRE

We sat down with a Site Reliability Engineer who has spent years running infrastructure at scale. To speak candidly, they asked to stay anonymous. What follows is an edited conversation about uptime, speed, and what really happens when a host goes down.

What does "99.9% uptime" actually mean?

"People see '99.9%' and assume 'basically always up.' But 99.9% still allows almost nine hours of downtime a year. The number that matters more is how downtime is distributed โ€” nine hours in one outage during your launch is very different from a few seconds here and there. Always read what the SLA actually credits you, not just the headline figure."

Why do some hosts always feel slow?

"Usually it's oversold shared servers โ€” too many sites packed onto one box, all fighting for the same CPU and disk. The second big one is no server-level caching: every request hits PHP and the database. A well-tuned server with proper caching feels instant; an oversold one feels like wading through mud no matter what the customer does."

What's the biggest thing customers get wrong?

"They optimize the wrong layer. Someone will spend a weekend shaving 50ms off their CSS while their host adds 800ms of TTFB. You can't out-optimize a slow server. Fix the foundation first โ€” host and caching โ€” then worry about the front-end."

What happens behind the scenes during an outage?

"Alerts fire, an on-call engineer gets paged, and the first job is triage: what's the blast radius? Is it one node or a whole data center? The good teams have runbooks and can fail over automatically. The painful outages are the ones where something unexpected cascades and the automation doesn't cover it. That's when minutes feel like hours."

How do you think about backups?

"A backup you've never restored isn't a backup โ€” it's a hope. Test restores regularly. And keep your own copy. Don't rely solely on your host's backups; if your account gets suspended or compromised, you want an independent copy you control."

If you were buying hosting today, what would you look for?

"Server-level caching, NVMe storage, honest renewal pricing, and a status page with real history. A public, detailed status page tells you a host isn't hiding its incidents โ€” that transparency usually correlates with a team that takes reliability seriously."

Our takeaway

The themes line up with everything we test: caching and a fast origin beat front-end micro-tweaks, uptime numbers need context, and you should always keep your own backups. Want to act on it? Start with our hosting reviews, check any host's response time with our TTFB Checker, or monitor your own site with the Uptime Checker.