Reliability is the foundation of trust for every API platform. At Nylas, we take that trust seriously, not just by claiming high uptime but by holding ourselves to one of the strictest reliability standards in the industry.
Developers building on APIs depend on predictable performance. A single failed call can break entire workflows. That’s why reliability isn’t just a metric at Nylas; it’s a core engineering principle that guides how we design, deploy, and measure success.
While many companies define their SLA in terms of uptime (whether their systems are technically reachable), Nylas measures it by API success rate:
Total successful API calls ÷ total API calls.
This approach captures the actual customer experience. If a request fails, it counts — no matter how short or isolated the issue. It’s a tougher standard, but one that keeps us focused on real reliability instead of surface-level uptime metrics.
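To make the definition concrete, here is a minimal sketch (illustrative only, not Nylas production code) of how a success-rate SLA could be computed from raw request counts:

```python
# Illustrative sketch: the SLA measured as success rate rather than uptime.
def api_success_rate(successful_calls: int, total_calls: int) -> float:
    """Return the success rate as a percentage of all API calls."""
    if total_calls == 0:
        return 100.0  # no traffic means no observed failures
    return 100.0 * successful_calls / total_calls

# Example: 9,999,100 successes out of 10,000,000 calls -> 99.991%
print(f"{api_success_rate(9_999_100, 10_000_000):.3f}%")
```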
Kubernetes excels at running stateless workloads, but we found that certain components, especially our databases and primary API gateway, perform better on dedicated infrastructure. Moving these services off Kubernetes gave us tighter control over performance, latency, and failure recovery.
When we migrated our API gateway and all databases to dedicated compute, we saw a 12% reduction in average request latency and significantly reduced load on CoreDNS/KubeDNS. Stateless workloads continue to thrive in Kubernetes, but high-throughput components benefit from the predictability of dedicated infrastructure where we control every variable, including failover timing and I/O profiles.
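As a rough illustration of how a migration like this might be evaluated, the sketch below compares mean and p99 latency across two sets of sampled request timings. The numbers and helpers are hypothetical, not actual Nylas measurements:

```python
# Illustrative sketch: comparing sampled gateway latencies before and after a
# move to dedicated infrastructure. All figures below are made up.
import statistics

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a list of latency samples (milliseconds)."""
    ordered = sorted(samples)
    index = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[index]

before_ms = [42.0, 45.1, 51.3, 48.7, 120.4, 44.9]  # hypothetical samples on Kubernetes
after_ms = [37.2, 39.8, 44.1, 42.5, 98.6, 40.0]    # hypothetical samples on dedicated compute

for label, samples in (("before", before_ms), ("after", after_ms)):
    print(label, f"mean={statistics.mean(samples):.1f}ms", f"p99={percentile(samples, 99):.1f}ms")
```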
Every new release goes through an automated canary phase before full rollout. We direct a small percentage of live traffic, ramping in stages from 5% to 50%, to the new version and monitor API success rates in real time.
We compare every metric against the previous release. If the new build shows even a 0.01% regression in success rate, the deployment halts automatically and rolls back within minutes. This measurable guardrail closes the feedback loop between code and reliability, ensuring that every change improves the developer experience rather than degrading it.
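A simplified sketch of that kind of canary gate might look like the following. The class and threshold names are illustrative, not the actual Nylas deployment tooling:

```python
# Illustrative canary gate: compare the canary's API success rate against the
# current release and signal a rollback if the regression exceeds 0.01%.
from dataclasses import dataclass

REGRESSION_THRESHOLD = 0.01  # percentage points, per the policy described above

@dataclass
class ReleaseStats:
    successful_calls: int
    total_calls: int

    @property
    def success_rate(self) -> float:
        return 100.0 * self.successful_calls / max(self.total_calls, 1)

def should_roll_back(baseline: ReleaseStats, canary: ReleaseStats) -> bool:
    """Return True when the canary regresses beyond the allowed threshold."""
    return (baseline.success_rate - canary.success_rate) > REGRESSION_THRESHOLD

# Example: baseline at 99.995% vs canary at 99.980% -> roll back
baseline = ReleaseStats(successful_calls=9_999_500, total_calls=10_000_000)
canary = ReleaseStats(successful_calls=99_980, total_calls=100_000)
print(should_roll_back(baseline, canary))  # True
```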
We continuously simulate real-world failures across our databases and API services to ensure the platform self-heals. Any single node or subsystem can be brought down and replaced without customer impact.
Our chaos testing program runs multiple times per week, injecting controlled failures such as database node loss, API rate-limit spikes, and regional disruptions. These drills have surfaced hidden dependencies early, allowing us to harden retry logic, add regional redundancy, and verify that all critical paths recover without manual intervention. The result is a platform designed to remain stable even when individual components fail.
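As one example of the kind of retry logic such drills exercise, here is a minimal, hypothetical sketch of a retry helper with exponential backoff and jitter (not Nylas's actual implementation):

```python
# Illustrative retry helper: exponential backoff with jitter around any
# callable that may fail transiently.
import random
import time

def call_with_retries(operation, max_attempts=5, base_delay=0.1, max_delay=2.0):
    """Invoke `operation`, retrying transient failures with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # retry budget exhausted; surface the failure to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(delay * random.uniform(0.5, 1.5))  # jitter avoids retry storms

# Usage (hypothetical client): call_with_retries(lambda: client.get("/v3/messages"))
```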
Through this disciplined engineering approach, Nylas evolved from 99.9% to 99.99% API reliability, a tenfold reduction in allowed downtime and a major milestone in customer trust.
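A quick back-of-the-envelope calculation shows what that tenfold tightening means in concrete terms (figures are illustrative, assuming a 30-day month):

```python
# Error budgets implied by the two targets: moving from 99.9% to 99.99%
# cuts the allowed failure budget tenfold.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

for target in (99.9, 99.99):
    budget_fraction = 1 - target / 100
    print(f"{target}% -> {budget_fraction:.2%} of calls may fail, "
          f"~{budget_fraction * MINUTES_PER_MONTH:.1f} min of downtime per month")
# 99.9%  -> 0.10% of calls may fail, ~43.2 min of downtime per month
# 99.99% -> 0.01% of calls may fail, ~4.3 min of downtime per month
```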
In practice, that means fewer failed requests, faster recovery from incidents, and a more consistent, developer-first experience across billions of API calls each month. For our customers, reliability isn’t an invisible feature; it’s a reason they can build confidently on our platform.
Our journey continues, but one thing is clear: when you measure what truly matters—successful outcomes for every API call—the right architectural choices naturally follow.
Reliability is never ‘done.’ It’s an ongoing commitment to the developers who build on us every day.

Interested in building on a platform that prioritizes reliability? Explore the Nylas API and see how we help developers ship faster with confidence.
Director, Site Reliability Engineering