Reliability and performance at scale

Recent and upcoming releases are focused on improving the reliability and performance of GPU machines

3 years ago • 1 min read

By Daniel Kobran

The Paperspace cloud has grown massively over the past few years. We now support 600K+ users and are approaching 100M hours of GPU compute served.

As we’ve grown, we’ve hit some scaling snags along the way. We are fully aware that outages and bugs are not acceptable, especially as more and more of our user base is running in production.

We've already shipped some behind-the-scenes improvements to virtual machine health checks and alerting. We're seeing a drop in error rates, faster spin-up and spin-down times in Core and Gradient, and better kernel performance in Gradient Notebooks. In the Paperspace console, we've surpassed a 99.85% sustained crash-free session rate, and the trend is continuing upward.

We’re working around the clock to address the remaining reliability and performance issues head-on. Over the coming months, look out for impactful changes on issues big and small. We’ll be shipping improvements across hardware and software, including a rewrite of the billing engine, which has been a persistent thorn in our side for some time.

We'll be keeping these improvements coming behind the scenes over the next couple of releases. Thanks for your support as we scale the world's best cloud for accelerated computing.

💜 PS Engineering

Tags:
Announcement
