Azure Machine Learning is a suite of machine learning tools to help data scientists and ML engineers accomplish machine learning tasks.
Today we're looking at Azure Notebooks, a product within the Azure Machine Learning toolset that lets you run an enterprise-class Jupyter notebook on an Azure VM or a shared Azure cluster.
In this blog post we'll review the strengths and weaknesses of Azure Notebooks and compare Azure Notebooks to Gradient Notebooks from Paperspace.
Let's begin!
Introduction to Azure Notebooks
Microsoft's effort to provide a full-stack machine learning platform is called Azure Machine Learning. Like Google Cloud's AI Platform and AWS's SageMaker, Azure Machine Learning is an effort by one of the public cloud providers to assemble a suite of tools for enterprise machine learning teams.
Notebooks are one of the primitives Azure offers as part of its Machine Learning offering, along with a drag-and-drop designer and automated machine learning UI.
In addition, Azure offers MLOps software to productionize ML experiments and deployments. This is similar to the rest of the Gradient product and a comparison will be made at a later date.
Like ML tools from the other public cloud providers, Azure Machine Learning is targeted toward enterprise users. Its pitch is speed and collaboration in an area (Jupyter notebooks) that is typically tricky to manage as a team.
Or as the Azure Machine Learning marketing page puts it:
Enterprise-grade machine learning service to build and deploy models faster
tl;dr
The biggest criticism of Azure's notebook offering is that, while it is certainly a fully featured implementation of a managed, hosted JupyterLab notebook, it is extraordinarily difficult to get up and running quickly, to pre-calculate what you'll be billed, to add collaborators, and to get help if needed via documentation or customer support.
Azure Machine Learning notebooks therefore occupy a similar niche to the notebooks offered by the other public cloud providers: great if you're already in the ecosystem and need a bunch of enterprise features (e.g. role-based access control), but not so great if you're using notebooks to explore hypotheses and need a place to get going right away.
In general, Azure notebooks are best for those who'd like to take advantage of $200 in starter credits from Microsoft, for those who are already entrenched in the Azure computing ecosystem and need enterprise features around compliance and SLAs, and for organizations where the IT department calls the shots on resource allocation and provisioning.
Meanwhile, Paperspace Gradient notebooks are best for those who'd like to run free CPU and GPU instances without a lot of startup time or hassle, those who'd like to launch notebooks directly from pre-built containers, and those who would like more freedom during the exploration stage of model R&D.
Feature Comparison
It is possible to run both Jupyter and JupyterLab from Azure ML Notebooks. There is also a basic read-only IDE that allows you to view but not write to a notebook.
Feature | Microsoft Azure Notebooks | Paperspace Gradient Notebooks |
---|---|---|
Cost | $200 free Azure credits for new users | Free CPU and GPU notebooks |
Resources | Any Azure instance | Any Paperspace instance |
Start from zero requirements | Credit card, GPU approval | Free CPU and GPU without credit card or approval |
Startup time | Compute requires a few minutes to initialize | Compute requires a few seconds to initialize |
Auto-shutdown | No (in development as of 02/2020) | Yes |
Jupyter Notebook Option | Yes | Yes |
JupyterLab Option | Yes | Yes |
Build from container | Yes | Yes |
Cost Comparison
New Azure customers currently receive $200 in credit for Azure Machine Learning after creating an account and entering credit card information. Credits may be applied to create compute instances.
An overview of spot prices for compute instances is as follows:
Instance Type | Paperspace Gradient Notebooks | Instance Type | Microsoft Azure Notebooks |
---|---|---|---|
Free (M4000) | $0.00/hr | M60 | $1.14/hr |
Free (P5000) | $0.00/hr | M60 x2 | $2.40/hr |
P4000* | $0.51/hr | M60 x4 | $4.81/hr |
P5000* | $0.78/hr | K80 | $0.90/hr |
P6000* | $1.10/hr | K80 x2 | $1.80/hr |
V100* | $2.30/hr | K80 x4 | $3.60/hr |
P5000 x4* | $3.12/hr | V100 | $3.06/hr |
P6000 x4* | $4.40/hr | V100 x2 | $6.12/hr |
--- | --- | V100 x4 | $13.46/hr |
*While Paperspace offers free GPUs with no subscription required, paid instances from Paperspace require a plan. Gradient subscription tiers are as follows:
Gradient Subscription Type | Cost | Details |
---|---|---|
Free | $0/mo | Free instances only; notebooks are public; limit 1 concurrent notebook; limit 12 hours max per session; 5GB persistent storage |
G1 (Individual) | $8/mo | Free and paid instances; private notebooks; limit 5 concurrent notebooks; unlimited session length; 200GB persistent storage |
G2 (Individual) | $24/mo | Free and paid instances; private notebooks; limit 10 concurrent notebooks; unlimited session length; 1TB persistent storage |
T1 (Team) | $12/user/mo | Free and paid instances; private notebooks; limit 10 concurrent notebooks; unlimited session length; 500GB persistent storage; private team collaboration; private managed cluster |
T2 (Team) | $49/user/mo | Free and paid instances; private notebooks; limit 50 concurrent notebooks; unlimited session length; 1TB persistent storage; private team collaboration; private managed cluster |
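
To make the pricing tables easier to compare, here is a rough sketch in Python that estimates a month of single-GPU usage on each platform. The rates and plan prices are copied from the tables above; the function names and the 40-hour usage figure are hypothetical. The sketch assumes the $200 Azure credit applies entirely to compute and ignores storage, networking, and taxes, so treat the output as a ballpark rather than a bill.

```python
# Hourly rates and plan prices copied from the tables above (USD).
AZURE_HOURLY = {"K80": 0.90, "M60": 1.14, "V100": 3.06}
GRADIENT_HOURLY = {"P4000": 0.51, "P5000": 0.78, "P6000": 1.10, "V100": 2.30}
GRADIENT_PLAN_MONTHLY = {"G1": 8.00, "G2": 24.00}
AZURE_STARTER_CREDIT = 200.00


def gradient_monthly_cost(instance: str, hours: float, plan: str) -> float:
    """Subscription fee plus metered compute for one month on Gradient."""
    return GRADIENT_PLAN_MONTHLY[plan] + GRADIENT_HOURLY[instance] * hours


def azure_monthly_cost(instance: str, hours: float, credit: float = 0.0) -> float:
    """Metered compute for one month on Azure, less any remaining credit.

    Assumption: the credit offsets compute charges one-for-one.
    """
    return max(0.0, AZURE_HOURLY[instance] * hours - credit)


def azure_credit_runway_hours(instance: str, credit: float = AZURE_STARTER_CREDIT) -> float:
    """Hours of compute the starter credit covers at the listed rate."""
    return credit / AZURE_HOURLY[instance]


if __name__ == "__main__":
    hours = 40  # hypothetical monthly usage
    print(f"Gradient V100 on G1: ${gradient_monthly_cost('V100', hours, 'G1'):.2f}")
    print(f"Azure V100, no credit: ${azure_monthly_cost('V100', hours):.2f}")
    print(f"Azure K80 credit runway: {azure_credit_runway_hours('K80'):.0f} hours")
```

At 40 hours a month, this works out to $100.00 on Gradient (V100 on the G1 plan) and $122.40 on Azure (V100) before credits. With the starter credit applied, the Azure figure drops to zero until the credit runs out.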
Getting Started
Setting up a Jupyter Notebook in Azure
Getting started in a notebook in Azure takes a large number of steps:
- Create an Azure account (link)
- Once you've created an account, visit the Azure portal (link)
- Navigate to the Machine Learning service (link)
- Create a new Machine Learning workspace and specify subscription tier, resource group, and any other values that you might need such as container registry
- Once your new workspace is deployed, visit the resource and select Launch Studio
- From the studio view, select from the sidebar Author > Notebooks > Create
- Once you create a file, you will need to select Compute > New Compute in order to create an instance to run your notebook against. GPU instances require you to request additional quota from Azure.
- NOTE: Azure currently offers $200 in credits for new accounts
Setting up a Jupyter Notebook in Paperspace Gradient
To get started with a notebook in Gradient:
- Create a Paperspace account (link)
- Navigate to Gradient > Notebooks and select Create Notebook
- Enter a name for the notebook, choose a runtime (optional), and select an instance
- If you've selected a free CPU or free GPU instance, select Start Notebook and that's it! (Paid instances require a credit card.)
- NOTE: Paperspace offers unlimited use of free-tier CPU and GPU-backed notebooks
Startup time
Any cloud provider will take a few moments to spin up a CPU or GPU instance. Azure Machine Learning takes about 3 minutes of provisioning to create your first resource, while Paperspace takes about 30 seconds.
If you want to use a GPU-backed notebook, Azure requires that you submit a resource request for additional resource types. Gradient does not have this requirement for Free CPUs or GPUs, but does have this requirement for paid tier resources.
Cognitive overhead
Getting the "lay of the land" in Azure is naturally more difficult than in Gradient due to the enterprise focus. This is nice if you need strict RBAC or compliance measures, but less nice if you're trying to start exploring in a notebook right away.
Auto-shutdown
When you create a Gradient notebook, you always specify an auto-shutdown interval. This prevents cost overruns and gives you peace of mind that you're never paying for a notebook that you're not actively using.
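To put a number on the cost-overrun point, here is a minimal Python sketch of what a forgotten notebook costs with and without a shutdown limit. It assumes auto-shutdown behaves as a hard cap on billed run time, which is a simplification; the function names and the 64-hour weekend scenario are hypothetical, and the $2.30/hr rate is the Gradient V100 price from the cost table.

```python
from typing import Optional


def billed_hours(hours_left_running: float,
                 auto_shutdown_hours: Optional[float] = None) -> float:
    """Hours actually billed for one session.

    Assumption: auto-shutdown stops the instance once the limit is reached,
    so billing never exceeds the configured interval.
    """
    if auto_shutdown_hours is None:
        return hours_left_running
    return min(hours_left_running, auto_shutdown_hours)


def session_cost(hourly_rate: float,
                 hours_left_running: float,
                 auto_shutdown_hours: Optional[float] = None) -> float:
    """Cost of one session at a flat hourly rate."""
    return hourly_rate * billed_hours(hours_left_running, auto_shutdown_hours)


if __name__ == "__main__":
    rate = 2.30    # Gradient V100 hourly rate from the cost table
    weekend = 64   # hypothetical: notebook left on from Friday 6pm to Monday 10am
    print(f"No auto-shutdown: ${session_cost(rate, weekend):.2f}")
    print(f"6-hour limit:     ${session_cost(rate, weekend, 6):.2f}")
```

In this scenario, the same forgotten notebook is billed for 64 hours without a limit and for 6 hours with one.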
JupyterLab
Both Azure and Gradient will give you a full version of JupyterLab to run your notebook. This is a plus for both products as some cloud notebook providers (such as Google Colab) instead give you a far more limited feature set.
Adding a card
Azure Machine Learning requires a credit card to use the product. Gradient only requires a credit card for paid instances. Free Gradient instances always include at least one GPU option (typically NVIDIA M4000 or P5000).
Queued cells
In our testing, Azure notebook cells often queued for an extended period before evaluating. Gradient notebook instances, meanwhile, are not preemptible, so a running notebook is not interrupted to free up capacity.
Conclusion
Overall, both Azure Machine Learning and Paperspace Gradient offer CPU and GPU-backed cloud notebooks in both native IDE and JupyterLab format.
For those who are already in the Azure ecosystem, it may make sense to apply existing compute and credits to Azure-based notebooks.
For others, Paperspace Gradient offers a suitable alternative with less complexity and overhead to get started.