Coronavirus COVID-19: The Data We Have, and How We Can Use It Introduction: Many of us currently feel trapped inside the confines of our homes, stuck between four walls and afraid to venture out into open spaces for fear of spreading COVID-19.

Data Science Introduction to Naive Bayes: A Probability-Based Classification Algorithm Imagine this: an electricity operator would like to supply specific units of electric current to various factory divisions based on their past trends of power consumption. To simplify the process,

Series: Ensemble Methods Gradient Boosting In Classification: Not a Black Box Anymore! Machine learning algorithms require more than just fitting models and making predictions to improve accuracy. Most winning models in the industry or in competitions have been using Ensemble Techniques or

Machine Learning How To Implement Support Vector Machine With Scikit-Learn In this tutorial we'll cover: an introduction to the support vector machine algorithm, and implementing SVM using Python and Sklearn. So, let's get started! Introduction to

Data Science Implementing The Levenshtein Distance for Word Autocompletion and Autocorrection The Levenshtein distance is a text similarity measure that compares two words and returns a numeric value representing the distance between them. The distance reflects the total number of single-character

Data Science Measuring Text Similarity Using the Levenshtein Distance In word processing or text chat applications, users commonly make unintended spelling mistakes. It could be as simple as writing "helo" (single "l") rather than "hello". Luckily,
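The distance described in the two Levenshtein entries above can be illustrated with a short sketch. This is not the code from either article; it is a minimal, standard dynamic-programming version, and the function name `levenshtein` is chosen here only for illustration. It counts the fewest single-character insertions, deletions, and substitutions needed to turn one word into another.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # prev[j] is the distance between the previously processed prefix of a
    # and the first j characters of b (initially: empty prefix of a).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of a into an empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("helo", "hello"))  # 1: a single inserted "l"
```

Keeping only two rows of the table at a time holds memory use proportional to the length of the second word, which is usually enough when only the distance itself is needed.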

Data Science Anomaly Detection Using Isolation Forest in Python From bank fraud to preventative machine maintenance, anomaly detection is an incredibly useful and common application of machine learning. The isolation forest algorithm is a simple yet powerful choice to

Series: Ensemble Methods A Guide to AdaBoost: Boosting To Save The Day Today, machine learning is the basis of major innovations and promises to continue enabling companies to make the best decisions through accurate predictions. But what happens when the error susceptibility

Series: Ensemble Methods A Guide To Random Forests: Consolidating Decision Trees The Random Forest algorithm is one of the most popular machine learning algorithms that is used for both classification and regression. The ability to perform both tasks makes it unique,

Series: Ensemble Methods An Introduction to Decision Trees Decision Trees are the foundation for many classical machine learning algorithms like Random Forests, Bagging, and Boosted Decision Trees. They were first proposed by Leo Breiman, a statistician at the

Series: Ensemble Methods Introduction to Bagging and Ensemble Methods The bias-variance trade-off is a challenge we all face while training machine learning algorithms. Bagging is a powerful ensemble method which helps to reduce variance, and by extension, prevent overfitting.

Quilt Reproducible machine learning with PyTorch and Quilt In this article, we'll use Quilt to transfer versioned training data to a remote machine. We'll start with the Berkeley Segmentation Dataset, package the dataset, then train a PyTorch model for super-resolution imaging.

Series: Optimization Intro to optimization in deep learning: Momentum, RMSProp and Adam In this post, we take a look at a problem that plagues the training of neural networks: pathological curvature.

Series Dimension Reduction - Autoencoders This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE; t-SNE; IsoMap; Autoencoders. (This post assumes you have a working knowledge

Machine Learning Dimension Reduction - IsoMap This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE; t-SNE; IsoMap; Autoencoders. (A Jupyter notebook with math and code (spark)

Machine Learning Dimension Reduction - t-SNE This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE; t-SNE; IsoMap; Autoencoders. (A more mathematical notebook with code is available

Machine Learning Dimension Reduction - LLE This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE; t-SNE; IsoMap; Autoencoders. (A Jupyter notebook with math and code (python

Machine Learning Multi-Dimension Scaling (MDS) This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE (Coming Soon!); t-SNE (Coming Soon!); IsoMap (Coming Soon!); Autoencoders (Coming Soon!

Machine Learning Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA) This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE (Coming Soon!); t-SNE (Coming Soon!); IsoMap (Coming Soon!); Autoencoders (Coming Soon!

Machine Learning Understanding Dimension Reduction with Principal Component Analysis (PCA) This tutorial is from a 7-part series on Dimension Reduction: Understanding Dimension Reduction with Principal Component Analysis (PCA); Diving Deeper into Dimension Reduction with Independent Components Analysis (ICA); Multi-Dimension Scaling (MDS); LLE (Coming Soon!); t-SNE (Coming Soon!); IsoMap (Coming Soon!); Autoencoders (Coming Soon!

Data Science How to run Tableau on a Chromebook What is Tableau Desktop? Tableau Desktop is a business analytics solution that can visualize data and deliver insights from nearly any data source. It's built for collaboration and can handle

Data Science Jupyter notebooks the easy way! (with GPU support) 1. Create a Paperspace GPU machine You can choose any of our GPU types (GPU+/P5000/P6000). For this tutorial we are just going to pick the default Ubuntu 16.

Tutorial Unsupervised Politics: Clustering in Machine Learning When faced with large quantities of data that you need to make sense of, it can be difficult to know where to begin looking for interesting trends. Rather than trying to make specific predictions with the data, you might want to start with simply

Machine Learning What is Data Science? In the time that it takes you to read this article, about 26 million gigabytes of data will be produced. That's equivalent to about 2.34 billion minutes of standard definition video on iTunes or about 13 billion e-books. In a year, this would

Data Science Getting started with scikit-learn The Machine Learning field is growing at a tremendous pace. One of the most interesting aspects of this development is the community created around it. With a closer look, we can see that the ML community can be separated into several niches, interested in