This post gives you a detailed roadmap to learn Deep Learning and will help you land Deep Learning internships and full-time jobs within 6 months. It is practical and results-oriented, and follows a top-down approach. It is targeted at beginners strapped for time, as well as intermediate practitioners.
If you do MOOC after MOOC and dredge through the math and theory first, as most tutorials suggest, you won't build your first neural net for 3 months. You should be able to build one much sooner. This post follows a two-stage strategy:
- Gain a high-level idea of Deep Learning: do beginner-to-intermediate projects, and take courses and read theory that don't involve too much math.
- Focus - Building cool stuff over math and theory, plus getting a high-level overview of the Deep Learning landscape.
- Time - 3 months
- Dive deeper into Deep Learning: study the math and Machine Learning theory in detail, and take on ambitious projects that require quite a bit of theoretical know-how, with larger codebases and a lot more functionality.
- Focus - Heavy theory and bigger projects.
- Time - 3 months
- You know basic programming.
- Have a basic understanding of Calculus, Linear Algebra, and Probability.
- You’re willing to spend 20 hours/week.
- Do the Python Crash Course. This is an awesome resource for Python beginners and is very hands-on and project driven. It is brief and to the point. Loads of fun with lots of best practices and gems. Pretty much covers all the concepts required for building things with Deep Learning.
- Read the PEP 8 style guide. It is important to know how to write and style Python code correctly.
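As a tiny illustration of what PEP 8 buys you, here is the same throwaway function before and after styling (the function itself is just a made-up example):

```python
# Before: cramped spacing, camelCase name, cryptic parameters.
def computeBMI(w,h):return w/h**2


# After: snake_case names, spaces around operators, a docstring.
def compute_bmi(weight_kg, height_m):
    """Return body mass index given weight in kg and height in metres."""
    return weight_kg / height_m ** 2
```

Both versions compute the same thing; only the second one is pleasant to read six months later.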
Important packages to get comfortable with:
- Data wrangling
- os (for file management)
- json (quite a lot of datasets are in the JSON format)
- argparse (for writing neat scripts)
- pandas (for working with CSV and other tabular data)
- Science Stack (NumPy, SciPy, Matplotlib)
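As a toy illustration of how a few of these fit together in one small script (the filename, flag, and record fields below are made up for the example):

```python
import argparse
import json
import os


def summarize(path, records):
    """Return a one-line summary string for a list of records."""
    return f"{os.path.basename(path)}: {len(records)} records"


def build_parser():
    """Build the command-line interface for the script."""
    parser = argparse.ArgumentParser(description="Summarise a JSON dataset.")
    parser.add_argument("--data", required=True, help="Path to a JSON file.")
    return parser


# Usage from a shell would be: python summarize.py --data train.json
# Here we pass a stand-in argv and stand-in file contents to keep it self-contained.
args = build_parser().parse_args(["--data", "train.json"])
records = json.loads('[{"label": 0}, {"label": 1}]')
print(summarize(args.data, records))
```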
Time: 1 week
- It is imperative to have a good understanding of Machine Learning before diving into Deep Learning.
- Do Andrew Ng’s Machine Learning course on Coursera up to week 8. Weeks 9, 10, and 11 are not as important as the first 8: the first 8 weeks cover the necessary theory, while weeks 9, 10, and 11 are application oriented. Although the course schedule states that it takes 8 weeks to complete, it is quite possible to finish the content in 4-6 weeks. The course is quite good; however, the programming assignments are in Octave. As a Machine Learning Engineer / Researcher, you will hardly ever use Octave, and will definitely do most of your work in Python.
- To practice programming in Python, do Jake Vanderplas’s Machine Learning Notebooks. They contain a good high-level overview of Machine Learning and sufficient Python exercises and introduce you to scikit-learn, a very popular Machine Learning library. You will need to install Jupyter Lab / Notebook for this and you can find the installation and usage instructions here.
- At this point, you should have a good theoretical and practical understanding of Machine Learning. It’s time to test your skills. Do the Titanic Classification challenge on Kaggle and play around with the data and plug and play different Machine Learning models. This is a great platform to apply what you have learned.
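To bridge the course theory and Python practice: the logistic-regression-by-gradient-descent recipe from the early weeks of Andrew Ng's course fits in a few lines of NumPy. This is a toy sketch on synthetic data, not a tuned implementation:

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_regression(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)   # gradient of the mean log-loss w.r.t. w
        grad_b = np.mean(p - y)           # gradient w.r.t. the bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy linearly separable data: the label is 1 when the feature sum is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

w, b = train_logistic_regression(X, y)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
```

On Kaggle's Titanic data you would swap the synthetic `X` and `y` for real features and labels (or just use scikit-learn's `LogisticRegression`), but the underlying update rule is the same one derived in the course.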
Time: 4-6 weeks
- It is important to have access to a GPU to run any Deep Learning experiments. Google Colaboratory offers free GPU access. However, Colab may not be the best GPU solution: it is known to disconnect often and can be laggy. There are several guides for building your own GPU rig, but ultimately this is a distraction and will slow you down. Cloud providers like AWS offer GPU instances, but they are complex to set up and manage, which also becomes a distraction. Fully managed services like Gradient° (which also includes affordable GPUs) eliminate this headache so you can focus all your energy on becoming a Deep Learning developer.
- Do fast.ai V1, Practical Deep Learning For Coders. This is a very good course that covers the basics and focuses on implementation over theory.
- Start reading research papers. This is a good list of a few early and important papers in Deep Learning; they cover the fundamentals.
- Pick one of PyTorch or TensorFlow and start building things. Get very comfortable with the framework you choose. Build extensive experience with one so that you become very versatile and know its ins and outs.
- PyTorch: Easy to experiment with, and it won’t take you long to jump in. Has a good number of tutorials and lots of community support (my go-to library). You can control almost every aspect of the pipeline, and it is very flexible. fast.ai V1 will give you sufficient experience in PyTorch.
- TensorFlow: Has a moderate learning curve and is difficult to debug, but it has more features and tutorials than PyTorch and a very strong community.
- Keras: A lot can be done with Keras and it’s easy to learn; however, I’ve always found it to have too many black boxes and, at times, to be difficult to customize. But if you’re a beginner looking to build quick and simple neural nets, Keras is brilliant.
- Start doing projects in an area you’re interested in. Build a good profile. Areas include Object Detection, Segmentation, VQA, GANs, NLP, etc. Build applications and open-source them. If you’re in school, find professors and start doing research under them. In my experience, companies seem to value research papers and popular open-source repositories almost equally.
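Whichever framework you choose, the core training loop is the same handful of steps: forward pass, compute the loss, backpropagate, update the parameters. Here is a minimal PyTorch sketch of that loop; the architecture and the random stand-in data are made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A tiny fully connected classifier; the layer sizes here are arbitrary.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random stand-in data; in a real project this comes from a DataLoader.
X = torch.randn(64, 4)
y = (X.sum(dim=1) > 0).long()

initial_loss = loss_fn(model(X), y).item()
for epoch in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(X), y)  # forward pass + loss
    loss.backward()              # backpropagation
    optimizer.step()             # gradient descent update
final_loss = loss_fn(model(X), y).item()
```

TensorFlow/Keras versions of the same loop look different on the surface (`model.fit`, `GradientTape`), but they perform exactly these steps under the hood.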
Time: 4-6 weeks
By now, you should:
- Have a good understanding of Deep Learning.
- Have 2-3 projects in Deep Learning.
- Know how to build Deep Learning models comfortably in a popular framework.
You can start applying for internships and jobs now, and this is sufficient. Most startups care about how well you can build and optimize a model and whether you have the basic theoretical knowledge. But to have a shot at the big companies, you need to delve deeper and gain a good understanding of the math and theory.
This is where things get interesting. You dive deeper into the theory and work on bigger and more ambitious projects.
Math is the bread and butter of Machine Learning and is very important in interviews. Make sure you understand the basics well.
- Linear Algebra: Do Ch. 2 of The Deep Learning book. You can use Gilbert Strang’s MIT OCW course as a reference.
- Calculus: The Matrix Calculus You Need For Deep Learning is a very good and relevant resource.
- Probability: Read more about Probability Theory and Statistics. Introduction to Probability, Statistics, and Random Processes by Hossein Pishro-Nik is brilliant; I highly recommend it over any MOOC or textbook. Solid theory with a focus on brevity, sufficient examples, and problems with solutions. Follow this with Ch. 3 of the Deep Learning book.
- Optimization: These Course notes from NYU are a very good read. Week 5 of Mathematics for Machine Learning on Coursera is a very good resource too. Do Ch. 4 of the Deep Learning book to solidify your understanding.
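One habit that ties the calculus reading to code: verify an analytic gradient numerically. For example, for f(x) = xᵀAx the gradient is (A + Aᵀ)x, which a quick finite-difference check in NumPy confirms:

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(3, 3))
x = rng.normal(size=3)


def f(v):
    return v @ A @ v  # f(x) = x^T A x


analytic = (A + A.T) @ x  # closed-form gradient of x^T A x

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(len(x)):
    e = np.zeros_like(x)
    e[i] = eps
    numeric[i] = (f(x + e) - f(x - e)) / (2 * eps)
```

The same gradient-checking trick is invaluable later when you hand-code backprop and want to know whether your derivatives are right.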
- Do Ch. 5 of the Deep Learning book. It’s a rich, condensed read. 40-50% of an ML/DL interview is usually on classical Machine Learning.
- Reference: Bishop - Pattern Recognition and Machine Learning (Be warned, this is a difficult text!)
- Do the Deep Learning Specialization on Coursera. There are 5 courses:
- Neural Networks and Deep Learning: Goes deeper into the subject and will be a good continuation from fast.ai V1.
- Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization: This is probably the most important of the courses and covers topics frequently asked about in interviews (BatchNorm, Dropout, regularization, etc.).
- Structuring Machine Learning Projects: This will teach you how to structure an ML project and gives you practical tips. (Can be skipped and done later if strapped for time.)
- Convolutional Neural Networks: This course explores the theory and practical applications of CNNs in depth.
- Sequence Models: Explores natural language models (LSTMs, GRUs etc) and NLP, NLU and NMT.
- Continue working on bigger and more ambitious projects in Deep Learning. Push your projects to GitHub and have an active GitHub profile.
- A good way to learn more about Deep Learning is to reimplement a paper. Reimplementing a popular paper (from a big lab like FAIR, DeepMind, Google AI etc) will give you very good experience.
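Many of the interview staples from the second course come down to small changes in the parameter update rule. For instance, L2 regularization (weight decay) just adds a λ·w term to the gradient before the SGD step. A minimal NumPy sketch (the values of `lr` and `weight_decay` are arbitrary):

```python
import numpy as np


def sgd_step(w, grad, lr=0.01, weight_decay=1e-4):
    """One SGD step with L2 regularization folded into the gradient."""
    return w - lr * (grad + weight_decay * w)


w = np.ones(3)
w_new = sgd_step(w, grad=np.zeros(3))
# Even with a zero data gradient, the weights shrink slightly toward zero --
# which is exactly the "decay" in weight decay.
```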
Time: 3 months
At this stage, you should have a good theoretical understanding and sufficient experience in Deep Learning. You can start applying to better roles and opportunities.
What to do next?
- If you’re adventurous, read Bishop’s Pattern Recognition and Machine Learning to gain a very good understanding of Machine Learning.
- Read the rest of the Deep Learning book (Ch. 6 - Ch. 12 cover the relevant bits)
- Go through the PyTorch or TensorFlow source code to see how they’ve implemented basic functionality. Keras’ source code and structure are very simple, so you can use that as a starting point.
- CS231n’s assignments are pretty good. The best way to understand Dropout, BatchNorm, and backprop is by coding them in NumPy!
- In my experience, interviews = Data Structures and Algorithms + Math + Machine Learning + Deep Learning. Beyond the Data Structures and Algorithms portion, a rough breakdown would be: Math = 40%, Classical Machine Learning = 30%, Deep Learning = 30%.
- Real world experience will teach you loads. Do remote gigs (AngelList is an awesome resource) or deploy a Machine Learning model like this: https://platerecognizer.com/
- Jupyter Lab/Notebook is very good for experimentation and debugging, but it has its cons. Prefer a standard text editor/IDE (Sublime Text, Atom, PyCharm) over Jupyter Notebook: it’s faster and helps you write good, reproducible code.
- Keep up to date with research. Research in Deep Learning moves very fast, and to push the accuracy of your models you will need to keep up with it. Popular conferences include:
- Computer Vision: CVPR, ICCV, ECCV, BMVC.
- Machine Learning and Reinforcement Learning (Theoretical): NeurIPS, ICML, ICLR
- NLP: ACL, EMNLP, NAACL
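To make the CS231n suggestion above concrete: inverted dropout is only a few lines of NumPy. This is an illustrative sketch (train-time version; at test time the layer is the identity), not the assignment's exact API:

```python
import numpy as np


def dropout_forward(x, p=0.5, train=True, rng=None):
    """Inverted dropout: zero activations with probability p, rescale the rest."""
    if not train or p == 0.0:
        return x  # at test time, dropout is a no-op
    if rng is None:
        rng = np.random.default_rng()
    # Dividing by (1 - p) keeps the expected activation equal to x,
    # so no rescaling is needed at test time.
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask


x = np.ones((4, 1000))
out = dropout_forward(x, p=0.5, rng=np.random.default_rng(0))
# Roughly half the units are zeroed; the survivors are scaled by 2,
# so the mean activation stays close to 1.
```

Writing BatchNorm's forward and backward passes the same way is a similarly sized exercise, and doing both by hand is the fastest route to actually understanding them.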