365 points • Hellr0x
100-Day Data Science Challenge!
One month ago I made this post about starting my curriculum for DS/ML and got lots of great advice, suggestions, and feedback. Over the past month I have not skipped a single day, and I plan to continue my streak for 100 days. I also made some changes to my "curriculum" and wanted to share some updates and feedback on my experience. There's tons of information and resources out there and it's really easy to get overwhelmed (which I did before I came up with this plan), so maybe this can help others organize better and get started.
Math:
 Linear Algebra:
 Udemy course: Become a Linear Algebra Master
 Book: Linear Algebra Done Right
 YouTube: Essence of linear algebra
I've mainly been doing exercises from the book, but the Udemy course helps explain some topics that seem confusing in the book. The 3Blue1Brown YT series is a great supplement because it helps visualize all the concepts, which is massive for understanding the topics and applications of linear algebra. I'm two-thirds of the way through the class and it already helps a lot with the statistics part, so it's a must-do if you have not learned linear algebra before.
 Statistical Learning:
 Book: An Introduction to Statistical Learning with Applications in R
 YouTube 1: Data Science Analytics
 YouTube 2: StatQuest
ITSL is a great introductory book and I'm halfway through. It's well explained, with great examples, lab work, and exercises. The book uses R, but as part of my Python practice I'm reproducing all the labs and exercises in Python. It's usually challenging, but I learn way more doing this. (If you need Python code for this book's labs, let me know and I can share.) The DSA YT channel follows ITSL chapter by chapter, so a great approach is to read the book, make notes, and watch their videos side by side. StatQuest is an alternative YT channel that explains ML concepts clearly. After I'm done with ITSL I plan to continue with a more advanced book from the same authors.
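As a rough illustration of what reproducing one of the book's R labs in Python can look like, here is a minimal sketch of simple linear regression by least squares using only the standard library. The data points and the function name `fit_simple_linear_regression` are made up for this example and do not come from the book.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - RSS / TSS
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - rss / tss


# Hypothetical toy data (not from the book): y is roughly 2*x + 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 9.0, 10.8]

intercept, slope, r_squared = fit_simple_linear_regression(x, y)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.4f}")
```

Writing the formulas out by hand once like this makes it easier to check what a library's regression summary is reporting later on.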
Programming:
 I use the Dataquest Data Science path and usually do one or two missions per day. The program is well-structured and gives you what you will need on the job, but it has a small number of exercises. So when you learn something, it's a good idea to get some data and practice on it.
 Udemy: Machine Learning A-Z
 I use their videos after I finish a chapter in ITSL to see how to code regressions etc. Their explanation of the statistics behind the models is limited and vague, though. Still, a good tutorial for coding.
 Book: Think Python
 Good intro book on Python. I know the majority of the concepts from this book, but the exercises are sweet and here and there I encounter a new topic.
 Leetcode/Hackerrank
 Mainly for SQL practice. I spend around 40 minutes to 1 hour per day (usually 5 days per week). I can solve 70-80% of easy questions on my own. I plan to move to mediums when I'm done with the Dataquest specialization.
 Projects:
 Nothing massive yet. Mainly trying to collect, clean, and organize data. Lots of you suggested getting really good at it, since that's usually what entry-level analysts do, so here I am. After a couple of days, I return to my previous code to see where I can make it more readable and where I can replace repeated lines with a function to make the code less redundant and more reusable. And of course, I ask for feedback. It amazes me how complete strangers take the time to give you comprehensive and thorough feedback!
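To make the refactoring idea above concrete, here is a small sketch of pulling repeated cleaning steps into one reusable function. The records, field names, and the helper `clean_text` are hypothetical and chosen only for illustration.

```python
def clean_text(value):
    """Strip surrounding whitespace, collapse inner runs of spaces, and lowercase."""
    return " ".join(value.split()).lower()


# Hypothetical raw records with inconsistent spacing and capitalization.
raw_rows = [
    {"city": "  New   York ", "job": "Data ANALYST"},
    {"city": "boston", "job": "  data   Scientist "},
]

# Before refactoring, the same strip/collapse/lowercase steps would be repeated
# for every column. With a helper, each column is one call.
clean_rows = [
    {column: clean_text(value) for column, value in row.items()}
    for row in raw_rows
]

for row in clean_rows:
    print(row)
```

If the cleaning rules change later, only `clean_text` needs editing, which is the main payoff of removing the duplication.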
I spend a minimum of 4-5 hours every day on the listed activities. I record the time when I'm actually studying because it helps me reduce the noise (scrolling on Reddit, FB, LinkedIn, etc.). I work in 25-minute cycles (25 minutes of uninterrupted study, then a 5-minute break). At the end of the day, I write a summary of what I learned that day and the plan for the next day. These practices help a lot to stay organized and really stick to the plan. On lazy days, I just remind myself how bad I will feel if I skip the day and break the streak, and how much gratification I will get if I complete the challenge. That keeps me motivated. Plus, the material is really captivating for me, and that's another stimulus.
What would be a good way to improve my coding, stats, or math? What books, courses, or practice would you recommend for continuing my journey?
Any questions, suggestions, and feedback are welcome and encouraged! :D
Cool, a certification will definitely help you. You can try online resources in the meantime. Udemy, Udacity, Coursera, LinkedIn Learning, Linux Academy, and ai.google are pretty good resources to learn from.
I am a cloud and big data engineer and have used Udemy a lot. It is fairly cheap and has a "sale" almost every few days. An ML engineer at my company recommended the Machine Learning A-Z™: Hands-On Python & R In Data Science course to me. I never got around to finishing it, and it's not exactly a certification, but it's pretty dope for learning.
I really liked Machine Learning A-Z: Hands-On Python & R In Data Science.
There are a lot of courses out there, but I really like this one because you get practical examples you can use immediately in the real world.
You won’t be a machine learning expert after completing it, but you will understand the fundamentals and you will be able to create models.
After you finish this, you could start creating models at your work, or you could take more in-depth courses. Andrew Ng's machine learning course on Coursera is often recommended and is much more in-depth.
I can tell you about Ng's ML course. I completed it last year. It is very mathematical. All the important ML algorithms are explained in great detail along with their mathematical intuition. Linear regression, logistic regression, neural networks, support vector machines, dimensionality reduction, anomaly detection, and recommender systems are the major topics covered. Along with this, Ng shares his knowledge of more nuanced topics like regularization, gradient descent, and pipelining, plus some general advice along the way.
The only drawback of the course is that you won't be applying these algorithms to real-world datasets. All you will be doing is coding out the algorithms in Octave or MATLAB, which I think is pretty much outdated. Python and R are widely used for machine learning now. When you start participating in ML competitions on Kaggle etc., you will have no idea what to do. There you don't need to write the algorithms yourself, you simply import the module. You will end up taking another course. The course also has not been updated much since its release, so it is kind of a classic course.
If knowing the math behind the above-mentioned algorithms is your aim, then go for it (probability models and tree models are not included in the course).
If you are more inclined towards learning the practical implementation, I suggest you enroll in Kirill Eremenko's Udemy course Machine Learning A-Z™: Hands-On Python & R In Data Science. He is a great tutor and his teaching is very easy to follow. Also, if you want to understand how algorithms actually work, you MUST subscribe to StatQuest on YouTube. He is the best out there. I hope this helps.
If you're looking for online courses, you can try the Udemy ones. I went through these when I was a beginner: https://www.udemy.com/course/machinelearning/ and https://www.udemy.com/course/deeplearning. Then I did the Google TensorFlow course on Coursera and the linked Kaggle NYC taxi fare prediction challenge. The Udemy one covers a wide range of topics from statistics, machine learning, and AI, but it's basic: desktop modelling, not production scale. The TensorFlow course is narrower, covering just linear and nonlinear models and fully connected neural networks, but it prepares you for production-scale coding.
Thanks! There are lots of good courses/tutorials nowadays to get you started, and I'm gonna share a few courses/books here that I found useful in my ML journey:
 http://faculty.marshall.usc.edu/garethjames/ISL/ Excellent book to get you started. It contains a moderate amount of math, but I still found it easy to grasp. The book also provides nice R code snippets to test models on different datasets.
 https://www.udemy.com/course/machinelearning/ This is a great (but lengthy) course to get you started in machine learning. It basically skips most of the math and goes straight into hands-on learning, with Python and R code provided for the course. In my opinion this is a good starting point.
Those two were game changers for me and helped me get into machine learning. Remember that learning ML is not a sprint, it is a marathon.
