How to learn Data Science
Our resident data scientist, Tom Davenport, talks through his favourite websites for learning data science.
I get asked a lot about the best way to learn data science. You can go on a course at a college or university, but there are many online learning resources which are very good at getting you started. What I list below isn’t designed to be exhaustive. With most of these sites, I’ve found the communities in the forums to be just as useful as the course content.
Overall, my favourite site for learning data science is www.kaggle.com. They provide interesting tutorials on machine learning and programming. They also host competitions (with big prizes) which can offer invaluable experience in applying machine learning to a practical example.
The best part, however, is learning from others’ scripts. Previous winners often share the code they used to explore and model the data, which is a really good resource for understanding how to use the tools and structure your code well.
If you have access to www.lynda.com, it can be a really great resource for learning a huge range of technologies and skills. They have series on R and Python, as well as other series that look at how these are applied to specific business problems.
One of my favourite sites for brushing up on my coding languages, or for approaching ones that I haven’t used before, is www.codecademy.com.
They offer interactive tools where you can learn a programming language through your browser, without having to download or install anything. The site tracks your progress by awarding you points as you complete different sections, which can be satisfying. They also produce certifications that you can share on your LinkedIn profile.
If you have some programming experience, I also really like www.hackerrank.com. This site provides coding challenges and tutorials for a wide range of languages and the code is also run in your browser.
The most inventive way to learn coding that I’ve found recently is www.codecombat.com, a role-playing game in which you write code snippets to progress. It’s used a lot in high schools, but is quite informative and entertaining.
For nice machine learning tutorials, I also recommend www.analyticsvidhya.com. They provide very easy-to-follow tutorials for a range of projects, along with interesting blog articles and theory.
Subscribing to R Bloggers and the weekly email from Cross Validated, the Stack Exchange site that covers statistics and machine learning, also offers an insight into new tools coming onto the scene and some innovative examples of how machine learning is being used.
Machine Learning Mastery
Jason Brownlee’s site www.machinelearningmastery.com is also great for theory and provides a lot of practical examples. His books are great too!
For a list of the tools and languages I use, see my post on Tools and Languages for Data Science.