Free data science resources
Overview
The goal of this page is to gather resources and learning materials across a broad range of popular data science topics and arrange them thematically. Resources have been selected because they are
- High quality
- Free of charge
- Available without signing up
Remember that material offered freely on the web is paid for with its authors’ time. If you find a resource particularly useful, consider supporting its authors in whatever way they prefer. If you find this page useful, please share it and spread the word! If you find a mistake or broken link, please file an issue or submit a pull request.
Key to resource types
- 🎓 = Course
- 📝 = Tutorial or blog post
- 📚 = Book or book chapter
- ▶️ = Video or webinar
- 🎧 = Podcast or audio recording
- 👥 = Community or user forum
- 📜 = Journal or technical article
- 💡 = Cheat sheet
- ✅ = List
Software & Programming
Getting started with R
- 📚 Modern Dive: Getting Started by Chester Ismay and Albert Y.
Kim.
- The very first of first steps: installing R and RStudio, and what to do after that.
- 📝 RYouWithMe: Basic Basics by Lisa Williams, RLadies
Sydney.
- Tour of RStudio, installing and using packages and getting data into RStudio.
- 🎓 Teacups, Statistics and Giraffes by Hasse Walum and Desirée de
Leon.
- Accessible introduction to R and statistics with interactive coding exercises.
- ▶️ A Gentle Introduction to Tidy Statistics in R by Thomas Mock,
RStudio.
- Webinar covering exploratory data analysis, tidyverse, statistical testing and plotting.
- 🎓 The R Bootcamp by Ted Laderas and Jessica
Minnier.
- A tidyverse-centric interactive course for data manipulation, graphics, data reshaping, and statistical modelling.
- 🎓 RStudio Primers by
RStudio.
- Interactive tutorials from RStudio covering data manipulation, visualisation and programming with R.
- 🎓 Swirl: Learn R, in R by Ismael Fernández, Nick Carchedi and
Sean Kross.
- Learn R with interactive courses in the console.
- 🎓 Using R for Data Journalism by Andrew Ba
Tran.
- Video-supported introductory course with an emphasis on wrangling and visualisation.
- 📚 R for Data Science by Garrett Grolemund and Hadley
Wickham.
- Comprehensive guide to using R programming for data science workflows.
- 📚 Introduction to Data Science: Data Analysis and Prediction
Algorithms with R by Rafael A.
Irizarry.
- Introduction to data science topics in R: visualisation, wrangling, prediction and workflow.
- 💡 Base R Cheat Sheet by Mhairi
McNeill.
- Quick overview of basic R functionality.
Advancing with R
- 📚 Tidynomicon - A Brief Introduction to R for People Who Count
From Zero by Greg Wilson.
- An introduction to R for Python users.
- 📚 Hands-on Programming with R by Garrett
Grolemund.
- A friendly introduction to the R language for non-programmers.
- 📚 R Cookbook: Proven Recipes for Data Analysis, Statistics, and
Graphics by James (JD) Long, Paul Teetor.
- Recipes and worked examples for performing core tasks in R.
- 📝 R package primer: a minimal tutorial by Karl
Broman.
- Overview of R package development.
- 📚 R Packages by Hadley Wickham and Jennifer
Bryan.
- Comprehensive guide to how R packages work and how to write your own.
- 📚 Efficient R programming by Colin Gillespie and Robin
Lovelace.
- Comprehensive introduction to writing faster and more efficient R code.
- 📚 Advanced R by Hadley Wickham.
- Get deeper into R programming fundamentals, object-oriented and functional programming concepts and a lot more. A must-read for experienced R users!
- ▶️ RStudio Webinars by
RStudio.
- Recordings of past RStudio webinars covering a variety of R and data science content.
- 📚 An Introduction to R by W. N. Venables, D. M. Smith and the R
Core Team.
- Introduction to R written by the R Core Team.
- 📚 / 🎓 Data science for economists by Grant
McDermott.
- Slides and code examples covering a wide-ranging introduction to data science in R.
- 📚 / 🎓 Big Data in Economics by Grant
McDermott.
- Notes cover the use of R with shell, GitHub, web scraping, docker and cloud compute.
- 📚 Handling Strings with R by Gaston Sanchez and Chitra
Venkatesh.
- Detailed introduction to strings, manipulation, regex and text wrangling.
- ▶️ R Package Development by John
Muschelli.
- 6-part video series on the basics of R package development, testing and building a pkgdown site.
Getting started with Python
- 📝 Install Python and Anaconda by
Anaconda.
- A widely used package and environment manager for Python, and how to install it.
- 🎓 pandas Tutorial by
W3Schools.
- Beginner-friendly, in-browser introduction to pandas with interactive “try it yourself” examples.
- 📝 Quick reference to Python in a single script and notebook by
Kevin Markham.
- Comprehensive reference guides for Python programming via notebooks and script examples.
- 🎓 / ▶️ An Introduction to Python and Programming by Alexander
Hess.
- Python course for aspiring data scientists via notebooks, videos and exercises.
- 📚 A Whirlwind Tour of Python by Jake
VanderPlas.
- A fast-paced introduction to essential features of the Python language for those already familiar with another language.
- 🎓 Learn Python by Ron Reiter.
- Interactive online courses and tutorials for a wide range of Python topics.
- 💡 Pandas Cheat Sheet by the Pandas development
team.
- 2-page quick reference to the most commonly used pandas functions.
- 📝 Getting Started in pandas by the Pandas development team.
- Tutorials and quick start guides from the pandas development team.
Advancing with Python
- 📚 Python Data Science Handbook by Jake
VanderPlas.
- Online book with comprehensive coverage of IPython, numpy, pandas, matplotlib and machine learning with scikit-learn.
- 📚 Python for Everybody: Exploring Data Using Python 3 by
Charles R. Severance.
- Python ebook with a focus on programming fundamentals. Translations available in several languages.
- 📝 Python Packaging User Guide by the Python Packaging Authority
(PyPA).
- A collection of tutorials and references to help you distribute and install Python packages with modern tools.
Polars
- 📝 Polars User Guide by the Polars
team
- The official, example-driven guide to Polars’ expressions, contexts and lazy execution.
- 📚 Modern Polars by Kevin
Heavey
- A side-by-side comparison of idiomatic Polars and pandas, with commentary on API design and performance.
Shell
- 🎓 Learn Shell by Ron Reiter.
- A browser-based interactive Shell tutorial covering basics through to advanced topics.
- 🎓 The Unix Shell by Software
Carpentry.
- Tutorials and examples of how to use the Unix shell.
- 📝 Beginners/BashScripting by Ubuntu
Documentation.
- Introduction to using the shell for OS navigation and scripting.
- ▶️ How to Write a Shell Script using Bash Shell in Ubuntu by FS
Tutorial
- Short video showing how to write a first shell script using vim.
- 🎓 / ▶️ The Missing Semester of Your CS Education by Anish
Athalye, Jon Gjengset and Jose Javier Gonzalez
Ortiz
- Videos and notes on using shell and version control.
- 📝 The Art of the Command Line by Joshua
Levy
- Useful list of bash commands and explanations, all laid out on a single page!
- 📝 ExplainShell.com by Idan Kamara
- Handy utility - type in a shell command and get an explanation of what it does.
Regular expressions
- 🎓 RegexOne: Learn Regular Expressions with simple, interactive
exercises. by RegexOne
- Simple, browser-based course with interactive exercises.
- 📝 Regular Expressions 101: Online Regular Expression Tester and
Debugger by Firas Dib
- Very handy tool to test regular expressions against test strings.
- 💡 Data Science Cheat Sheet: Python Regular Expressions by
Dataquest
- PDF cheat sheet for regular expression syntax in Python.
- 💡 Regular Expressions Cheat Sheet by Dave
Child
- PDF cheat-sheet for standard regular expression syntax.
Web scraping
- 📚 Automate the Boring Stuff with Python (3e): Web Scraping by Al
Sweigart
- A beginner-friendly, project-driven walkthrough of downloading and parsing web pages with requests, Beautiful Soup, Selenium and Playwright.
- 📚 R for Data Science (2e): Web scraping by Hadley Wickham, Mine
Çetinkaya-Rundel and Garrett
Grolemund
- A clear introduction to extracting data from HTML pages in R with the rvest package, including selectors and scraping ethics.
Git
- 📚 Happy Git and GitHub for the useR by Jenny Bryan, the STAT 545
TAs and Jim Hester
- If you are an R user and new to git, this is currently the best place to start.
- 📝 An introduction to Git and how to use it with RStudio by
François Michonneau
- Conceptual overview of what git is and how to use it, with particular emphasis on GitHub and its use with RStudio.
- 💡 Git Cheat Sheet by
GitHub
- A list of the main git shell commands.
- 📚 Pro Git by Scott Chacon and Ben
Straub
- Free ebook covering more advanced usage of git - good once you’re confident with the basics.
- 📝 Oh Shit Git! by Katie Sylor-Miller
- Light-hearted troubleshooting guide for when things inevitably go wrong!
- 📝 Step-by-step guide to contributing on GitHub by Kevin
Markham
- Detailed guide on how to contribute to open source software projects using git and GitHub.
Spark
- 💡 PySpark Cheat Sheet by Kevin
Schaich
- A handy single-page reference of the most commonly used PySpark DataFrame operations.
- 🎓 Mastering Spark with R by Javier Luraschi, Kevin Kuo and Edgar
Ruiz
- Comprehensive guide to analysing large-scale data from R using the sparklyr package.
- ▶️ R & Spark: How to Analyze Data Using RStudio’s Sparklyr by
Nathan Stephens
- Webinar demonstrating how to connect to Spark and analyse data from R with sparklyr.
- 📚 A Gentle Introduction to Apache Spark by
Databricks
- Short ebook introducing Apache Spark’s core concepts, architecture and APIs.
SQL
- 📚 / 🎓 The SQL Tutorial for Data Analysis by mode.com.
- Tutorials and interactive exercises teaching the fundamentals of SQL.
- 🎓 SQLBolt: Learn SQL with simple, interactive
exercises.
- Learn SQL from the ground up through a series of short, interactive browser-based lessons.
- 📚 / 🎓 SQLZoo: SQL Tutorial.
- Wikibook with interactive exercises.
- 📚 / 🎓 Select Star SQL by Zi Chong
Kao.
- Free, ad-free interactive book that teaches SQL through real-world datasets in the browser.
- 🎓 Intro to SQL: Querying and managing data by Khan
Academy
- Video-based course on querying and managing data with SQL.
- 🎓 LearnSQLOnline by Ron Reiter
- Browser-based interactive tutorial covering SQL fundamentals.
DuckDB
- 📝 DuckDB Documentation by the DuckDB
team
- The official documentation hub covering installation, SQL and client APIs across many languages.
- 📝 SQL Introduction by the DuckDB
team
- A hands-on tour of core SQL operations – tables, queries, joins and aggregates – using DuckDB.
Docker
- 📝 An Introduction to Docker for R Users by Colin
Fay
- A short guide to using Docker to make R analyses portable and reproducible.
- 🎓 R Docker tutorial by Jemma
Stachelek
- Hands-on tutorial on running R and RStudio inside Docker containers.
- ▶️ Docker and Python: making them play nicely and securely for Data
Science and ML by Tania Allard at PyCon
2020
- Talk on building Docker images for Python data science work safely and efficiently.
Markdown, LaTeX and publishing
- 📚 R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire,
Garrett Grolemund
- The comprehensive reference for authoring reproducible documents and reports with R Markdown.
- 📚 bookdown: Authoring Books and Technical Documents with R
Markdown by Yihui Xie
- Guide to writing books and long-form technical documents with R Markdown.
- 📝 Get Started with Quarto by
Posit
- The official starting point for authoring and publishing reproducible documents, reports and slides with Quarto.
- 📚 R for Data Science (2e): Quarto by Hadley Wickham, Mine
Çetinkaya-Rundel and Garrett
Grolemund
- A concise, practical chapter on weaving code, results and prose into a single reproducible document.
- 📚 Quarto for Scientists by Nicholas
Tierney
- A gentle, scientist-focused guide to producing reproducible reports, papers and slides with Quarto.
- 📚 The Not So Short Introduction to LaTeX 2ε by Tobias
Oetiker
- A concise and practical introduction to typesetting documents with LaTeX.
- 📚 LaTeX for Beginners by UoE IS
Services
- Step-by-step primer for getting started with LaTeX.
Data engineering and pipelines
- 🎓 Data Engineering Zoomcamp by
DataTalksClub
- A free, end-to-end course building production-ready pipelines with dbt, orchestration, BigQuery, Spark and Kafka.
- 📚 dbt Documentation by dbt Labs
- The official documentation for building, testing and deploying analytics transformations with dbt.
- 📝 Apache Airflow Tutorials by the Apache Software
Foundation
- The official hands-on guides for authoring your first workflows and DAGs in Apache Airflow.
- 📚 The {targets} R Package User Manual by Will
Landau
- A thorough guide to building reproducible, dependency-aware analysis pipelines in R.
- 📝 minimal make: A minimal tutorial on make by Karl
Broman
- A short, approachable tutorial on using GNU Make to automate reproducible research workflows.
Machine Learning
Theory
- 📚 The Elements of Statistical Learning: Data Mining, Inference,
and Prediction by Trevor Hastie, Robert Tibshirani and Jerome
Friedman
(2017)
- The classic, mathematically rigorous reference on statistical and machine learning.
- 📚 Computer Age Statistical Inference: Algorithms, Evidence and
Data Science by Bradley Efron and Trevor Hastie
(2017).
- A statistical approach to data science and machine learning.
- 📚 Mathematics for Machine Learning by Marc Peter Deisenroth, A.
Aldo Faisal, Cheng Soon
Ong
- Covers the theory underpinning many ML algorithms; a useful reference for practitioners.
- 📝 distill.pub by multiple contributors, edited by Shan Carter
and Chris Olah
- Online scientific journal publishing very high-quality, interactive articles on ML. On hiatus as of 2021.
- 📚 Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman,
Jeff Ullman
- Book based on Stanford Computer Science course CS246: Mining Massive Datasets.
- 📚 Introduction to Statistical Learning by Gareth James, Daniela
Witten, Trevor Hastie and Robert
Tibshirani
- ISLR is still one of the most important books for getting started in practical ML.
Interpretability
- 📚 Interpretable Machine Learning: A Guide for Making Black Box
Models Explainable by Christoph Molnar
(2022)
- A highly practical introduction to interpretable machine learning (IML); required reading if you are new to the topic.
- ✅ Awesome: Machine Learning Interpretability by Patrick
Hall
- A big list of machine learning interpretability resources with >2.5k GitHub stars.
Guides, tutorials and courses
- 🎓 Machine Learning Crash Course with TensorFlow APIs by
Google
- A fast-paced, practical introduction to machine learning, with video lectures, real-world case studies and hands-on practice exercises.
- 📝 Tidymodels Tutorials by
RStudio
- A variety of beginners’ guides to solving common ML tasks with R’s tidymodels.
- 🎓 Supervised Machine Learning Case Studies in R by Julia
Silge.
- Easy-to-follow in-browser beginner’s guide to using R’s tidymodels for practical ML.
- 📝 / ▶️ Introduction to machine learning with scikit-learn by
Kevin Markham
- Bite-size study videos and Python notebooks from Kevin Markham’s Data School.
- 📝 scikit-learn User Guide by
scikit-learn
- The scikit-learn documentation is very thorough and a great standalone learning resource!
- 🎓 Introduction to Machine Learning for Coders by Jeremy
Howard.
- 24 hours of videos and supporting notes from a Kaggle superstar.
Deep learning
- 📚 Dive into Deep Learning by Aston Zhang, Zachary Lipton, Mu Li
and Alexander Smola
- Interactive deep learning book with runnable code across PyTorch, TensorFlow and JAX.
- 📚 Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron
Courville
- The foundational graduate-level textbook on the theory behind modern deep learning.
- 🎓 Practical Deep Learning for Coders by Jeremy Howard,
fast.ai
- Top-down course teaching state-of-the-art deep learning for vision, text and tabular data with PyTorch.
- ▶️ Neural Networks: Zero to Hero by Andrej
Karpathy
- Video series building neural networks, and eventually a GPT, from scratch starting at backpropagation.
- 🎓 MIT 6.S191: Introduction to Deep Learning by Alexander Amini
and Ava Soleimany
- MIT’s introductory deep learning course with open lectures, slides and software labs.
Computer vision
- 🎓 CS231n: Deep Learning for Computer Vision by Stanford
University
- Stanford’s computer vision course, with freely available lecture notes, Python primers and assignments.
- 📝 OpenCV Tutorials by the OpenCV
team
- Official tutorials spanning image processing, object detection, feature matching and the deep neural network module.
Reinforcement learning
- 📚 Reinforcement Learning: An Introduction (2nd ed.) by
Richard S. Sutton and Andrew G.
Barto
- The standard introductory textbook on reinforcement learning, offered as a complete free PDF by its authors.
- ▶️ Introduction to Reinforcement Learning by David Silver,
DeepMind
- A ten-lecture course covering reinforcement learning fundamentals through to function approximation and policy gradients.
- 📝 Spinning Up in Deep RL by
OpenAI
- An educational resource on deep RL with explanatory theory, a curated paper list and documented reference implementations.
Large language models and generative AI
- 🎓 Hugging Face LLM Course by Hugging
Face
- Hands-on course on large language models and NLP using the Hugging Face libraries.
- 📝 The Illustrated Transformer by Jay
Alammar
- A visual, intuitive walkthrough of the Transformer architecture and self-attention.
- 📚 Understanding Deep Learning by Simon J. D.
Prince
- Modern textbook covering transformers, diffusion and generative models, with a free PDF and notebooks.
- 🎓 Stanford CS224N: NLP with Deep Learning by Christopher
Manning
- Stanford’s NLP with deep learning course, with freely available slides, assignments and lecture videos.
- 📝 Prompt Engineering Guide by
DAIR.AI
- Reference guide to prompting techniques, applications and risks for large language models.
- 📝 Building Effective Agents by
Anthropic
- A practical breakdown of agent and workflow patterns, with guidance on when added complexity is worth it.
- 📝 Claude Cookbooks by
Anthropic
- Runnable notebooks covering retrieval, tool use, agents, evaluations and multimodal work with Claude.
- 🎓 Anthropic Courses by
Anthropic
- Sequential courses on API fundamentals, prompt engineering, prompt evaluations and tool use.
- 📝 Patterns for Building LLM-based Systems & Products by Eugene
Yan
- A survey of practical patterns for production LLM systems, from evaluation and retrieval to guardrails and feedback.
- 📝 Your AI Product Needs Evals by Hamel
Husain
- A grounded framework for evaluating LLM products, built around unit tests, human and model review, and A/B testing.
Data Science Practice
Software development
- 📝 Software development skills for data scientists by Trey
Causey
- Practical overview of the software engineering skills data scientists should develop.
- 📜 Hidden Technical Debt in Machine Learning
Systems
- Influential paper on the hidden, long-term maintenance costs of machine learning systems.
- 📝 How rOpenSci uses Code Review to Promote Reproducible Science
by Noam Ross, Scott Chamberlain, Karthik Ram and Maëlle
Salmon
- How rOpenSci runs open peer review of scientific R packages.
- 📜 Best Practices for Computational Science: Software
Infrastructure and Environments for Reproducible and Extensible
Research by Victoria Stodden and Sheila
Miguez
- Recommendations for building reproducible and extensible computational research software.
- 📝 Journalism as a Professional Model for Data Science by
Brian C.
Keegan
- Argues data science teams can learn from how newsrooms are organised and managed.
- 📝 Cookiecutter Data Science by
drivendata
- A standardised, logical project structure for doing and sharing data science work.
Hiring and building teams
- 📚 The Care and Feeding of Data Scientists: How to Build, Manage
and Retain a Data Science Team by Michelangelo D’Agostino and Katie
Malone
- Practical report on building, managing and retaining a data science team.
- 🎧 The Care and Feeding of Data Scientists: Becoming a Data Science
Manager on Linear Digressions podcast by Katie Malone and Ben
Jaffe
- Podcast episode on the transition from data scientist to data science manager.
- 📝 Models for integrating data science teams within companies by
Pardis
Noorzad
- Compares centralised, embedded and hybrid models for placing data science teams in an organisation.
- 🎧 Building Effective Data Science Teams with Kobi Abayomi,
Gregory Berg, Elaine McVey, Jacqueline Nolis, Nasir Uddin and Julia
Silge
- Panel webinar on what makes data science teams effective.
- 📝 Building a data team at a mid-stage startup: a short story by
Erik
Bernhardsson
- A short story illustrating how to grow a data team at a startup.
- 📝 Hiring a data scientist by Mikhail Popov, Wikimedia
- Wikimedia’s practical account of how they hire data scientists.
Agile data science
- 📚 Agile Data Science with R: A workflow by Edwin
Thoen
- Proposes a lightweight, agile workflow for data science projects in R.
- 📝 Data Science and Agile (What works, and what doesn’t) by
Eugene
Yan
- Honest reflection on which agile practices help, and which hinder, data science work.
- 📝 Data Science Best Practices: Run your data science team like an
engineering team by Leonard
Austin
- Argues for running a data science team with the discipline of an engineering team.
- 📝 Organizing machine learning projects: project management
guidelines by Jeremy
Jordan
- Project-management guidelines for structuring machine learning projects.
Ethics and fairness
- 📚 Ethics of Artificial Intelligence and Robotics by Stanford
Encyclopedia of
Philosophy
- A thorough philosophical survey of the ethical issues raised by AI and robotics.
- 📝 The Responsible Machine Learning Principles: A practical
framework to develop AI responsibly by The Institute for Ethical AI
& Machine Learning
- A practical framework of principles for developing AI responsibly.
- 📝 A Code of Ethics for Data Science by DJ
Patil
- A proposed code of ethics for practising data scientists.
- 📝 The Ethical Data Scientist by Cathy O’Neil
- Essay arguing data scientists should take responsibility for the impact of their models.
- 📝 An ethics checklist for data scientists by
drivendata
- A command-line tool that adds an ethics checklist to your data science projects.
- 📚 Fairness and machine learning: Limitations and Opportunities
by Solon Barocas, Moritz Hardt, Arvind
Narayanan
- A textbook on the limitations and opportunities of fairness in machine learning.
- 📝 Practical Data Ethics by fast.ai
- A practical course on data ethics covering bias, disinformation and accountability.
MLOps
- 📝 MLOps: Continuous delivery and automation pipelines in machine
learning by Google
Cloud
- Google Cloud’s reference on continuous delivery and automation pipelines for machine learning.
- 📝 Using GitHub Actions for MLOps & Data Science by Hamel Husain,
The GitHub
Blog
- How to use GitHub Actions to automate data science and machine learning workflows.
- 📝 Continuous Delivery for Machine Learning: Automating the
end-to-end lifecycle of Machine Learning applications by Danilo
Sato, Arif Wider and Christoph
Windheuser
- A detailed pattern for automating the end-to-end machine learning lifecycle.
- 📝 Monitoring Machine Learning Models in Production: A
Comprehensive Guide by Christopher
Samiullah
- Comprehensive guide to monitoring machine learning models once they are deployed.
- 📝 What are Azure Machine Learning pipelines? by
Microsoft
- Microsoft’s overview of building machine learning pipelines in Azure.
- 📝 Getting started with Kubeflow Pipelines by Amy Unruh, Google
Cloud
- Getting started with building and running machine learning pipelines on Kubeflow.
- 📝 Continuous Machine Learning (CML) is CI/CD for Machine Learning
Projects by DVC.org
- Open-source tool bringing CI/CD practices to machine learning projects.
- 📝 Data Science Workflows by David
Neuzerling
- A practical look at structuring reproducible data science workflows.
- 🎓 Made With ML by Goku Mohandas
- Course combining machine learning and software engineering to design, deploy and iterate on production ML.
ML Platforms
- 📝 The problem with AI developer tools for enterprises (and what
IKEA has to do with it) by Clemens
Mewald
- Argues enterprise AI tooling should be modular and composable rather than monolithic.
Style Guides
- 📝 Udacity Git Commit Message Style Guide by
Udacity
- A concise convention for writing clear, consistent git commit messages.
- 📚 The tidyverse style guide by Hadley
Wickham
- The style guide for writing readable, consistent tidyverse-flavoured R code.
- 📝 The Google R Style Guide by
Google
- Google’s conventions for formatting and structuring R code.
- 📝 The Google Python Style Guide by
Google
- Google’s conventions for writing clean, consistent Python code.
- 📝 PEP 8 – Style Guide for Python Code by Guido van Rossum, Barry
Warsaw, Nick Coghlan
- The official style guide for Python code.
Developing interactive applications
- ▶️ / 🎓 Learn Shiny by
RStudio
- Official tutorial for building interactive web apps in R with Shiny.
- 📚 A gRadual intRoduction to Shiny by Ted Laderas and Jessica
Minnier
- A gentle, step-by-step introduction to building Shiny apps.
- 📚 Interactive web-based data visualization with R, plotly, and
shiny by Carson Sievert
- Book on building interactive, web-based data visualisations with R, plotly and Shiny.
- 📚 Dashboards by Yihui Xie, J. J. Allaire, Garrett Grolemund
- Chapter 5 from ‘R Markdown: The Definitive Guide’.
- 📝 Leaflet for R by RStudio
- Guide to making interactive maps in R with the Leaflet library.
- 📝 Dash User Guide by Plotly
- Documentation for building analytical web applications in Python with Dash.
- 📝 Getting Started with Streamlit by
streamlit
- Quick start guide for turning Python scripts into shareable data apps with Streamlit.
Visualisation
- 📚 Fundamentals of Data Visualization by Claus O.
Wilke
- A guide to making clear, compelling and truthful figures, with examples in ggplot2.
- 📚 ggplot2: Elegant Graphics for Data Analysis by Hadley
Wickham
- The definitive guide to producing graphics in R with ggplot2.
- 📚 Data Visualization: A Practical Introduction by Kieran
Healy
- A practical, hands-on introduction to data visualisation principles using R and ggplot2.
- 📝 3D Mapping and Visualization with R and Rayshader by Tyler
Morgan-Wall
- Masterclass materials on 3D mapping and visualisation in R with rayshader.
- 📚 Scientific Visualization: Python + Matplotlib by Nicolas P.
Rougier
- An open-access book on producing high-quality scientific figures with matplotlib.
- 📝 seaborn: statistical data visualization by Michael
Waskom
- The official tutorial for seaborn, covering statistical plotting built on matplotlib.
- 📝 The Python Graph Gallery by Yan
Holtz
- Hundreds of Python chart examples with reproducible code across matplotlib, seaborn, pandas and plotly.
- 📝 Plotly Open Source Graphing Library for Python by
Plotly
- Official documentation for Plotly’s free, open-source library for interactive charts.
Time series analysis
- 📚 Forecasting: Principles and Practice by Rob J Hyndman and
George Athanasopoulos
- The standard free textbook on time series forecasting, with worked examples in R.
- 📝 11 Classical Time Series Forecasting Methods in Python (Cheat
Sheet) by Jason
Brownlee
- A cheat-sheet of classical time series forecasting methods with Python code.
Generalised Additive Modelling (GAMs)
- 🎓 GAMs in R by Noam Ross
- Interactive course introducing Generalised Additive Models (GAMs).
- 📝 Resources for Learning About and Using GAMs in R by Noam
Ross
- A curated list of resources for learning about and using GAMs in R.
Mathematics
- ▶️ Essence of Linear Algebra by Grant Sanderson
(3Blue1Brown)
- A visual, intuition-first video series on vectors, transformations, determinants and eigenvectors.
- ▶️ Essence of Calculus by Grant Sanderson
(3Blue1Brown)
- A visual series building the central ideas of calculus from derivatives to integrals.
- 🎓 Linear Algebra (18.06) by Gilbert Strang, MIT
OpenCourseWare
- Strang’s classic full-semester course with complete video lectures, problem sets and exams.
- 📚 Introduction to Applied Linear Algebra: Vectors, Matrices, and
Least Squares by Stephen Boyd and Lieven
Vandenberghe
- A free textbook on the linear algebra most relevant to data science, with an emphasis on least squares.
- ▶️ Statistics 110: Probability by Joe Blitzstein,
Harvard
- A full set of Harvard lectures teaching probability from first principles.
Statistics
- 📚 Statistical Inference via Data Science: A Modern Dive into R and
the tidyverse by Chester Ismay and Albert Y.
Kim
- A modern, tidyverse-based introduction to statistical inference in R.
- 📚 Think Stats: Exploratory Data Analysis in Python by Allen B.
Downey
- An introduction to exploratory data analysis and statistics using Python.
- 📚 Learning statistics with R: A tutorial for psychology students
and other beginners by Danielle
Navarro
- A friendly, thorough introduction to statistics using R, aimed at beginners.
- 📚 Probabilistic Programming & Bayesian Methods for Hackers by Cameron Davidson-Pilon
- A computation-first introduction to Bayesian methods using Python and PyMC.
- 📚 From Algorithms to Z-Scores: Probabilistic and Statistical
Modeling in Computer Science by Norm
Matloff
- A probability and statistics text that builds concepts through computer science examples.
- 📚 Theory of Statistics by James E.
Gentle
- A rigorous graduate-level reference on the theory of mathematical statistics.
- 📚 Core Statistics by Simon
Wood
- A compact introduction to the core theory and practice of statistical inference.
- ▶️ / 🎓 Statistical Rethinking by Richard
McElreath
- Freely available video lectures, slides and homework teaching Bayesian modelling from first principles.
- 📚 Think Bayes 2 by Allen B.
Downey
- An introduction to Bayesian statistics through computation in Python.
- 📚 Bayes Rules! An Introduction to Applied Bayesian Modeling by
Alicia Johnson, Miles Ott and Mine
Dogucu
- An applied introduction to Bayesian modelling in R, covering MCMC and hierarchical models.
Causal inference
- 📚 Causal Inference: The Mixtape by Scott
Cunningham
- An applied introduction to causal inference and research designs, with code in R, Stata and Python.
- 📚 Causal Inference for The Brave and True by Matheus
Facure
- An open-source, Python-based introduction to causal inference, from fundamentals to ML methods.
- 📚 The Effect: An Introduction to Research Design and Causality
by Nick Huntington-Klein
- An accessible introduction to research design and causality, with code in R, Stata and Python.
Experimentation and A/B testing
- 📝 How Not To Run an A/B Test by Evan
Miller
- A concise, widely-cited explanation of why peeking at running experiments inflates false positives, and how to size and stop tests correctly.
- 🎓 A/B Testing at Scale (Tutorial) by the Experimentation
Platform
(exp-platform.com)
- Tutorial slides and video lectures on running trustworthy controlled experiments at scale.
- 📚 Trustworthy Online Controlled Experiments: Chapter 1 by Ron
Kohavi, Diane Tang and Ya
Xu
- The freely released opening chapter of the canonical book on A/B testing.
- 💡 Evan’s Awesome A/B Tools by Evan
Miller
- Free in-browser statistical calculators for planning and analysing A/B tests.
Spatial analysis
- 📚 Geocomputation with R by Robin Lovelace, Jakub Nowosad, Jannes
Muenchow
- A comprehensive, hands-on introduction to geographic data analysis and mapping in R.
- 📚 Spatial Data Science by Edzer Pebesma and Roger
Bivand
- A modern treatment of spatial and spatiotemporal data analysis in R.
- 📚 Geospatial Health Data: Modeling and Visualization with R-INLA
and Shiny by Paula
Moraga
- Modelling and visualising geospatial health data with R-INLA and Shiny.
Data Science community groups
Python groups
- 👥 PyData Meetup Groups
- Directory of local PyData meetup groups around the world.
- 👥 PyLadies by PyLadies
- An international mentorship community for women and friends in the Python community.
R groups
- 👥 Directory of R User Groups by Jumping
Rivers
- A maintained directory of R user groups worldwide.
- 👥 Complete list of R-Ladies groups by R-Ladies
Global.
- A directory of R-Ladies chapters around the world.
- 👥 R for Data Science Online Learning
Community
- The R4DS Online Learning Community is a community of R learners at all skill levels working together to improve their skills.
- 👥 Tidy Tuesday
- A weekly social data project for the R community, with a new dataset to explore and visualise each week.
- 👥 SatRdays
- R-focused conferences that are held on Saturdays.
Natural language processing
- 📚 Text Mining with R: A Tidy Approach by Julia Silge and David
Robinson
- A tidy-data approach to text mining and analysis in R.
- 🎓 Advanced NLP with SpaCy by Ines
Montani
- A free, interactive course on doing modern NLP in Python with spaCy.
- 📜 100 Must read papers in NLP by Masato
Hagiwara
- A curated reading list of foundational papers in natural language processing.
- 🎓 Stanford CS 124: From Languages to Information by Dan Jurafsky
- Stanford’s introductory course on language, information and the basics of NLP.
- 📚 Natural Language Processing with Python: Analyzing Text with the
Natural Language Toolkit by Steven Bird, Ewan Klein, and Edward
Loper.
- The official book for doing natural language processing in Python with NLTK.
- 🎓 A Code-First Intro to Natural Language Processing by
fast.ai
- The course is taught in Python with Jupyter Notebooks, using libraries such as sklearn, nltk, pytorch, and fastai.
- 📚 Speech and Language Processing by Dan Jurafsky and James H.
Martin
- The standard, comprehensive textbook covering both classical and neural NLP.
- ▶️ BERT Research Series by Chris
McCormick
- A video series unpacking how the BERT language model works.
Special Topics
- ▶️ Structural Equation Modelling by Erin M.
Buchanan
- A video playlist introducing structural equation modelling.
- 📝 PyTorch Tutorials and Recipes by
PyTorch
- Official tutorials and recipes for deep learning with PyTorch.