awesome-pandas
A collection of resources for pandas (Python) and related subjects.
Contents: This is an unofficial collection of resources for learning pandas, an open-source Python library for data analysis. Here you will find videos, cheat-sheets, tutorials, and books and papers. The curated list is divided into three parts:
- pandas resources - A collection of videos, cheat-sheets, tutorials and books directly related to pandas.
- Data analysis with Python resources - Material related to adjacent Python libraries and software such as NumPy, SciPy, matplotlib, seaborn, statsmodels and Jupyter.
- Miscellaneous related resources - Resources related to general data analysis, algorithms, computer science, machine learning, statistics, etc.
Pull requests are very welcome.
(1) 🐼 pandas resources
(1.1) 📺 Videos
The videos below were collected in December 2017. They are all directly related to pandas, and the Level of each video is rated roughly as follows:
- 😃 : Beginner - requires little knowledge to jump into, elementary topics.
- 😅 : Intermediate - some prior knowledge needed, more technical.
- 😱 : Advanced - very technical, or discusses advanced topics.
Title | Speaker | Uploader | Time | Views | Year | Level |
---|---|---|---|---|---|---|
Introduction Into Pandas | Daniel Chen | Python Tutorial | 1:28 | 46000 | 2017 | 😃 |
Introduction To Data Analytics With Pandas [repo] | Quentin Caudron | Python Tutorial | 1:51 | 25000 | 2017 | 😃 |
Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | 2015 | 😃 |
Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | 2017 | 😅 |
Optimizing Pandas Code [repo] | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | 2017 | 😅 |
A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | 2015 | 😃 |
Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | 2016 | 😃 |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | 2017 | 😅 |
Predicting sports winners with pandas | Robert Layton | PyCon Australia | 0:38 | 13000 | 2015 | 😅 |
Pandas from the Inside [repo] [2016 talk] | Stephen Simmons | PyData | 1:17 | 3000 | 2017 | 😱 |
Know of a recent, good video? Send a pull request! 👍
(1.2) ❗ Cheat-sheets
- Data Wrangling with pandas
- The pandas DataFrame Object
- Python For Data Science - pandas Basics
- Python For Data Science - pandas
(1.3) 🎓 Tutorials
- https://github.com/jorisvandenbossche/pandas-tutorial
- https://github.com/guipsamora/pandas_exercises
- https://github.com/brandon-rhodes/pycon-pandas-tutorial
- https://github.com/jadianes/winerama-recommender-tutorial
- https://github.com/jonathanrocher/pandas_tutorial
- https://github.com/chendaniely/scipy-2017-tutorial-pandas
- https://github.com/tdpetrou/Learn-Pandas
- https://github.com/adeshpande3/Pandas-Tutorial
- https://github.com/GaelVaroquaux/sklearn_pandas_tutorial
- https://github.com/vi3k6i5/pandas_basics
- https://github.com/california-civic-data-coalition/first-python-notebook
(1.4) 📘 Books / papers
- [amazon] McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd edition. O’Reilly Media, 2017.
- [amazon] VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. 1st edition. O’Reilly Media, 2016.
(2) Data analysis with Python resources
(2.1) 📺 Videos
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | 🐍 🐍 |
Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | 🐍 🐍 |
The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | 🐍 |
JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | 🐍 |
Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | 🐍 🐍 |
Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | 🐍 |
Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | 🐍 🐍 |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | 🐍 🐍 |
Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | 🐍 🐍 |
JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | 🐍 |
Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | 🐍 🐍 |
Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gramfort | Enthought | 3:10 | 8000 | sklearn | 2017 | 🐍 🐍 |
Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | 🐍 |
Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | 🐍 🐍 |
Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | 🐍 🐍 |
Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | 🐍 |
Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | 🐍 |
Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | 🐍 🐍 |
Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | 🐍 🐍 |
Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | 🐍 🐍 |
Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | 🐍 🐍 |
Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | 🐍 🐍 |
Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | 🐍 🐍 |
Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | 🐍 |
Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | Novice |
Using Jupyter notebooks | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | Novice |
Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | Novice |
Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | Novice |
matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | Novice |
Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | Novice |
All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | Novice |
PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | Novice |
Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonnier | Enthought | 2:48 | 4000 | jupyter | 2015 | Novice |
Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | Novice |
Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | Novice |
(2.2) ❗ Cheat-sheets
(2.3) 🎓 Tutorials
(2.4) 📘 Books / papers
- [amazon] Slatkin, Brett. Effective Python: 59 Specific Ways to Write Better Python. 1st edition. Addison-Wesley Professional, 2015.
- [amazon] Ramalho, Luciano. Fluent Python. 1st edition. O’Reilly, 2015.
- [amazon] Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. 1st edition. O’Reilly Media, 2017.
(3) Miscellaneous related resources
(3.1) 📺 Videos
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | 🐍 |
So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | 🐍🐍🐍 |
Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | 🐍 |
Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | 🐍 🐍 |
Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | 🐍🐍 |
Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | pandas | 2017 | 🐍🐍 |
The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | 🐍🐍🐍 |
Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | 🐍 |
Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | 🐍 |
Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | 🐍 🐍 |
Python’s Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | 🐍 🐍 |
The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | 🐍 🐍 🐍 |
Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
“Good Enough” IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
(3.2) ❗ Cheat-sheets
(3.3) 🎓 Tutorials
(3.4) 📘 Books / papers
- [amazon] Dasgupta, Sanjoy, Christos H. Papadimitriou, and Umesh Virkumar Vazirani. Algorithms. Boston, Mass: McGraw Hill, 2008.
- [amazon] Trefethen, Lloyd N., and David Bau III. Numerical Linear Algebra. Society for Industrial and Applied Mathematics, 1997.
- [amazon] Golub, Gene H., and Charles F. Van Loan. Matrix Computations. 4th ed. Johns Hopkins Studies in the Mathematical Sciences. Baltimore: Johns Hopkins University Press, 2013.
All videos from the sections above are listed together below.
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | 🐍 |
Introduction Into Pandas | Daniel Chen | Python Tutorial | 1:28 | 46000 | pandas | 2017 | 🐍 |
So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | 🐍🐍🐍 |
NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | 🐍 🐍 |
Introduction To Data Analytics With Pandas | Quentin Caudron | Python Tutorial | 1:51 | 25000 | pandas | 2017 | 🐍 |
Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | 🐍 |
Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | 🐍 🐍 |
Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | pandas | 2015 | 🐍 🐍 |
Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | 🐍 🐍 |
The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | 🐍 |
Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | 🐍🐍 |
Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | pandas | 2017 | 🐍🐍 |
JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | 🐍 |
Optimizing Pandas Code for Speed and Efficiency | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | pandas | 2017 | 🐍 🐍 |
A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | pandas | 2015 | 🐍 |
Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | 🐍 🐍 |
Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | 🐍 |
The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | 🐍🐍🐍 |
Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | pandas | 2016 | 🐍 |
Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | 🐍 🐍 |
Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | 🐍 |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | 🐍 🐍 |
Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | 🐍 🐍 |
JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | 🐍 |
Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | 🐍 🐍 |
Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gramfort | Enthought | 3:10 | 8000 | sklearn | 2017 | 🐍 🐍 |
Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | 🐍 |
Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | 🐍 🐍 |
Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | 🐍 🐍 |
Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | 🐍 |
Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | 🐍 |
Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | 🐍 |
Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | 🐍 🐍 |
Python’s Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | 🐍 🐍 |
Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | 🐍 🐍 |
The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | 🐍 🐍 🐍 |
Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | 🐍 🐍 |
Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | 🐍 🐍 |
Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | 🐍 🐍 |
Pandas from the Inside | Stephen Simmons | PyData | 1:20 | 9000 | pandas | 2016 | 🐍 🐍 🐍 |
Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | 🐍 🐍 |
Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | 🐍 🐍 |
Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | 🐍 |
Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | Novice |
Using Jupyter notebooks | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | Novice |
Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | Novice |
Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
Predicting sports winners with pandas and scikit-learn | Robert Layton | PyCon Australia | 0:38 | 13000 | pandas | 2015 | Novice |
Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | Novice |
matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | Novice |
Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | Novice |
Pandas from the Inside | Stephen Simmons | PyData | 1:17 | 3000 | pandas | 2017 | Novice |
All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | Novice |
Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
“Good Enough” IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | Novice |
Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonnier | Enthought | 2:48 | 4000 | jupyter | 2015 | Novice |
Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | Novice |
Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | Novice |