scikit-learn-videos/README.md

54 lines
5.0 KiB
Markdown
Raw Normal View History

2015-04-08 12:55:55 +08:00
# Introduction to machine learning with scikit-learn
2015-04-16 02:09:31 +08:00
This repo contains IPython notebooks from my scikit-learn video series, as seen on Kaggle's blog.
2015-04-08 12:55:55 +08:00
2015-04-16 02:09:31 +08:00
## Entire series
- [Watch the entire series](https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A) (YouTube playlist)
- [View the IPython Notebooks](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/tree/master/) (nbviewer)
- [Read the blog posts](http://blog.kaggle.com/author/kevin-markham/) (Kaggle's blog)
## Individual videos
1. What is machine learning, and how does it work? ([video](https://www.youtube.com/watch?v=elojMnjn4kk&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=1), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/01_machine_learning_intro.ipynb), [blog post](http://blog.kaggle.com/2015/04/08/new-video-series-introduction-to-machine-learning-with-scikit-learn/))
2015-04-08 12:55:55 +08:00
- What is machine learning?
- What are the two main categories of machine learning?
- What are some examples of machine learning?
- How does machine learning "work"?
2015-04-16 00:08:52 +08:00
2015-04-16 12:13:37 +08:00
2. Setting up Python for machine learning: scikit-learn and IPython Notebook ([video](https://www.youtube.com/watch?v=IsXXlYVBt1M&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=2), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/02_machine_learning_setup.ipynb), [blog post](http://blog.kaggle.com/2015/04/15/scikit-learn-video-2-setting-up-python-for-machine-learning/))
2015-04-16 00:08:52 +08:00
- What are the benefits and drawbacks of scikit-learn?
- How do I install scikit-learn?
- How do I use the IPython Notebook?
- What are some good resources for learning Python?
2015-04-22 04:57:31 +08:00
2015-04-23 11:35:40 +08:00
3. Getting started in scikit-learn with the famous iris dataset ([video](https://www.youtube.com/watch?v=hd1W4CyPX58&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=3), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/03_getting_started_with_iris.ipynb), [blog post](http://blog.kaggle.com/2015/04/22/scikit-learn-video-3-machine-learning-first-steps-with-the-iris-dataset/))
2015-04-22 04:57:31 +08:00
- What is the famous iris dataset, and how does it relate to machine learning?
- How do we load the iris dataset into scikit-learn?
- How do we describe a dataset using machine learning terminology?
- What are scikit-learn's four key requirements for working with data?
2015-05-01 12:09:01 +08:00
4. Training a machine learning model with scikit-learn ([video](https://www.youtube.com/watch?v=RlQuVL6-qe8&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=4), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/04_model_training.ipynb), [blog post](http://blog.kaggle.com/2015/04/30/scikit-learn-video-4-model-training-and-prediction-with-k-nearest-neighbors/))
- What is the K-nearest neighbors classification model?
- What are the four steps for model training and prediction in scikit-learn?
- How can I apply this pattern to other machine learning models?
2015-05-14 22:15:31 +08:00
2015-05-15 10:26:21 +08:00
5. Comparing machine learning models in scikit-learn ([video](https://www.youtube.com/watch?v=0pP4EwWJgIU&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=5), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/05_model_evaluation.ipynb), [blog post](http://blog.kaggle.com/2015/05/14/scikit-learn-video-5-choosing-a-machine-learning-model/))
2015-05-14 22:15:31 +08:00
- How do I choose which model to use for my supervised learning task?
- How do I choose the best tuning parameters for that model?
- How do I estimate the likely performance of my model on out-of-sample data?
2015-05-29 05:29:10 +08:00
6. Data science pipeline: pandas, seaborn, scikit-learn ([video](https://www.youtube.com/watch?v=3ZWuPVWq7p4&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=6), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/06_linear_regression.ipynb), [blog post](http://blog.kaggle.com/2015/05/28/scikit-learn-video-6-linear-regression-plus-pandas-seaborn/))
- How do I use the pandas library to read data into Python?
- How do I use the seaborn library to visualize data?
- What is linear regression, and how does it work?
- How do I train and interpret a linear regression model in scikit-learn?
- What are some evaluation metrics for regression problems?
- How do I choose which features to include in my model?
2015-06-27 00:32:24 +08:00
2015-07-01 00:19:12 +08:00
7. Cross-validation for parameter tuning, model selection, and feature selection ([video](https://www.youtube.com/watch?v=6dbrR-WymjI&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=7), [notebook](http://nbviewer.ipython.org/github/justmarkham/scikit-learn-videos/blob/master/07_cross_validation.ipynb), [blog post](http://blog.kaggle.com/2015/06/29/scikit-learn-video-7-optimizing-your-model-with-cross-validation/))
2015-06-27 00:32:24 +08:00
- What is the drawback of using the train/test split procedure for model evaluation?
- How does K-fold cross-validation overcome this limitation?
- How can cross-validation be used for selecting tuning parameters, choosing between models, and selecting features?
- What are some possible improvements to cross-validation?