add second notebook and related images

pull/7/head
Kevin Markham 2015-04-15 12:04:57 -04:00
parent 7a0c2f5845
commit 58315d9bd5
4 changed files with 258 additions and 0 deletions

@@ -0,0 +1,258 @@
{
"metadata": {
"name": "",
"signature": "sha256:762906f449424ec70e2324ab7310748fcbe613b98d4fdbfbe3c219b793aef8a3"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setting up Python for machine learning: scikit-learn and IPython Notebook\n",
"*From the video series: [Introduction to machine learning with scikit-learn](https://github.com/justmarkham/scikit-learn-videos)*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Agenda\n",
"\n",
"- What are the benefits and drawbacks of scikit-learn?\n",
"- How do I install scikit-learn?\n",
"- How do I use the IPython Notebook?\n",
"- What are some good resources for learning Python?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![scikit-learn algorithm map](images/02_sklearn_algorithms.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Benefits and drawbacks of scikit-learn\n",
"\n",
"### Benefits:\n",
"\n",
"- **Consistent interface** to machine learning models\n",
"- Provides many **tuning parameters** but with **sensible defaults**\n",
"- Exceptional **documentation**\n",
"- Rich set of functionality for **companion tasks**, such as data preprocessing, model selection, and model evaluation\n",
"- **Active community** for development and support\n",
"\n",
"### Potential drawbacks:\n",
"\n",
"- Harder (than R) to **get started with machine learning**\n",
"- Less emphasis (than R) on **model interpretability**\n",
"\n",
"### Further reading:\n",
"\n",
"- Ben Lorica: [Six reasons why I recommend scikit-learn](http://radar.oreilly.com/2013/12/six-reasons-why-i-recommend-scikit-learn.html)\n",
"- scikit-learn authors: [API design for machine learning software](http://arxiv.org/pdf/1309.0238v1.pdf)\n",
"- Data School: [Should you teach Python or R for data science?](http://www.dataschool.io/python-or-r-for-data-science/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![scikit-learn logo](images/02_sklearn_logo.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing scikit-learn\n",
"\n",
"**Option 1:** [Install scikit-learn library](http://scikit-learn.org/stable/install.html) and dependencies (NumPy and SciPy)\n",
"\n",
"**Option 2:** [Install Anaconda distribution](https://store.continuum.io/cshop/anaconda/) of Python, which includes:\n",
"\n",
"- Hundreds of useful packages (including scikit-learn)\n",
"- IPython and IPython Notebook\n",
"- conda package manager\n",
"- Spyder IDE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![IPython header](images/02_ipython_header.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using the IPython Notebook\n",
"\n",
"### Components:\n",
"\n",
"- **IPython interpreter:** enhanced version of the standard Python interpreter\n",
"- **Browser-based notebook interface:** weave together code, formatted text, and plots\n",
"\n",
"### Installation:\n",
"\n",
"- **Option 1:** [Install IPython and the notebook](http://ipython.org/install.html)\n",
"- **Option 2:** Included with the Anaconda distribution\n",
"\n",
"### Launching the Notebook:\n",
"\n",
"- Type **ipython notebook** at the command line to open the dashboard\n",
"- Don't close the command line window while the Notebook is running\n",
"\n",
"### Keyboard shortcuts:\n",
"\n",
"**Command mode** (gray border)\n",
"\n",
"- Create new cells above (**a**) or below (**b**) the current cell\n",
"- Navigate using the **up arrow** and **down arrow**\n",
"- Convert the cell type to Markdown (**m**) or code (**y**)\n",
"- See keyboard shortcuts using **h**\n",
"- Switch to Edit mode using **Enter**\n",
"\n",
"**Edit mode** (green border)\n",
"\n",
"- **Ctrl+Enter** to run a cell\n",
"- Switch to Command mode using **Esc**\n",
"\n",
"### IPython and Markdown resources:\n",
"\n",
"- [nbviewer](http://nbviewer.ipython.org/): view notebooks online as static documents\n",
"- [IPython documentation](http://ipython.org/ipython-doc/stable/index.html): focuses on the interpreter\n",
"- [IPython Notebook tutorials](http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Index.ipynb): in-depth introduction\n",
"- [GitHub's Mastering Markdown](https://guides.github.com/features/mastering-markdown/): short guide with lots of examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Resources for learning Python\n",
"\n",
"- [Codecademy's Python course](http://www.codecademy.com/en/tracks/python): browser-based, tons of exercises\n",
"- [DataQuest](https://dataquest.io/missions): browser-based, teaches Python in the context of data science\n",
"- [Google's Python class](https://developers.google.com/edu/python/): slightly more advanced, includes videos and downloadable exercises (with solutions)\n",
"- [Python for Informatics](http://www.pythonlearn.com/): beginner-oriented book, includes slides and videos"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comments or Questions?\n",
"\n",
"- Email: <kevin@dataschool.io>\n",
"- Website: http://dataschool.io\n",
"- Twitter: [@justmarkham](https://twitter.com/justmarkham)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from IPython.display import HTML\n",
"def css_styling():\n",
"    with open(\"styles/custom.css\", \"r\") as f:  # close the file once it has been read\n",
"        return HTML(f.read())  # wrap the stylesheet so the notebook applies it\n",
"css_styling()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<style>\n",
" @font-face {\n",
" font-family: \"Computer Modern\";\n",
" src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
" }\n",
" div.cell{\n",
" width: 90%;\n",
"/* margin-left:auto;*/\n",
"/* margin-right:auto;*/\n",
" }\n",
" ul {\n",
" line-height: 145%;\n",
" font-size: 90%;\n",
" }\n",
" li {\n",
" margin-bottom: 1em;\n",
" }\n",
" h1 {\n",
" font-family: Helvetica, serif;\n",
" }\n",
" h4{\n",
" margin-top: 12px;\n",
" margin-bottom: 3px;\n",
" }\n",
" div.text_cell_render{\n",
" font-family: Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
" line-height: 145%;\n",
" font-size: 130%;\n",
" width: 90%;\n",
" margin-left:auto;\n",
" margin-right:auto;\n",
" }\n",
" .CodeMirror{\n",
" font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
" }\n",
"/* .prompt{\n",
" display: None;\n",
" }*/\n",
" .text_cell_render h5 {\n",
" font-weight: 300;\n",
" font-size: 16pt;\n",
" color: #4057A1;\n",
" font-style: italic;\n",
" margin-bottom: 0.5em;\n",
" margin-top: 0.5em;\n",
" display: block;\n",
" }\n",
"\n",
" .warning{\n",
" color: rgb( 240, 20, 20 )\n",
" }\n",
"</style>\n",
"<script>\n",
" MathJax.Hub.Config({\n",
" TeX: {\n",
" extensions: [\"AMSmath.js\"]\n",
" },\n",
" tex2jax: {\n",
" inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
" displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
" },\n",
" displayAlign: 'center', // Change this to 'center' to center equations.\n",
" \"HTML-CSS\": {\n",
" styles: {'.MathJax_Display': {\"margin\": 4}}\n",
" }\n",
" });\n",
"</script>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 1,
"text": [
"<IPython.core.display.HTML at 0x3fe4240>"
]
}
],
"prompt_number": 1
}
],
"metadata": {}
}
]
}

Binary file not shown. Size: 9.0 KiB

Binary file not shown. Size: 183 KiB

Binary file not shown. Size: 28 KiB