add first notebook and supporting files
parent b600604bd6
commit b6513636d6
@@ -0,0 +1,2 @@
.ipynb_checkpoints/
*.pyc
@@ -0,0 +1,248 @@
{
 "metadata": {
  "name": "",
  "signature": "sha256:3a45466f81f7926609b8d5a7f9daaac6a202c78255a1369eb02391279866cba5"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# What is machine learning, and how does it work?\n",
      "*From the video series: [Introduction to machine learning with scikit-learn](https://github.com/justmarkham/scikit-learn-videos)*"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![Machine learning](images/01_robot.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Agenda\n",
      "\n",
      "- What is machine learning?\n",
      "- What are the two main categories of machine learning?\n",
      "- What are some examples of machine learning?\n",
      "- How does machine learning \"work\"?"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## What is machine learning?\n",
      "\n",
      "One definition: \"Machine learning is the semi-automated extraction of knowledge from data\"\n",
      "\n",
      "- **Knowledge from data**: Starts with a question that might be answerable using data\n",
      "- **Automated extraction**: A computer provides the insight\n",
      "- **Semi-automated**: Requires many smart decisions by a human"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## What are the two main categories of machine learning?\n",
      "\n",
      "**Supervised learning**: Making predictions using data\n",
      " \n",
      "- Example: Is a given email \"spam\" or \"ham\"?\n",
      "- There is an outcome we are trying to predict"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![Spam filter](images/01_spam_filter.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**Unsupervised learning**: Extracting structure from data\n",
      "\n",
      "- Example: Segment grocery store shoppers into clusters that exhibit similar behaviors\n",
      "- There is no \"right answer\""
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![Clustering](images/01_clustering.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## How does machine learning \"work\"?\n",
      "\n",
      "High-level steps of supervised learning:\n",
      "\n",
      "1. First, train a **machine learning model** using **labeled data**\n",
      "\n",
      "    - \"Labeled data\" has been labeled with the outcome\n",
      "    - \"Machine learning model\" learns the relationship between the attributes of the data and its outcome\n",
      "\n",
      "2. Then, make **predictions** on **new data** for which the label is unknown"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![Supervised learning](images/01_supervised_learning.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The primary goal of supervised learning is to build a model that \"generalizes\": It accurately predicts the **future** rather than the **past**!"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Questions about machine learning\n",
      "\n",
      "- How do I choose **which attributes** of my data to include in the model?\n",
      "- How do I choose **which model** to use?\n",
      "- How do I **optimize** this model for best performance?\n",
      "- How do I ensure that I'm building a model that will **generalize** to unseen data?\n",
      "- Can I **estimate** how well my model is likely to perform on unseen data?"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Resources\n",
      "\n",
      "- Book: [An Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) (section 2.1, 14 pages)\n",
      "- Video: [Learning Paradigms](http://work.caltech.edu/library/014.html) (13 minutes)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Comments or Questions?\n",
      "\n",
      "- Email: <kevin@dataschool.io>\n",
      "- Website: http://dataschool.io\n",
      "- Twitter: [@justmarkham](https://twitter.com/justmarkham)"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from IPython.core.display import HTML\n",
      "def css_styling():\n",
      "    styles = open(\"styles/custom.css\", \"r\").read()\n",
      "    return HTML(styles)\n",
      "css_styling()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<style>\n",
        "    @font-face {\n",
        "        font-family: \"Computer Modern\";\n",
        "        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
        "    }\n",
        "    div.cell{\n",
        "        width: 90%;\n",
        "/* margin-left:auto;*/\n",
        "/* margin-right:auto;*/\n",
        "    }\n",
        "    ul {\n",
        "        line-height: 145%;\n",
        "        font-size: 90%;\n",
        "    }\n",
        "    li {\n",
        "        margin-bottom: 1em;\n",
        "    }\n",
        "    h1 {\n",
        "        font-family: Helvetica, serif;\n",
        "    }\n",
        "    h4{\n",
        "        margin-top: 12px;\n",
        "        margin-bottom: 3px;\n",
        "    }\n",
        "    div.text_cell_render{\n",
        "        font-family: Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
        "        line-height: 145%;\n",
        "        font-size: 130%;\n",
        "        width: 90%;\n",
        "        margin-left:auto;\n",
        "        margin-right:auto;\n",
        "    }\n",
        "    .CodeMirror{\n",
        "        font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
        "    }\n",
        "/* .prompt{\n",
        "        display: None;\n",
        "    }*/\n",
        "    .text_cell_render h5 {\n",
        "        font-weight: 300;\n",
        "        font-size: 16pt;\n",
        "        color: #4057A1;\n",
        "        font-style: italic;\n",
        "        margin-bottom: 0.5em;\n",
        "        margin-top: 0.5em;\n",
        "        display: block;\n",
        "    }\n",
        "\n",
        "    .warning{\n",
        "        color: rgb( 240, 20, 20 )\n",
        "    }\n",
        "</style>\n",
        "<script>\n",
        "    MathJax.Hub.Config({\n",
        "        TeX: {\n",
        "            extensions: [\"AMSmath.js\"]\n",
        "        },\n",
        "        tex2jax: {\n",
        "            inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
        "            displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
        "        },\n",
        "        displayAlign: 'center', // Change this to 'center' to center equations.\n",
        "        \"HTML-CSS\": {\n",
        "            styles: {'.MathJax_Display': {\"margin\": 4}}\n",
        "        }\n",
        "    });\n",
        "</script>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 1,
       "text": [
        "<IPython.core.display.HTML at 0x3edad30>"
       ]
      }
     ],
     "prompt_number": 1
    }
   ],
   "metadata": {}
  }
 ]
}
Binary file not shown. Size: 40 KiB
Binary file not shown. Size: 66 KiB
Binary file not shown. Size: 58 KiB
Binary file not shown. Size: 37 KiB
@@ -0,0 +1,67 @@
<style>
    @font-face {
        font-family: "Computer Modern";
        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');
    }
    div.cell{
        width: 90%;
/* margin-left:auto;*/
/* margin-right:auto;*/
    }
    ul {
        line-height: 145%;
        font-size: 90%;
    }
    li {
        margin-bottom: 1em;
    }
    h1 {
        font-family: Helvetica, serif;
    }
    h4{
        margin-top: 12px;
        margin-bottom: 3px;
    }
    div.text_cell_render{
        font-family: Computer Modern, "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif;
        line-height: 145%;
        font-size: 130%;
        width: 90%;
        margin-left:auto;
        margin-right:auto;
    }
    .CodeMirror{
        font-family: "Source Code Pro", source-code-pro,Consolas, monospace;
    }
/* .prompt{
        display: None;
    }*/
    .text_cell_render h5 {
        font-weight: 300;
        font-size: 16pt;
        color: #4057A1;
        font-style: italic;
        margin-bottom: 0.5em;
        margin-top: 0.5em;
        display: block;
    }

    .warning{
        color: rgb( 240, 20, 20 )
    }
</style>
<script>
    MathJax.Hub.Config({
        TeX: {
            extensions: ["AMSmath.js"]
        },
        tex2jax: {
            inlineMath: [ ['$','$'], ["\\(","\\)"] ],
            displayMath: [ ['$$','$$'], ["\\[","\\]"] ]
        },
        displayAlign: 'center', // Change this to 'center' to center equations.
        "HTML-CSS": {
            styles: {'.MathJax_Display': {"margin": 4}}
        }
    });
</script>