updated intro

main
Prince Joseph Javier 2021-02-23 00:15:16 +08:00
parent 1f339967ce
commit 04596cab4d
34 changed files with 374 additions and 53 deletions

View File

@ -789,7 +789,7 @@
" <td>2.54</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Simplex Method</td>\n",
" <td>Simplex Method (pending validation)</td>\n",
" <td>1.53</td>\n",
" </tr>\n",
" <tr>\n",
@ -824,7 +824,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
"version": "3.7.9"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,

Binary file not shown.

View File

@ -318,9 +318,26 @@
</li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#benchmark-methods">
4. Benchmark Methods
</a>
<ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry">
<a class="reference internal nav-link" href="#naive-method">
Naïve Method
</a>
</li>
<li class="toc-h3 nav-item toc-entry">
<a class="reference internal nav-link" href="#seasonal-naive-method">
Seasonal Naïve Method
</a>
</li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#evaluation-metrics-for-forecast-accuracy">
3. Evaluation Metrics for Forecast Accuracy
5. Evaluation Metrics for Forecast Accuracy
</a>
<ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry">
@ -345,6 +362,11 @@
</li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry">
<a class="reference internal nav-link" href="#summary-of-forecast-accuracy-for-jena-climate-dataset">
6. Summary of Forecast Accuracy for Jena Climate Dataset
</a>
</li>
</ul>
</nav>
@ -381,6 +403,8 @@
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span>
<span class="kn">import</span> <span class="nn">statsmodels.graphics.tsaplots</span> <span class="k">as</span> <span class="nn">tg</span>
<span class="o">%</span><span class="k">matplotlib</span> inline
<span class="n">plt</span><span class="o">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s1">&#39;figure.figsize&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="mi">15</span><span class="p">,</span> <span class="mi">2</span><span class="p">]</span>
</pre></div>
</div>
</div>
@ -709,8 +733,31 @@
<p>with the characteristic polynomials as defined above.</p>
</div>
</div>
<div class="section" id="benchmark-methods">
<h2>4. Benchmark Methods<a class="headerlink" href="#benchmark-methods" title="Permalink to this headline"></a></h2>
<p>To measure the performance of a forecasting model properly, we first need to establish baselines. This section introduces two simple methods that serve as benchmarks. Any forecasting method we develop should beat these benchmarks; otherwise, it is not worth considering.</p>
<p>In the notation below, <span class="math notranslate nohighlight">\(T\)</span> refers to the length of the time series and <span class="math notranslate nohighlight">\(h\)</span> refers to the prediction horizon.</p>
<div class="section" id="naive-method">
<h3>Naïve Method<a class="headerlink" href="#naive-method" title="Permalink to this headline"></a></h3>
<p>Forecasts of all future values are equal to the last observation.</p>
<div class="amsmath math notranslate nohighlight">
\[\begin{align*}
\hat{y}_{T+h} &amp;= y_T
\end{align*}\]</div>
</div>
<div class="section" id="seasonal-naive-method">
<h3>Seasonal Naïve Method<a class="headerlink" href="#seasonal-naive-method" title="Permalink to this headline"></a></h3>
<p>Forecasts are equal to the last observed value from the same season of the year (e.g. the same month of the previous year).</p>
<div class="amsmath math notranslate nohighlight">
\[\begin{align*}
\hat{y}_{T+h} &amp;= y_{T+h-m(k+1)}
\end{align*}\]</div>
<p>where <span class="math notranslate nohighlight">\(m\)</span> is the seasonal period and <span class="math notranslate nohighlight">\(k\)</span> is the integer part of <span class="math notranslate nohighlight">\((h-1)/m\)</span> (i.e. the number of complete years in the forecast period prior to time <span class="math notranslate nohighlight">\(T+h\)</span>).</p>
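<p>As an example, if we were forecasting a monthly time series with a yearly seasonal period (<span class="math notranslate nohighlight">\(m = 12\)</span>), the forecast for all future February values is simply equal to the last observed February value. With daily data and a weekly seasonal period (<span class="math notranslate nohighlight">\(m = 7\)</span>), the forecast of all future Friday values is equal to the last observed Friday value. And so on.</p>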
</div>
</div>
<div class="section" id="evaluation-metrics-for-forecast-accuracy">
<h2>3. Evaluation Metrics for Forecast Accuracy<a class="headerlink" href="#evaluation-metrics-for-forecast-accuracy" title="Permalink to this headline"></a></h2>
<h2>5. Evaluation Metrics for Forecast Accuracy<a class="headerlink" href="#evaluation-metrics-for-forecast-accuracy" title="Permalink to this headline"></a></h2>
<p>Forecasting is one of the most common inference tasks in time series analysis. In order to properly gauge the performance of a time series model, it is common practice to divide the dataset into two parts: training and test data. Model parameters are estimated using training data, then the models are used to generate forecasts that are evaluated against the test data.</p>
<p>Error statistics come in different flavors, each with their own advantages and disadvantages.</p>
<div class="section" id="mean-absolute-error">
@ -746,6 +793,47 @@
\end{align*}\]</div>
</div>
</div>
<div class="section" id="summary-of-forecast-accuracy-for-jena-climate-dataset">
<h2>6. Summary of Forecast Accuracy for Jena Climate Dataset<a class="headerlink" href="#summary-of-forecast-accuracy-for-jena-climate-dataset" title="Permalink to this headline"></a></h2>
<p>This handbook covers several time series forecasting methods and compares their performance on the Jena Climate Dataset. Each method forecasts the temperature variable (in degrees Celsius). The table below summarizes the forecast accuracy of each method.</p>
<table>
<thead>
<tr>
<th>Method</th>
<th>Average MAE (Celsius)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Naïve</td>
<td>3.18</td>
</tr>
<tr>
<td>Seasonal Naïve</td>
<td>2.61</td>
</tr>
<tr>
<td>Linear Regression</td>
<td>2.86</td>
</tr>
<tr>
<td>ARIMA</td>
<td>3.19</td>
</tr>
<tr>
<td>VAR</td>
<td>2.54</td>
</tr>
<tr>
<td>Simplex Method (pending validation)</td>
<td>1.53</td>
</tr>
<tr>
<td>LightGBM</td>
<td>2.08</td>
</tr>
</tbody>
</table></div>
</div>
<script type="text/x-thebe-config">

Binary file not shown.


File diff suppressed because one or more lines are too long

View File

@ -23,6 +23,8 @@ import matplotlib.pyplot as plt
import statsmodels.graphics.tsaplots as tg
%matplotlib inline
plt.rcParams['figure.figsize'] = [15, 2]
## 1. General Introduction
Most of us have heard about the new buzz in the market: cryptocurrency. Many of us have invested in these coins too. But is investing money in such a volatile currency safe? How can we be sure that investing in these coins now will generate a healthy profit in the future? We can't be sure, but we can generate an approximate value based on previous prices. Time series models are one way to predict them.
@ -322,7 +324,33 @@ $\phi(\mathbf{B})X_t = \theta(\mathbf{B})\epsilon_t$,
with the characteristic polynomials as defined above.
## 3. Evaluation Metrics for Forecast Accuracy
## 4. Benchmark Methods
To measure the performance of a forecasting model properly, we first need to establish baselines. This section introduces two simple methods that serve as benchmarks. Any forecasting method we develop should beat these benchmarks; otherwise, it is not worth considering.
In the notation below, $T$ refers to the length of the time series and $h$ refers to the prediction horizon.
### Naïve Method
Forecasts of all future values are equal to the last observation.
\begin{align*}
\hat{y}_{T+h} &= y_T
\end{align*}
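
The naïve rule above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `naive_forecast` and the sample values are hypothetical and do not come from the handbook's notebooks.

```python
import numpy as np


def naive_forecast(y, h):
    """Return h forecasts, each equal to the last observation y_T.

    `naive_forecast` is a hypothetical helper written for illustration.
    """
    y = np.asarray(y, dtype=float)
    if y.size == 0:
        raise ValueError("y must contain at least one observation")
    if h < 1:
        raise ValueError("h must be a positive integer")
    # Every horizon 1..h receives the same value y_T.
    return np.full(h, y[-1])


# Made-up temperature readings, used only to exercise the function.
history = [20.1, 21.4, 19.8, 22.3]
forecast = naive_forecast(history, h=3)
```

Because the forecast ignores everything except the final observation, it is cheap to compute and makes a natural lower bar for more elaborate models.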
### Seasonal Naïve Method
Forecasts are equal to the last observed value from the same season of the year (e.g. the same month of the previous year).
\begin{align*}
\hat{y}_{T+h} &= y_{T+h-m(k+1)}
\end{align*}
where $m$ is the seasonal period and $k$ is the integer part of $(h-1)/m$ (i.e. the number of complete years in the forecast period prior to time $T+h$).
As an example, if we were forecasting a monthly time series with a yearly seasonal period ($m = 12$), the forecast for all future February values is simply equal to the last observed February value. With daily data and a weekly seasonal period ($m = 7$), the forecast of all future Friday values is equal to the last observed Friday value. And so on.
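
The index arithmetic in the seasonal naïve formula is easy to get wrong, so a small sketch may help. It assumes a one-dimensional series with at least one full seasonal period of history; the function name `seasonal_naive_forecast` and the toy data are hypothetical.

```python
import numpy as np


def seasonal_naive_forecast(y, h, m):
    """Forecast horizons 1..h with y_{T+h-m(k+1)}, where k = (h-1) // m.

    `seasonal_naive_forecast` is a hypothetical helper for illustration.
    """
    y = np.asarray(y, dtype=float)
    T = y.size
    if m < 1 or h < 1:
        raise ValueError("m and h must be positive integers")
    if T < m:
        raise ValueError("need at least one full seasonal period of history")
    steps = np.arange(1, h + 1)          # horizons 1, 2, ..., h
    k = (steps - 1) // m                 # integer part of (h - 1) / m
    # The formula uses 1-based time indices; subtract 1 for 0-based arrays.
    idx = T + steps - m * (k + 1) - 1
    return y[idx]


# Two "years" of a toy series with seasonal period m = 4.
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
forecast = seasonal_naive_forecast(history, h=6, m=4)
```

With this toy input the last full season is `[5, 6, 7, 8]`, so the six forecasts cycle through that season and wrap around after four steps.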
## 5. Evaluation Metrics for Forecast Accuracy
Forecasting is one of the most common inference tasks in time series analysis. In order to properly gauge the performance of a time series model, it is common practice to divide the dataset into two parts: training and test data. Model parameters are estimated using training data, then the models are used to generate forecasts that are evaluated against the test data.
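
A minimal sketch of this train/test workflow is shown below. It assumes a chronological split (the most recent observations are held out), uses a naïve forecast as the stand-in model, and scores it with the mean absolute error. The helper names and the sample series are hypothetical.

```python
import numpy as np


def split_series(y, test_size):
    """Hold out the last `test_size` observations as test data (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    if not 0 < test_size < y.size:
        raise ValueError("test_size must be between 1 and len(y) - 1")
    return y[:-test_size], y[-test_size:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between forecasts and actual values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))


# Made-up series; the model is "fitted" on train only.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 12.0]
train, test = split_series(series, test_size=2)

# Stand-in model: repeat the last training observation over the test horizon.
forecast = np.full(test.size, train[-1])
mae = mean_absolute_error(test, forecast)
```

The split is never shuffled: time order is preserved so that the model only sees observations that precede the ones it is asked to forecast.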
@ -360,3 +388,46 @@ One particular disadvantage of MAPE is that it puts a larger penalty on negative
\text{SMAPE} &= \frac{1}{n}\sum_{t=1}^{n} \frac{|\hat{y}_t - y_t|}{|y_t| + |\hat{y}_t|}
\end{align*}
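
The SMAPE definition above translates directly into NumPy. The sketch below follows that exact form (no factor of 2 or 100). Treating a term as zero when both the actual and the forecast are zero is an assumption made here to avoid division by zero; it is not stated in the text.

```python
import numpy as np


def smape(y_true, y_pred):
    """SMAPE as mean(|y_hat - y| / (|y| + |y_hat|)); hypothetical helper.

    Assumption: terms where |y| + |y_hat| == 0 contribute 0 to the mean.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    numerator = np.abs(y_pred - y_true)
    denominator = np.abs(y_true) + np.abs(y_pred)
    # Divide only where the denominator is non-zero; other terms stay 0.
    ratio = np.divide(
        numerator,
        denominator,
        out=np.zeros_like(numerator),
        where=denominator != 0,
    )
    return float(ratio.mean())


# Made-up values: the first forecast is off by 1, the second is exact.
score = smape([2.0, 4.0], [1.0, 4.0])
```

In this form each term lies between 0 and 1, so the score is bounded regardless of the scale of the series.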
## 6. Summary of Forecast Accuracy for Jena Climate Dataset
This handbook covers several time series forecasting methods and compares their performance on the Jena Climate Dataset. Each method forecasts the temperature variable (in degrees Celsius). The table below summarizes the forecast accuracy of each method.
<table>
<thead>
<tr>
<th>Method</th>
<th>Average MAE (Celsius)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Naïve</td>
<td>3.18</td>
</tr>
<tr>
<td>Seasonal Naïve</td>
<td>2.61</td>
</tr>
<tr>
<td>Linear Regression</td>
<td>2.86</td>
</tr>
<tr>
<td>ARIMA</td>
<td>3.19</td>
</tr>
<tr>
<td>VAR</td>
<td>2.54</td>
</tr>
<tr>
<td>Simplex Method (pending validation)</td>
<td>1.53</td>
</tr>
<tr>
<td>LightGBM</td>
<td>2.08</td>
</tr>
</tbody>
</table>

Binary file not shown.
