Links to Medium articles

pull/4/head
vaclavdekanovsky 2021-01-26 21:46:50 +01:00
parent f3e73b33f8
commit 39f7be67a8
5 changed files with 18 additions and 3 deletions

@@ -5,7 +5,9 @@
"metadata": {},
"source": [
"# Julia Proof of Concept\n",
"In this notebook we explore processing of csv file in julia. We will load two files, join them, group by and aggregate and sort the results. In the end we run all the steps 7 times as a performance test. We also explore that julia needs to compile the code only once. "
"In this notebook we explore processing of csv file in julia. We will load two files, join them, group by and aggregate and sort the results. In the end we run all the steps 7 times as a performance test. We also explore that julia needs to compile the code only once. \n",
"\n",
"Article: https://towardsdatascience.com/is-something-better-than-pandas-when-the-dataset-fits-the-memory-7e8e983c4fe5"
]
},
{

@@ -1,3 +1,4 @@
# article https://towardsdatascience.com/is-something-better-than-pandas-when-the-dataset-fits-the-memory-7e8e983c4fe5
import sys
import os
import gc

@@ -5,7 +5,9 @@
"metadata": {},
"source": [
"# Process performance test log\n",
"In this notebook we process the performance test results generated by `Performance_test.py`. "
"In this notebook we process the performance test results generated by `Performance_test.py`. \n",
"\n",
"Article: https://towardsdatascience.com/is-something-better-than-pandas-when-the-dataset-fits-the-memory-7e8e983c4fe5"
]
},
{

@@ -5,7 +5,9 @@
"metadata": {},
"source": [
"# Pandas alternatives Proof of Concept\n",
"In this notebook we will explore, if it's worth using pandas alternatives - vaex, spark, dask or modin if our dataset fits comfortably into the memory. We will load two files, join them, group by and aggregate and sort the results. The file `Performance_test.py` contains all these steps executable as a performance test. "
"In this notebook we will explore, if it's worth using pandas alternatives - vaex, spark, dask or modin if our dataset fits comfortably into the memory. We will load two files, join them, group by and aggregate and sort the results. The file `Performance_test.py` contains all these steps executable as a performance test. \n",
"\n",
"Article: https://towardsdatascience.com/is-something-better-than-pandas-when-the-dataset-fits-the-memory-7e8e983c4fe5"
]
},
{

@@ -1,2 +1,10 @@
# data-analysis-in-examples
Set of Jupyter notebook supporting articles on https://medium.com/@vdekanovsky
# Python
## Machine Learning
- [Cross Validation and train test splitting](https://towardsdatascience.com/complete-guide-to-pythons-cross-validation-with-examples-a9676b5cac12) ([code](https://github.com/vaasha/Machine-leaning-in-examples/blob/master/sklearn/cross-validation/Cross%20Validation.ipynb))
# Julia
- [Is Julia better than pandas](https://towardsdatascience.com/is-something-better-than-pandas-when-the-dataset-fits-the-memory-7e8e983c4fe5) ([code](https://github.com/vaclavdekanovsky/data-analysis-in-examples/tree/master/DataFrames/Pandas_Alternatives))
- [Read CSV in Julia](https://towardsdatascience.com/read-csv-to-data-frame-in-julia-programming-lang-77f3d0081c14) ([code](https://github.com/vaclavdekanovsky/data-analysis-in-examples/blob/master/Julia/CSV/Read_CSV.ipynb))