nyaggle is a utility library for Kaggle and offline competitions. It focuses on experiment tracking, feature engineering, and validation.
Taiga Noumi 38a1bd6b69 Create pythonpublish.yml 2019-12-29 00:07:24 +09:00
.github/workflows Create pythonpublish.yml 2019-12-29 00:07:24 +09:00
docs Create pythonpublish.yml 2019-12-29 00:07:24 +09:00
nyaggle refactoring 2019-12-28 17:25:03 +09:00
tests add test 2019-12-28 17:24:13 +09:00
.gitignore add GitHub action, python 3.5 support 2019-12-25 23:52:25 +09:00
.readthedocs.yml add readthedocs.yml 2019-12-26 23:19:54 +09:00
LICENSE Initial commit 2019-12-19 11:01:20 +09:00
MANIFEST.in Create pythonpublish.yml 2019-12-29 00:07:24 +09:00
README.md add comment 2019-12-27 23:56:31 +09:00
requirements.txt implement cv 2019-12-25 23:52:25 +09:00
setup.py add tags 2019-12-28 17:24:33 +09:00

README.md

nyaggle

nyaggle is a utility library for Kaggle and offline competitions, particularly focused on feature engineering and validation. See the documentation for details.

Installation

You can install nyaggle via pip:

$ pip install nyaggle

Examples

Feature Engineering

Target Encoding with K-Fold
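The idea behind K-fold target encoding can be illustrated without any dependencies. The sketch below is not nyaggle's implementation or API. The helper names `kfold_indices` and `target_encode_oof` are hypothetical, and the folds are assumed to be contiguous and unshuffled. Each row is encoded with the mean target of its category, computed only from rows in the other folds, so a row never sees its own label.

```python
from collections import defaultdict


def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, valid_idx) pairs over contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
                  for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        valid_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, valid_idx
        start += size


def target_encode_oof(categories, targets, n_splits=2):
    """Encode each row with the out-of-fold target mean of its category."""
    encoded = [0.0] * len(categories)
    for train_idx, valid_idx in kfold_indices(len(categories), n_splits):
        sums, counts = defaultdict(float), defaultdict(int)
        for i in train_idx:
            sums[categories[i]] += targets[i]
            counts[categories[i]] += 1
        # Fallback for categories unseen in the training folds:
        # the mean target of the training folds (an illustrative choice).
        fallback = sum(targets[i] for i in train_idx) / len(train_idx)
        for i in valid_idx:
            cat = categories[i]
            encoded[i] = sums[cat] / counts[cat] if counts[cat] else fallback
    return encoded


# Rows 0-1 are encoded from rows 2-3 and vice versa.
print(target_encode_oof(["a", "b", "a", "b"], [1, 0, 0, 1], n_splits=2))
```

In practice the same pattern is usually applied to a pandas DataFrame with a scikit-learn splitter. The test set is then encoded with statistics fitted on the full training set.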

Text Vectorization using BERT

To use BertSentenceVectorizer, you need to install PyTorch in your virtual environment. MeCab and mecab-python3 are also required if you use a Japanese BERT model.

Model Validation

cv() provides a handy API that computes K-fold CV scores, out-of-fold predictions, and test predictions in a single call. You can pass LGBMClassifier/LGBMRegressor or any other scikit-learn model.
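To make these three outputs concrete, here is a minimal, dependency-free sketch of what such a routine computes. It does not reproduce nyaggle's cv() signature or behavior. `MeanRegressor`, `mean_squared_error`, and `cross_validate_sketch` are hypothetical names, and the interleaved fold assignment and fold-averaged test prediction are illustrative assumptions.

```python
class MeanRegressor:
    """Toy stand-in for any model with fit/predict; predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate_sketch(make_model, X, y, X_test, n_splits=2):
    """Return (out-of-fold predictions, averaged test predictions, fold scores)."""
    n = len(X)
    oof = [0.0] * n
    test_pred = [0.0] * len(X_test)
    scores = []
    for k in range(n_splits):
        # Interleaved folds: row i belongs to fold i % n_splits.
        valid_idx = list(range(k, n, n_splits))
        valid_set = set(valid_idx)
        train_idx = [i for i in range(n) if i not in valid_set]

        # Train a fresh model on the training folds only.
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])

        # Out-of-fold prediction: each row is predicted by a model that never saw it.
        valid_pred = model.predict([X[i] for i in valid_idx])
        for i, p in zip(valid_idx, valid_pred):
            oof[i] = p
        scores.append(mean_squared_error([y[i] for i in valid_idx], valid_pred))

        # Test prediction: average the per-fold models' predictions.
        for j, p in enumerate(model.predict(X_test)):
            test_pred[j] += p / n_splits
    return oof, test_pred, scores


oof, test_pred, scores = cross_validate_sketch(
    MeanRegressor, [[0], [1], [2], [3]], [1.0, 2.0, 3.0, 4.0], [[4], [5]])
print(oof, test_pred, scores)
```

The out-of-fold vector has one prediction per training row, which makes it usable both for scoring and as a stacking feature. The averaged test prediction avoids refitting on the full data.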