Compare commits
8 commits — `main` ... `documentat`

| Author | SHA1 | Date |
|---|---|---|
| Ran Aroussi | a6554c637a | |
| Ran Aroussi | 5f3ab6b893 | |
| silvavn | ccd5d95566 | |
| silvavn | a53349f886 | |
| silvavn | 37c6ca1086 | |
| silvavn | 852ef93fa3 | |
| Ran Aroussi | 932b3a1731 | |
| Ran Aroussi | 2339aade13 | |
@@ -6,6 +6,4 @@ yfinance.egg-info
.coverage
.vscode/
build/
*.html
*.css
*.png
site/
@@ -0,0 +1,132 @@

Advanced Usage
==============

Using Proxies
-------------

If you want to use a proxy server for downloading data, use:

``` python
import yfinance as yf

msft = yf.Ticker("MSFT")

msft.history(..., proxy="PROXY_SERVER")
msft.get_actions(proxy="PROXY_SERVER")
msft.get_dividends(proxy="PROXY_SERVER")
msft.get_splits(proxy="PROXY_SERVER")
msft.get_balance_sheet(proxy="PROXY_SERVER")
msft.get_cashflow(proxy="PROXY_SERVER")
msft.option_chain(..., proxy="PROXY_SERVER")
...
```
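The `proxy` value is ultimately handed to `requests`, which expects a mapping of URL scheme to proxy URL. A minimal sketch of configuring such a session (the proxy address below is made up, and no request is actually sent):

``` python
import requests

# Hypothetical proxy address; replace with your own server.
proxies = {
    "http": "http://127.0.0.1:1080",
    "https": "http://127.0.0.1:1080",
}

session = requests.Session()
session.proxies.update(proxies)  # every request on this session now uses the proxy
print(session.proxies["https"])  # http://127.0.0.1:1080
```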
To use a custom `requests` session (for example, to cache calls to the
API or customize the `User-agent` header), pass a `session=` argument to
the `Ticker` constructor:

``` python
import requests_cache
import yfinance as yf

session = requests_cache.CachedSession('yfinance.cache')
session.headers['User-agent'] = 'my-program/1.0'
ticker = yf.Ticker('msft', session=session)
# The scraped response will be stored in the cache
ticker.actions
```
To initialize multiple `Ticker` objects, use:

``` python
import yfinance as yf

tickers = yf.Tickers('msft aapl goog')
# ^ returns a named tuple of Ticker objects

# access each ticker using (example)
tickers.tickers.MSFT.info
tickers.tickers.AAPL.history(period="1mo")
tickers.tickers.GOOG.actions
```
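The attribute-style access above relies on the upper-cased symbol names. A rough sketch of how a space-separated ticker string maps to those names (illustrative only; the library's own parsing may differ in detail):

``` python
# Upper-case and split the string into individual symbols.
symbols = "msft aapl goog".upper().split()
print(symbols)  # ['MSFT', 'AAPL', 'GOOG']
```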
Fetching data for multiple tickers
----------------------------------

``` python
import yfinance as yf
data = yf.download("SPY AAPL", start="2017-01-01", end="2017-04-30")
```

I've also added some options to make life easier :)
``` python
import yfinance as yf
data = yf.download(  # or pdr.get_data_yahoo(...
    # tickers list or string as well
    tickers="SPY AAPL MSFT",

    # use "period" instead of start/end
    # valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max
    # (optional, default is '1mo')
    period="ytd",

    # fetch data by interval (including intraday if period < 60 days)
    # valid intervals: 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo
    # (optional, default is '1d')
    interval="1m",

    # group by ticker (to access via data['SPY'])
    # (optional, default is 'column')
    group_by="ticker",

    # adjust all OHLC automatically
    # (optional, default is False)
    auto_adjust=True,

    # download pre/post regular market hours data
    # (optional, default is False)
    prepost=True,

    # use threads for mass downloading? (True/False/Integer)
    # (optional, default is True)
    threads=True,

    # proxy URL scheme to use when downloading
    # (optional, default is None)
    proxy=None
)
```
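To see what `group_by="ticker"` buys you, here is a small synthetic frame with the same two-level column layout a grouped download produces (the numbers are made up), and how `data['SPY']` selects one ticker's columns:

``` python
import pandas as pd

# Synthetic stand-in for a group_by="ticker" download result (numbers made up).
columns = pd.MultiIndex.from_product([["SPY", "AAPL"], ["Open", "Close"]])
data = pd.DataFrame([[430.0, 431.5, 190.0, 191.2]], columns=columns)

spy = data["SPY"]  # selecting the top level leaves a single-level frame
print(list(spy.columns))  # ['Open', 'Close']
```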
Managing Multi-Level Columns
----------------------------

The following Stack Overflow answer addresses [How to deal with
multi-level column names downloaded with
yfinance?](https://stackoverflow.com/questions/63107801)

- `yfinance` returns a `pandas.DataFrame` with multi-level column
  names, with a level for the ticker and a level for the stock price
  data
- The answer discusses:
  - How to correctly read the multi-level columns after
    saving the dataframe to a csv with `pandas.DataFrame.to_csv`
  - How to download single or multiple tickers into a single
    dataframe with single-level column names and a ticker column
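One way to get the single-level layout with a ticker column, sketched on a synthetic frame (the values are made up; this is one approach, not necessarily the one in the linked answer):

``` python
import pandas as pd

# Synthetic two-ticker frame with yfinance-style multi-level columns.
columns = pd.MultiIndex.from_product([["SPY", "AAPL"], ["Open", "Close"]])
data = pd.DataFrame([[430.0, 431.5, 190.0, 191.2]],
                    index=pd.to_datetime(["2017-01-03"]), columns=columns)

# Move the ticker level out of the columns into an ordinary column.
flat = (data.stack(level=0)
            .rename_axis(["Date", "Ticker"])
            .reset_index(level="Ticker"))
print(sorted(flat.columns))  # ['Close', 'Open', 'Ticker']
```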
`pandas_datareader` override
----------------------------

If your code uses `pandas_datareader` and you want to download data
faster, you can "hijack" the `pandas_datareader.data.get_data_yahoo()`
method to use **yfinance** while making sure the returned data is in the
same format as **pandas_datareader**'s `get_data_yahoo()`.

``` python
from pandas_datareader import data as pdr
import yfinance as yf

yf.pdr_override()  # <== that's all it takes :-)

# download dataframe
data = pdr.get_data_yahoo("SPY", start="2017-01-01", end="2017-04-30")
```
@@ -3,13 +3,13 @@ Installation

Install `yfinance` using `pip`:

-``` {.sourceCode .bash}
+``` bash
$ pip install yfinance --upgrade --no-cache-dir
```

Install `yfinance` using `conda`:

-``` {.sourceCode .bash}
+``` bash
$ conda install -c ranaroussi yfinance
```
@@ -9,7 +9,7 @@ Pythonic way:

Note: Yahoo Finance datetimes are received as UTC.

-``` {.sourceCode .python}
+``` python
import yfinance as yf

msft = yf.Ticker("MSFT")
@@ -70,129 +70,4 @@ msft.options

# get option chain for specific expiration
opt = msft.option_chain('YYYY-MM-DD')
# data available via: opt.calls, opt.puts
```

The removed lines are the proxy, custom-session, multi-ticker,
download-options, multi-level-column and `pandas_datareader` material,
which moves verbatim to the new Advanced Usage page.
mkdocs.yml (39 lines changed)

@@ -1,19 +1,26 @@
-# site_name: My Docs
+site_name: My Docs

-# # mkdocs.yml
-# theme:
-#     name: "material"
+# mkdocs.yml
+theme:
+    name: "material"

-# plugins:
-#     - search
-#     - mkdocstrings
+plugins:
+    - search
+    - mkdocstrings

-# nav:
-#     - Introduction: 'index.md'
-#     - Installation: 'installation.md'
-#     - Quick Start: 'quickstart.md'
-# #   - Ticker: 'Ticker.md'
-#     - TickerBase: 'TickerBase.md'
-# #   - Tickers: 'Tickers.md'
-#     - utils: 'utils.md'
-#     - multi: 'multi.md'
+markdown_extensions:
+    - pymdownx.highlight
+    - pymdownx.inlinehilite
+    - pymdownx.superfences
+    - pymdownx.snippets
+
+nav:
+    - Introduction: 'index.md'
+    - Installation: 'installation.md'
+    - Quick Start: 'quickstart.md'
+    - Advanced Usage: 'advancedUsage.md'
+    # - Ticker: 'Ticker.md'
+    - TickerBase: 'TickerBase.md'
+    # - Tickers: 'Tickers.md'
+    - utils: 'utils.md'
+    - multi: 'multi.md'
@@ -26,7 +26,6 @@ import re as _re
import pandas as _pd
import numpy as _np
import sys as _sys
import re as _re

try:
    import ujson as _json

@@ -34,9 +33,15 @@ except ImportError:
    import json as _json

-user_agent_headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
+user_agent_headers = {
+    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}


def empty_df(index=[]):
    '''
    The "empty_df" function creates a pandas dataframe with the dates as the
    index and columns for open, high, low, close, adj close and volume.
    It is used to create an empty dataframe that will be filled later on.
    '''
    empty = _pd.DataFrame(index=index, data={
        'Open': _np.nan, 'High': _np.nan, 'Low': _np.nan,
        'Close': _np.nan, 'Adj Close': _np.nan, 'Volume': _np.nan})
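The construction in `empty_df` can be exercised standalone; a minimal sketch of the same NaN-filled frame with plain pandas (avoiding the mutable default argument for illustration):

``` python
import numpy as np
import pandas as pd

# Same shape as the diff's empty_df(): one NaN-filled column per price field.
def empty_df(index=None):
    return pd.DataFrame(index=index if index is not None else [], data={
        'Open': np.nan, 'High': np.nan, 'Low': np.nan,
        'Close': np.nan, 'Adj Close': np.nan, 'Volume': np.nan})

df = empty_df(pd.to_datetime(["2017-01-03"]))
print(list(df.columns))  # ['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']
```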
@@ -45,12 +50,25 @@ def empty_df(index=[]):


def get_html(url, proxy=None, session=None):
    '''
    url: the website you want to visit.
    proxy: a dictionary of your proxies, e.g. {'http': 'http://127.0.0.1:1080', 'https': 'https://127.0.0.1:1080'}
    session: if you have already opened a session with your proxies, pass it in here.
    '''
    session = session or _requests
    html = session.get(url=url, proxies=proxy, headers=user_agent_headers).text
    return html


-def get_json(url, proxy=None, session=None):
+def get_json(url: str, proxy: dict = None, session=None):
    '''
    url: the website we want to get json from.
    proxy: the proxies we use to avoid being detected as a robot by websites.
        A dictionary, e.g. {"http": "http://10.10.1.10:3128"}
    session: requests library object used for sending requests and receiving
        responses in multi-threaded or asynchronous applications.

    The function returns a dictionary of data if it succeeds, otherwise an
    empty dictionary.
    '''

    session = session or _requests
    html = session.get(url=url, proxies=proxy, headers=user_agent_headers).text
@@ -78,6 +96,10 @@ def camel2title(o):


def auto_adjust(data):
    '''
    The "auto_adjust" function adjusts the dataframe according to the adjusted
    close price. It takes a dataframe and returns a new dataframe with
    adjusted prices for all columns except volume.
    '''
    df = data.copy()
    ratio = df["Close"] / df["Adj Close"]
    df["Adj Open"] = df["Open"] / ratio
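The ratio arithmetic above is easy to check on a toy bar (the numbers are made up): dividing each price column by `Close / Adj Close` scales it by the same split/dividend factor as the close.

``` python
import pandas as pd

# Toy bar where the adjusted close is half the raw close
# (e.g. a 2:1 split happened later).
df = pd.DataFrame({"Open": [100.0], "Close": [110.0], "Adj Close": [55.0]})

ratio = df["Close"] / df["Adj Close"]  # 2.0
df["Adj Open"] = df["Open"] / ratio
print(df["Adj Open"].iloc[0])  # 50.0
```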
@@ -98,7 +120,11 @@ def auto_adjust(data):


def back_adjust(data):
-    """ back-adjusted data to mimic true historical prices """
+    '''
+    The function takes a dataframe and returns the same dataframe with
+    adjusted "Open", "High", "Low" and "Close" columns. The ratio is the
+    original Adj Close price divided by the Close price; each adjusted
+    column is multiplied by this ratio to get its new value.
+    '''

    df = data.copy()
    ratio = df["Adj Close"] / df["Close"]
@@ -119,6 +145,12 @@ def back_adjust(data):


def parse_quotes(data, tz=None):
    '''
    The function takes the data from "get_data" and parses it into a pandas
    DataFrame. It uses the timestamps to index each row, with the OHLC,
    volume and adjusted close all contained in one DataFrame.
    If no timezone is specified, it defaults to UTC.
    '''

    timestamps = data["timestamp"]
    ohlc = data["indicators"]["quote"][0]
    volumes = ohlc["volume"]
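The indexing step described above turns epoch-second timestamps into datetimes; a minimal sketch with made-up timestamp values:

``` python
import pandas as pd

# Epoch-second timestamps of the kind the chart payload carries (values made up).
timestamps = [1483459200, 1483545600]
index = pd.to_datetime(timestamps, unit="s")  # parsed as naive UTC datetimes
print(index[0])  # 2017-01-03 16:00:00
```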
@@ -148,6 +180,11 @@ def parse_quotes(data, tz=None):


def parse_actions(data, tz=None):
    '''
    The function takes the data from "get_data" and checks whether there are
    any events (dividends or splits). If so, it creates a pandas DataFrame
    for each event type with the date as index and the values as columns.
    It also converts all dates to datetime objects and makes them timezone
    aware if a timezone was specified.
    '''
    dividends = _pd.DataFrame(columns=["Dividends"])
    splits = _pd.DataFrame(columns=["Stock Splits"])
@@ -180,6 +217,10 @@ def parse_actions(data, tz=None):

class ProgressBar:
    def __init__(self, iterations, text='completed'):
        '''
        The constructor for the class. It stores the parameters passed in
        as instance variables.
        '''
        self.text = text
        self.iterations = iterations
        self.prog_bar = '[]'
@@ -189,6 +230,10 @@ class ProgressBar:
        self.elapsed = 1

    def completed(self):
        """
        Called when the run is completed. It prints out the final progress
        bar and finishes the display.
        """
        if self.elapsed > self.iterations:
            self.elapsed = self.iterations
        self.update_iteration(1)
@@ -197,6 +242,12 @@ class ProgressBar:
        print()

    def animate(self, iteration=None):
        '''
        Called to advance the progress bar. Takes an optional parameter,
        "iteration". If "iteration" is not passed, the "elapsed" counter is
        incremented by 1; otherwise it is taken from "iteration".
        '''
        if iteration is None:
            self.elapsed += 1
            iteration = self.elapsed
@@ -208,12 +259,29 @@ class ProgressBar:
        self.update_iteration()

    def update_iteration(self, val=None):
        '''
        Called to redraw the progress bar. Takes an optional parameter "val".
        If "val" is not passed, it defaults to the fraction
        elapsed / iterations; otherwise the given value is used.
        '''
        val = val if val is not None else self.elapsed / float(self.iterations)
        self.__update_amount(val * 100.0)
        self.prog_bar += ' %s of %s %s' % (
            self.elapsed, self.iterations, self.text)

    def __update_amount(self, new_amount):
        '''
        Called to rebuild the progress bar for "new_amount" percent complete.
        It computes the completed percentage ("percent_done"), the room inside
        the bar ("all_full") and the number of hash marks to display
        ("num_hashes"), then builds the bar string ("prog_bar") and places the
        percentage string ("pct_string") inside it at "pct_place".
        '''
        percent_done = int(round((new_amount / 100.0) * 100.0))
        all_full = self.width - 2
        num_hashes = int(round((percent_done / 100.0) * all_full))
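The hash-mark arithmetic in `__update_amount` can be sketched standalone (the width is made up; the real class stores it on `self.width`):

``` python
# Sketch of the progress-bar maths: how many '#' marks fit for a given percentage.
width = 40
all_full = width - 2  # room inside the surrounding brackets
percent_done = 65
num_hashes = int(round((percent_done / 100.0) * all_full))
prog_bar = '[' + '#' * num_hashes + ' ' * (all_full - num_hashes) + ']'
print(num_hashes)  # 25
```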
@@ -225,4 +293,8 @@ class ProgressBar:
            (pct_string + self.prog_bar[pct_place + len(pct_string):])

    def __str__(self):
        '''
        Returns the progress bar string ("prog_bar").
        '''
        return str(self.prog_bar)