===========
ipython-sql
===========

:Author: Catherine Devlin, http://catherinedevlin.blogspot.com

Introduces a %sql (or %%sql) magic.

Legacy project
--------------

IPython-SQL's functionality and maintenance have been eclipsed by JupySQL_, a fork maintained and developed by the Ploomber team. Future work will be directed into JupySQL. Please file issues there as well!

Description
-----------

Connect to a database, using `SQLAlchemy URL`_ connect strings, then issue SQL
commands within IPython or IPython Notebook.

.. image:: https://raw.github.com/catherinedevlin/ipython-sql/master/examples/writers.png
   :width: 600px
   :alt: screenshot of ipython-sql in the Notebook

Examples
--------

.. code-block:: python

    In [1]: %load_ext sql

    In [2]: %%sql postgresql://will:longliveliz@localhost/shakes
       ...: select * from character
       ...: where abbrev = 'ALICE'
       ...:
    Out[2]: [(u'Alice', u'Alice', u'ALICE', u'a lady attending on Princess Katherine', 22)]

    In [3]: result = _

    In [4]: print(result)
    charid   charname   abbrev                description                 speechcount
    =================================================================================
    Alice    Alice      ALICE    a lady attending on Princess Katherine   22

    In [5]: result.keys
    Out[5]: [u'charid', u'charname', u'abbrev', u'description', u'speechcount']

    In [6]: result[0][0]
    Out[6]: u'Alice'

    In [7]: result[0].description
    Out[7]: u'a lady attending on Princess Katherine'

After the first connection, connect info can be omitted::

    In [8]: %sql select count(*) from work
    Out[8]: [(43L,)]

Connections to multiple databases can be maintained. You can refer to
an existing connection by ``username@database``:

.. code-block:: python

    In [9]: %%sql will@shakes
       ...: select charname, speechcount from character
       ...: where speechcount = (select max(speechcount)
       ...:                      from character);
       ...:
    Out[9]: [(u'Poet', 733)]

    In [10]: print(_)
    charname   speechcount
    ======================
    Poet       733

If no connect string is supplied, ``%sql`` will provide a list of existing connections;
however, if no connections have yet been made and the environment variable ``DATABASE_URL``
is available, that will be used.

For secure access, you may dynamically access your credentials (e.g. from your system environment or ``getpass.getpass``) to avoid storing your password in the notebook itself. Put ``$`` before any variable name to access it in your ``%sql`` command.

.. code-block:: python

    In [11]: user = os.getenv('SOME_USER')
       ....: password = os.getenv('SOME_PASSWORD')
       ....: connection_string = "postgresql://{user}:{password}@localhost/some_database".format(user=user, password=password)
       ....: %sql $connection_string
    Out[11]: u'Connected: some_user@some_database'

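Passwords that contain characters such as ``@`` or ``/`` can break a hand-built URL. The following is a minimal sketch of assembling the connection string outside the magic, assuming the credentials should be percent-encoded with the standard library's ``urllib.parse.quote_plus``. The helper name ``build_connection_string`` and the sample values are hypothetical and not part of ipython-sql.

```python
import os
from urllib.parse import quote_plus


def build_connection_string(user, password, host="localhost", database="some_database"):
    """Assemble a PostgreSQL URL, percent-encoding the credentials.

    Hypothetical helper for illustration; not provided by ipython-sql.
    """
    return "postgresql://{user}:{password}@{host}/{database}".format(
        user=quote_plus(user),
        password=quote_plus(password),
        host=host,
        database=database,
    )


# Read the credentials from the environment; the fallbacks are placeholders
# so that the sketch also runs when the variables are not set.
user = os.getenv("SOME_USER", "some_user")
password = os.getenv("SOME_PASSWORD", "p@ss/word")
connection_string = build_connection_string(user, password)
```

The resulting ``connection_string`` can then be passed to the magic as ``%sql $connection_string``, as in the session above.
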
You may use multiple SQL statements inside a single cell, but you will
only see any query results from the last of them, so this really only
makes sense for statements with no output:

.. code-block:: python

    In [11]: %%sql sqlite://
       ....: CREATE TABLE writer (first_name, last_name, year_of_death);
       ....: INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
       ....: INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
       ....:
    Out[11]: []

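The "last statement wins" behaviour can be pictured with the standard library's ``sqlite3`` module. This is only a sketch of the idea, not how ipython-sql is implemented: ``run_cell`` is a hypothetical helper, and its naive split on ``;`` would mishandle semicolons inside string literals.

```python
import sqlite3


def run_cell(conn, cell):
    """Execute each ';'-separated statement and return the last one's rows.

    Hypothetical helper; rows from earlier statements are discarded.
    """
    rows = []
    for statement in cell.split(";"):
        if statement.strip():
            rows = conn.execute(statement).fetchall()
    return rows


conn = sqlite3.connect(":memory:")

# Statements with no output: the final INSERT yields no rows.
setup_result = run_cell(conn, """
    CREATE TABLE writer (first_name, last_name, year_of_death);
    INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
    INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
""")

# Two queries in one cell: only the second query's rows are returned.
query_result = run_cell(conn, """
    SELECT count(*) FROM writer;
    SELECT last_name FROM writer ORDER BY year_of_death;
""")
```
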
As a convenience, dict-style access for result sets is supported, with the
leftmost column serving as key, for unique values.

.. code-block:: python

    In [12]: result = %sql select * from work
    43 rows affected.

    In [13]: result['richard2']
    Out[13]: (u'richard2', u'Richard II', u'History of Richard II', 1595, u'h', None, u'Moby', 22411, 628)

Results can also be retrieved as an iterator of dictionaries (``result.dicts()``)
or a single dictionary with a tuple of scalar values per key (``result.dict()``).

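The three shapes described above can be sketched in plain Python. The column names, the sample rows, and the helper functions here are made up for illustration; they mimic the described behaviour and are not ipython-sql's own code.

```python
# Made-up column names and rows standing in for a result set.
keys = ["workid", "title", "year"]
rows = [
    ("richard2", "Richard II", 1595),
    ("sample", "Sample Play", 1600),
]


def by_leftmost_column(rows):
    """Index rows by the value in their first column (assumed unique)."""
    return {row[0]: row for row in rows}


def as_dicts(keys, rows):
    """Yield one dictionary per row, keyed by column name."""
    for row in rows:
        yield dict(zip(keys, row))


def as_dict(keys, rows):
    """Map each column name to a tuple of that column's values."""
    return {key: tuple(row[i] for row in rows) for i, key in enumerate(keys)}


lookup = by_leftmost_column(rows)
row_dicts = list(as_dicts(keys, rows))
columns = as_dict(keys, rows)
```
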
Variable substitution
---------------------

Bind variables (bind parameters) can be used in the "named" (``:x``) style.
The variable names used should be defined in the local namespace.

.. code-block:: python

    In [15]: name = 'Countess'

    In [16]: %sql select description from character where charname = :name
    Out[16]: [(u'mother to Bertram',)]

    In [17]: %sql select description from character where charname = '{name}'
    Out[17]: [(u'mother to Bertram',)]

Alternately, ``$variable_name`` or ``{variable_name}`` can be
used to inject variables from the local namespace into the SQL
statement before it is formed and passed to the SQL engine.
(Using ``$`` and ``{}`` together, as in ``${variable_name}``,
is not supported.)

Bind variables are passed through to the SQL engine and can only
be used to replace strings passed to SQL. ``$`` and ``{}`` are
substituted before passing to SQL and can be used to form SQL
statements dynamically.

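The difference between the two mechanisms can be reproduced outside the magic. The sketch below uses the standard library's ``sqlite3`` module, which also accepts the named ``:x`` style, with a one-row table created only for the demonstration. It illustrates the general technique, not ipython-sql's internals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE character (charname TEXT, description TEXT)")
conn.execute("INSERT INTO character VALUES ('Countess', 'mother to Bertram')")

name = "Countess"

# Bind parameter: the SQL text and the value travel to the driver separately,
# so the value can only ever act as a value.
bound = conn.execute(
    "SELECT description FROM character WHERE charname = :name",
    {"name": name},
).fetchall()

# Text substitution: values are pasted into the SQL before the engine sees it,
# which also allows identifiers (here the table name) to be filled in.
table = "character"
formatted_sql = "SELECT description FROM {table} WHERE charname = '{name}'".format(
    table=table, name=name
)
formatted = conn.execute(formatted_sql).fetchall()
```

Because substituted text becomes part of the statement itself, it should only be used with values you trust; bind parameters are the safer choice for data.
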
Assignment
----------

Ordinary IPython assignment works for single-line ``%sql`` queries:

.. code-block:: python

    In [18]: works = %sql SELECT title, year FROM work
    43 rows affected.

The ``<<`` operator captures query results in a local variable, and
can be used in multi-line ``%%sql``:

.. code-block:: python

    In [19]: %%sql works << SELECT title, year
        ...: FROM work
        ...:
    43 rows affected.
    Returning data to local variable works

Connecting
----------

Connection strings are `SQLAlchemy URL`_ standard.

Some example connection strings::

    mysql+pymysql://scott:tiger@localhost/foo
    oracle://scott:tiger@127.0.0.1:1521/sidname
    sqlite://
    sqlite:///foo.db
    mssql+pyodbc://username:password@host/database?driver=SQL+Server+Native+Client+11.0

.. _`SQLAlchemy URL`: http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls

Note that ``mysql`` and ``mysql+pymysql`` connections (and perhaps others)
don't read your client character set information from .my.cnf. You need
to specify it in the connection string::

    mysql+pymysql://scott:tiger@localhost/foo?charset=utf8

Note that an ``impala`` connection with `impyla`_ for HiveServer2 requires disabling autocommit::

    %config SqlMagic.autocommit=False
    %sql impala://hserverhost:port/default?kerberos_service_name=hive&auth_mechanism=GSSAPI

.. _impyla: https://github.com/cloudera/impyla

Connection arguments not whitelisted by SQLAlchemy can be provided as a JSON
string with the ``-a`` / ``--connection_arguments`` flag, placed before the
connection string.
See `SQLAlchemy Args`_.

| %sql --connection_arguments {"timeout":10,"mode":"ro"} sqlite:// SELECT * FROM work;
| %sql -a '{"timeout":10, "mode":"ro"}' sqlite:// SELECT * from work;

.. _`SQLAlchemy Args`: https://docs.sqlalchemy.org/en/13/core/engines.html#custom-dbapi-args

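Since the flag's value must be valid JSON, it can help to check the string with Python's ``json`` module before using it. The sketch below assumes the decoded dictionary is what ends up being handed to the driver as keyword arguments; ``parse_connection_arguments`` is a hypothetical helper, not an ipython-sql function.

```python
import json


def parse_connection_arguments(raw):
    """Decode a JSON object string into a dict of connection arguments.

    Hypothetical helper for checking a value before passing it to the flag.
    """
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("connection arguments must be a JSON object")
    return parsed


connect_args = parse_connection_arguments('{"timeout":10, "mode":"ro"}')

# JSON requires double-quoted keys and strings, so a Python-style
# dict literal with single quotes is rejected.
try:
    parse_connection_arguments("{'timeout': 10}")
    single_quotes_accepted = True
except ValueError:
    single_quotes_accepted = False
```
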
DSN connections
~~~~~~~~~~~~~~~

Alternately, you can store connection info in a
configuration file, under a section name chosen to
refer to your database.

For example, if dsn.ini contains

| [DB_CONFIG_1]
| drivername=postgres
| host=my.remote.host
| port=5433
| database=mydatabase
| username=myuser
| password=1234

then you can

| %config SqlMagic.dsn_filename='./dsn.ini'
| %sql --section DB_CONFIG_1

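To see what such a section amounts to, the sketch below reads the same keys with the standard library's ``configparser`` and joins them into a URL. It assumes the section maps onto the usual ``drivername://username:password@host:port/database`` layout. ``section_to_url`` is a hypothetical helper, and the real conversion inside ipython-sql may differ.

```python
import configparser

# Same content as the dsn.ini example, inlined so the sketch is self-contained.
DSN_TEXT = """
[DB_CONFIG_1]
drivername=postgres
host=my.remote.host
port=5433
database=mydatabase
username=myuser
password=1234
"""


def section_to_url(parser, section):
    """Join one section's keys into a connection URL (hypothetical helper)."""
    cfg = dict(parser[section])
    return "{drivername}://{username}:{password}@{host}:{port}/{database}".format(**cfg)


parser = configparser.ConfigParser()
parser.read_string(DSN_TEXT)
url = section_to_url(parser, "DB_CONFIG_1")
```
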
Configuration
-------------

Query results are loaded as lists, so very large result sets may use up
your system's memory and/or hang your browser. There is no autolimit
by default. However, ``autolimit`` (if set) limits the size of the result
set (usually with a ``LIMIT`` clause in the SQL). ``displaylimit`` is similar,
but the entire result set is still pulled into memory (for later analysis);
only the screen display is truncated.

.. code-block:: python

    In [2]: %config SqlMagic
    SqlMagic options
    --------------
    SqlMagic.autocommit=<Bool>
        Current: True
        Set autocommit mode
    SqlMagic.autolimit=<Int>
        Current: 0
        Automatically limit the size of the returned result sets
    SqlMagic.autopandas=<Bool>
        Current: False
        Return Pandas DataFrames instead of regular result sets
    SqlMagic.column_local_vars=<Bool>
        Current: False
        Return data into local variables from column names
    SqlMagic.displaycon=<Bool>
        Current: False
        Show connection string after execute
    SqlMagic.displaylimit=<Int>
        Current: None
        Automatically limit the number of rows displayed (full result set is still
        stored)
    SqlMagic.dsn_filename=<Unicode>
        Current: 'odbc.ini'
        Path to DSN file. When the first argument is of the form [section], a
        sqlalchemy connection string is formed from the matching section in the DSN
        file.
    SqlMagic.feedback=<Bool>
        Current: False
        Print number of rows affected by DML
    SqlMagic.short_errors=<Bool>
        Current: True
        Don't display the full traceback on SQL Programming Error
    SqlMagic.style=<Unicode>
        Current: 'DEFAULT'
        Set the table printing style to any of prettytable's defined styles
        (currently DEFAULT, MSWORD_FRIENDLY, PLAIN_COLUMNS, RANDOM)

    In [3]: %config SqlMagic.feedback = False

Please note: if you have autopandas set to true, the displaylimit option will not apply. You can set the pandas display limit by using the pandas ``max_rows`` option as described in the `pandas documentation <http://pandas.pydata.org/pandas-docs/version/0.18.1/options.html#frequently-used-options>`_.

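The distinction between ``autolimit`` and ``displaylimit`` drawn earlier in this section can be pictured with a small ``sqlite3`` sketch. The ``fetch`` and ``display`` helpers are hypothetical, and ``fetch`` caps the rows with ``fetchmany`` rather than by rewriting the query with a ``LIMIT`` clause, so it shows the effect rather than ipython-sql's mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work (title TEXT)")
# 43 placeholder rows, matching the row count seen in the sessions above.
conn.executemany(
    "INSERT INTO work VALUES (?)", [("title %d" % i,) for i in range(43)]
)


def fetch(conn, sql, autolimit=0):
    """Fetch rows, keeping at most `autolimit` of them when it is non-zero."""
    cursor = conn.execute(sql)
    return cursor.fetchmany(autolimit) if autolimit else cursor.fetchall()


def display(rows, displaylimit=None):
    """Format rows for the screen, showing at most `displaylimit` of them."""
    shown = rows if displaylimit is None else rows[:displaylimit]
    return "\n".join(str(row) for row in shown)


limited = fetch(conn, "SELECT * FROM work", autolimit=5)   # only 5 rows held
everything = fetch(conn, "SELECT * FROM work")             # all 43 rows held
preview = display(everything, displaylimit=3)              # 3 rows shown
```

With a display limit the full list stays available in ``everything``; with a fetch limit the remaining rows are never loaded.
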
Pandas
------

If you have installed ``pandas``, you can use a result set's
``.DataFrame()`` method:

.. code-block:: python

    In [3]: result = %sql SELECT * FROM character WHERE speechcount > 25

    In [4]: dataframe = result.DataFrame()

The ``--persist`` argument, with the name of a
DataFrame object in memory,
will create a table of that name
in the database from the named DataFrame.
Or use ``--append`` to add rows to an existing
table by that name.

.. code-block:: python

    In [5]: %sql --persist dataframe

    In [6]: %sql SELECT * FROM dataframe;

.. _Pandas: http://pandas.pydata.org/

Graphing
--------

If you have installed ``matplotlib``, you can use a result set's
``.plot()``, ``.pie()``, and ``.bar()`` methods for quick plotting:

.. code-block:: python

    In [5]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'

    In [6]: %matplotlib inline

    In [7]: result.pie()

.. image:: https://raw.github.com/catherinedevlin/ipython-sql/master/examples/wordcount.png
   :alt: pie chart of word count of Shakespeare's comedies

Dumping
-------

Result sets come with a ``.csv(filename=None)`` method. This generates
comma-separated text either as a return value (if ``filename`` is not
specified) or in a file of the given name.

.. code-block:: python

    In [8]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'

    In [9]: result.csv(filename='work.csv')

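The "return the text or write the file" behaviour can be sketched with the standard library's ``csv`` module. The ``to_csv`` function is a hypothetical stand-in, the sample row is made up, and the header row of column names is an assumption about the output format.

```python
import csv
import io


def to_csv(keys, rows, filename=None):
    """Return CSV text when `filename` is None; otherwise write that file.

    Hypothetical stand-in for the described method, assuming a header row.
    """
    if filename is None:
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow(keys)
        writer.writerows(rows)
        return buffer.getvalue()
    with open(filename, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(keys)
        writer.writerows(rows)
    return None


# Made-up sample row; no file is written because filename is omitted.
text = to_csv(["title", "totalwords"], [("Example Play", 1000)])
```
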
PostgreSQL features
-------------------

``psql``-style "backslash" `meta-commands`_ (``\d``, ``\dt``, etc.)
are provided by `PGSpecial`_. Example:

.. code-block:: python

    In [9]: %sql \d

.. _PGSpecial: https://pypi.python.org/pypi/pgspecial

.. _meta-commands: https://www.postgresql.org/docs/9.6/static/app-psql.html#APP-PSQL-META-COMMANDS

Options
-------

``-l`` / ``--connections``
    List all active connections

``-x`` / ``--close <session-name>``
    Close named connection

``-c`` / ``--creator <creator-function>``
    Specify creator function for new connection

``-s`` / ``--section <section-name>``
    Section of dsn_file to be used for generating a connection string

``-p`` / ``--persist``
    Create a table name in the database from the named DataFrame

``--append``
    Like ``--persist``, but appends to the table if it already exists

``-a`` / ``--connection_arguments <"{connection arguments}">``
    Specify dictionary of connection arguments to pass to SQL driver

``-f`` / ``--file <path>``
    Run SQL from file at this path

Caution
-------

Comments
~~~~~~~~

Because ipython-sql accepts ``--``-delimited options like ``--persist``, but ``--``
is also the syntax to denote a SQL comment, the parser needs to make some assumptions.

- If you try to pass an unsupported argument, like ``--lutefisk``, it will
  be interpreted as a SQL comment and will not throw an unsupported argument
  exception.
- If the SQL statement begins with a first-line comment that looks like one
  of the accepted arguments - like ``%sql --persist is great!`` - it will be
  parsed like an argument, not a comment. Moving the comment to the second
  line or later will avoid this.

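The ambiguity described above can be illustrated with a toy classifier. This is not ipython-sql's parser; ``classify_first_token`` and its option set are hypothetical and only mirror the two rules just listed, using the long option names from the Options section.

```python
# Long option names taken from the Options section of this document.
KNOWN_OPTIONS = {
    "--connections", "--close", "--creator", "--section",
    "--persist", "--append", "--connection_arguments", "--file",
}


def classify_first_token(line):
    """Label the start of a line as 'option', 'comment', or 'sql'.

    Toy illustration of the ambiguity, not the real parsing logic.
    """
    tokens = line.split()
    first = tokens[0] if tokens else ""
    if not first.startswith("--"):
        return "sql"
    # A known option name wins; anything else starting with -- is a comment.
    return "option" if first in KNOWN_OPTIONS else "comment"


unsupported = classify_first_token("--lutefisk SELECT 1")      # 'comment'
lookalike = classify_first_token("--persist is great!")        # 'option'
plain = classify_first_token("SELECT 1 -- trailing comment")   # 'sql'
```
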
Installing
----------

Install the latest release with::

    pip install ipython-sql

or download from https://github.com/catherinedevlin/ipython-sql and::

    cd ipython-sql
    sudo python setup.py install

Development
-----------

https://github.com/catherinedevlin/ipython-sql

Credits
-------

- Matthias Bussonnier for help with configuration
- Olivier Le Thanh Duong for ``%config`` fixes and improvements
- Distribute_
- Buildout_
- modern-package-template_
- Mike Wilson for bind variable code
- Thomas Kluyver and Steve Holden for debugging help
- Berton Earnshaw for DSN connection syntax
- Bruno Harbulot for DSN example
- Andrés Celis for SQL Server bugfix
- Michael Erasmus for DataFrame truth bugfix
- Noam Finkelstein for README clarification
- Xiaochuan Yu for ``<<`` operator, syntax colorization
- Amjith Ramanujam for PGSpecial and incorporating it here
- Alexander Maznev for better arg parsing, connections accepting specified creator
- Jonathan Larkin for configurable displaycon
- Jared Moore for ``connection-arguments`` support
- Gilbert Brault for ``--append``
- Lucas Zeer for multi-line bugfixes for var substitution, ``<<``
- vkk800 for ``--file``
- Jens Albrecht for MySQL DatabaseError bugfix
- meihkv for connection-closing bugfix
- Abhinav C for SQLAlchemy 2.0 compatibility

.. _Distribute: http://pypi.python.org/pypi/distribute
.. _Buildout: http://www.buildout.org/
.. _modern-package-template: http://pypi.python.org/pypi/modern-package-template
.. _JupySQL: https://github.com/ploomber/jupysql