
We often encounter data as Relational Databases. To work with them we generally would need to write raw SQL queries, pass them to the database engine and parse the returned results as a normal array of records.
SQLAlchemy provides a nice “Pythonic” way of interacting with databases. So rather than dealing with the differences between specific dialects of traditional SQL such as MySQL or PostgreSQL or Oracle, you can leverage the Pythonic framework of SQLAlchemy to streamline your workflow and more efficiently query your data.
Other stories on datascience can be found here
Installing The Package
pip install sqlalchemy
Connecting to a database
To start interacting with the database we first we need to establish a connection.
import sqlalchemy as db
engine = db.create_engine('dialect+driver://user:pass@host:port/db')
Some examples of connecting to various databases can be found here
Viewing Table Details
SQLAlchemy can be used to automatically load tables from a database using something called reflection. Reflection is the process of reading the database and building the metadata based on that information.
example
Querying
Table
and MetaData
have already been imported. The metadata is available as metadata.
ResultProxy: The object returned by the .execute()
method. It can be used in a variety of ways to get the data returned by the query.
ResultSet: The actual data asked for in the query when using a fetch method such as .fetchall()
on a ResultProxy.
Dealing with Large ResultSet
We use .fetchmany()
to load optimal no of rows and overcome memory issues in case of large datasets
while flag:
partial_results = ResultProxy.fetchmany(50)
if(partial_results == []):
flag = False
//
code
//
ResultProxy.close()
Convert to dataframe
df = pd.DataFrame(ResultSet)
df.columns = ResultSet[0].keys()
Filtering data
Lets see some examples of raw SQLite Queries and queries using SQLAlchemy.
where
SQL:
SELECT * FROM census
WHERE sex = F
SQLAlchemy:
db.select([census]).where(census.columns.sex == 'F')
in
SQL:
SELECT state, sex
FROM census
WHERE state IN (Texas, New York)
SQLAlchemy:
db.select([census.columns.state,
census.columns.sex]).where(census.columns.state.in_(['Texas', 'New York']))
and, or, not
SQL:
SELECT * FROM census
WHERE state = 'California' AND NOT sex = 'M'
SQLAlchemy:
db.select([census]).where(db.and_(census.columns.state ==
'California', census.columns.sex != 'M'))
order by
SQL:
SELECT * FROM census
ORDER BY State DESC, pop2000
SQLAlchemy:
db.select([census]).order_by(db.desc(census.columns.state),
census.columns.pop2000)
functions
SQL:
SELECT SUM(pop2008)
FROM census
SQLAlchemy:
db.select([db.func.sum(census.columns.pop2008)])
other functions include avg, count, min, max…
group by
SQL:
SELECT SUM(pop2008) as pop2008, sex FROM census
SQLAlchemy:
db.select([db.func.sum(census.columns.pop2008).label('pop2008'),
census.columns.sex]).group_by(census.columns.sex)
distinct
SQL:
SELECT DISTINCT state
FROM census
SQLAlchemy:
db.select([census.columns.state.distinct()])
case & cast
The case()
expression accepts a list of conditions to match and the column to return if the condition matches, followed by an else_
if none of the conditions match.
cast()
function to convert an expression to a particular type
example
We use .scalar
to the result when the result contains only single value
joins
If you have two tables that already have an established relationship, you can automatically use that relationship by just adding the columns we want from each table to the select statement.
select([census.columns.pop2008, state_fact.columns.abbreviation])
Creating and Inserting Data into Tables
By passing the database which is not present, to the engine then sqlalchemy automatically creates a new database.
Updating data in Databases
Delete Table
db.delete(table_name).where(condition)
Dropping a Table
table_name.drop(engine) #drops a single table
metadata.drop_all(engine) #drops all the tables in the database
IPython Notebook’s and other assets of this story can be found here
See ya again ????
Originally posted: https://towardsdatascience.com/sqlalchemy-python-tutorial-79a577141a91
Author
