What is SQLAlchemy?
SQLAlchemy is Python's most popular SQL toolkit and Object-Relational Mapping (ORM) library. It provides a full suite of well-known enterprise-level persistence patterns, designed for efficient and high-performing database access.
Think of SQLAlchemy as a translator between Python objects and database tables. Instead of writing SQL queries, you work with Python classes and objects, and SQLAlchemy handles the translation behind the scenes.
Why Use an ORM (SQLAlchemy)?
ORMs provide several advantages for web development:
- Python, Not SQL: Write database operations in Python you know, not SQL you're learning
- Database Agnostic: Switch from SQLite to PostgreSQL with minimal code changes
- Type Safety: Python IDE autocomplete and type checking for database operations
- Relationship Management: Automatically handle foreign keys and joins
- Prevents SQL Injection: Parameterized queries by default
- Migration Tools: Track and version database schema changes (with Alembic)
- Less Boilerplate: No manual cursor management or connection handling
- Powerful Query API: Complex queries using method chaining
ORM vs Raw SQL: When to Use Each
Use ORM (SQLAlchemy) when:
- Standard CRUD operations: Create, Read, Update, Delete are ORM's bread and butter
- Working with relationships: ORM excels at managing related data
- Rapid development: Faster to write and maintain than raw SQL
- Database independence: Code works across different databases
- Team with varying SQL skills: Python developers can contribute
Use raw SQL when:
- Complex analytics queries: Window functions, complex aggregations
- Performance critical: Hand-optimized SQL can be faster
- Bulk operations: Inserting 100,000 rows is faster with raw SQL
- Database-specific features: PostgreSQL full-text search, JSON operators
- Reporting queries: Complex joins and subqueries sometimes clearer in SQL
Best approach: Use ORM for 80% of operations, raw SQL for performance-critical 20%!
Getting Started: Setup and Connection
# Install SQLAlchemy
pip install sqlalchemy
# Basic setup
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base
# Create database engine
# SQLite (file-based, good for development)
engine = create_engine('sqlite:///myapp.db', echo=True)
# PostgreSQL (production)
engine = create_engine('postgresql://user:password@localhost/dbname')
# MySQL
engine = create_engine('mysql+pymysql://user:password@localhost/dbname')
# Create session factory
Session = sessionmaker(bind=engine)
# Create base class for models
Base = declarative_base()
# Session usage pattern
session = Session()
try:
# Perform database operations
session.commit()
except:
session.rollback()
raise
finally:
session.close()
# Or use context manager (recommended)
with Session() as session:
# Operations here
session.commit() # Auto-commits and closes
Defining Models: Python Classes as Tables
from sqlalchemy import Column, Integer, String, Boolean, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from datetime import datetime
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
username = Column(String(50), unique=True, nullable=False)
email = Column(String(100), unique=True, nullable=False)
password_hash = Column(String(255), nullable=False)
is_active = Column(Boolean, default=True)
created_at = Column(DateTime, default=datetime.utcnow)
# Relationship (one-to-many)
posts = relationship('Post', back_populates='author')
def __repr__(self):
return f"<User(username='{self.username}')>"
class Post(Base):
__tablename__ = 'posts'
id = Column(Integer, primary_key=True)
title = Column(String(200), nullable=False)
content = Column(String, nullable=False)
user_id = Column(Integer, ForeignKey('users.id'))
created_at = Column(DateTime, default=datetime.utcnow)
# Relationships
author = relationship('User', back_populates='posts')
comments = relationship('Comment', back_populates='post')
def __repr__(self):
return f"<Post(title='{self.title}')>"
class Comment(Base):
__tablename__ = 'comments'
id = Column(Integer, primary_key=True)
content = Column(String, nullable=False)
post_id = Column(Integer, ForeignKey('posts.id'))
user_id = Column(Integer, ForeignKey('users.id'))
created_at = Column(DateTime, default=datetime.utcnow)
# Relationships
post = relationship('Post', back_populates='comments')
author = relationship('User')
# Create all tables
Base.metadata.create_all(engine)
CRUD Operations with SQLAlchemy
# CREATE - Add new records
with Session() as session:
# Single record
new_user = User(
username='alice',
email='alice@example.com',
password_hash='hashed_password'
)
session.add(new_user)
session.commit()
# Access the ID after commit
print(f"Created user with ID: {new_user.id}")
# Multiple records
users = [
User(username='bob', email='bob@example.com', password_hash='hash'),
User(username='charlie', email='charlie@example.com', password_hash='hash')
]
session.add_all(users)
session.commit()
# READ - Query records
with Session() as session:
# Get all users
all_users = session.query(User).all()
# Get first result
user = session.query(User).first()
# Get by primary key
user = session.query(User).get(1)
# Filter
active_users = session.query(User).filter(User.is_active == True).all()
# Multiple filters
user = session.query(User).filter(
User.username == 'alice',
User.is_active == True
).first()
# Or use filter_by (simpler for equality)
user = session.query(User).filter_by(username='alice').first()
# LIKE pattern
users = session.query(User).filter(User.email.like('%@gmail.com')).all()
# IN clause
users = session.query(User).filter(User.id.in_([1, 2, 3])).all()
# Ordering
users = session.query(User).order_by(User.created_at.desc()).all()
# Limit and offset (pagination)
users = session.query(User).limit(10).offset(20).all()
# UPDATE - Modify records
with Session() as session:
user = session.query(User).filter_by(username='alice').first()
user.email = 'newemail@example.com'
session.commit()
# Bulk update
session.query(User).filter(User.is_active == False).update({
'is_active': True
})
session.commit()
# DELETE - Remove records
with Session() as session:
user = session.query(User).filter_by(username='alice').first()
session.delete(user)
session.commit()
# Bulk delete
session.query(User).filter(User.is_active == False).delete()
session.commit()
Working with Relationships
with Session() as session:
# Create user with posts (one-to-many)
user = User(username='alice', email='alice@example.com', password_hash='hash')
post1 = Post(title='First Post', content='Hello World!', author=user)
post2 = Post(title='Second Post', content='Learning SQLAlchemy', author=user)
session.add(user) # Also adds posts (cascade)
session.commit()
# Access related data (lazy loading by default)
user = session.query(User).filter_by(username='alice').first()
print(user.posts) # Triggers another query to get posts
# Eager loading (load relationships immediately)
from sqlalchemy.orm import joinedload
user = session.query(User).options(
joinedload(User.posts)
).filter_by(username='alice').first()
print(user.posts) # No additional query!
# Access relationship in reverse
post = session.query(Post).first()
print(post.author.username) # Access user from post
# Count related items
user = session.query(User).filter_by(username='alice').first()
post_count = len(user.posts)
# Or better:
from sqlalchemy import func
post_count = session.query(func.count(Post.id)).filter(
Post.user_id == user.id
).scalar()
# Many-to-many relationships
from sqlalchemy import Table
# Association table
post_tags = Table('post_tags', Base.metadata,
Column('post_id', Integer, ForeignKey('posts.id')),
Column('tag_id', Integer, ForeignKey('tags.id'))
)
class Tag(Base):
__tablename__ = 'tags'
id = Column(Integer, primary_key=True)
name = Column(String(50), unique=True)
posts = relationship('Post', secondary=post_tags, back_populates='tags')
# Update Post model
class Post(Base):
# ... existing columns ...
tags = relationship('Tag', secondary=post_tags, back_populates='posts')
# Usage
with Session() as session:
post = Post(title='Python Tips', content='...')
tag1 = Tag(name='python')
tag2 = Tag(name='tutorial')
post.tags.append(tag1)
post.tags.append(tag2)
session.add(post)
session.commit()
# Query posts by tag
posts = session.query(Post).join(Post.tags).filter(
Tag.name == 'python'
).all()
Advanced Queries
from sqlalchemy import func, and_, or_
with Session() as session:
# Joins
results = session.query(User, Post).join(Post).all()
for user, post in results:
print(f"{user.username}: {post.title}")
# Left join
results = session.query(User).outerjoin(Post).all()
# Aggregation
post_counts = session.query(
User.username,
func.count(Post.id).label('post_count')
).join(Post).group_by(User.username).all()
# Complex filters (AND)
users = session.query(User).filter(
and_(
User.is_active == True,
User.created_at > '2024-01-01'
)
).all()
# OR conditions
users = session.query(User).filter(
or_(
User.username == 'alice',
User.email.like('%@gmail.com')
)
).all()
# Subqueries
from sqlalchemy import select
subq = select(func.count(Post.id)).where(
Post.user_id == User.id
).scalar_subquery()
users_with_count = session.query(User, subq.label('post_count')).all()
# Window functions
from sqlalchemy import over
query = session.query(
Post.title,
func.row_number().over(
partition_by=Post.user_id,
order_by=Post.created_at.desc()
).label('post_rank')
)
# Select specific columns
usernames = session.query(User.username).all()
# Distinct
unique_emails = session.query(User.email).distinct().all()
# Count
user_count = session.query(User).count()
# Exists
has_posts = session.query(User).filter(
User.posts.any()
).all()
Raw SQL in SQLAlchemy
Sometimes you need raw SQL for complex queries or performance:
from sqlalchemy import text
with Session() as session:
# Execute raw SQL
result = session.execute(
text("SELECT * FROM users WHERE username = :username"),
{"username": "alice"}
)
users = result.fetchall()
# Return as model instances
result = session.execute(
text("SELECT * FROM users WHERE is_active = TRUE")
)
users = [User(**dict(row)) for row in result]
# Insert with raw SQL
session.execute(
text("INSERT INTO users (username, email) VALUES (:user, :email)"),
{"user": "bob", "email": "bob@example.com"}
)
session.commit()
# Mix ORM and raw SQL
user_ids = [1, 2, 3]
result = session.execute(
text("SELECT * FROM posts WHERE user_id IN :ids"),
{"ids": tuple(user_ids)}
)
# Using SQLAlchemy Core (lower level, still not raw SQL)
from sqlalchemy import select
with engine.connect() as conn:
stmt = select(User).where(User.username == 'alice')
result = conn.execute(stmt)
users = result.fetchall()
Database Migrations with Alembic
Alembic is SQLAlchemy's migration tool, similar to Django migrations.
# Install Alembic
pip install alembic
# Initialize Alembic
alembic init alembic
# Configure alembic.ini
sqlalchemy.url = postgresql://user:password@localhost/mydb
# Edit alembic/env.py to use your Base
from myapp.models import Base
target_metadata = Base.metadata
# Create a migration
alembic revision --autogenerate -m "Create users table"
# Review the migration file in alembic/versions/
# Example: alembic/versions/001_create_users_table.py
def upgrade():
op.create_table(
'users',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('username', sa.String(50), nullable=False),
sa.PrimaryKeyConstraint('id')
)
def downgrade():
op.drop_table('users')
# Apply migrations
alembic upgrade head
# Rollback
alembic downgrade -1 # One version back
alembic downgrade base # All the way back
# Check current version
alembic current
# View history
alembic history
SQLAlchemy with Flask
# Using Flask-SQLAlchemy extension
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://user:pass@localhost/db'
db = SQLAlchemy(app)
class User(db.Model):
id = db.Column(db.Integer, primary_key=True)
username = db.Column(db.String(80), unique=True, nullable=False)
def __repr__(self):
return f'<User {self.username}>'
# Create tables
with app.app_context():
db.create_all()
# Use in routes
@app.route('/users')
def users():
all_users = User.query.all()
return {'users': [u.username for u in all_users]}
@app.route('/user/')
def user(username):
user = User.query.filter_by(username=username).first_or_404()
return {'username': user.username}
SQLAlchemy with FastAPI
from fastapi import FastAPI, Depends
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
DATABASE_URL = "postgresql://user:password@localhost/dbname"
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)
app = FastAPI()
# Dependency
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()
# Use in endpoints
@app.get("/users")
def get_users(db: Session = Depends(get_db)):
users = db.query(User).all()
return users
@app.post("/users")
def create_user(username: str, email: str, db: Session = Depends(get_db)):
user = User(username=username, email=email, password_hash="hash")
db.add(user)
db.commit()
db.refresh(user)
return user
Best Practices
- Use sessions properly: Always commit or rollback, close sessions
- Eager load relationships: Use
joinedload()to avoid N+1 queries - Use indexes: Add
index=Trueto frequently queried columns - Avoid lazy loading in loops: Causes N+1 query problem
- Use scoped sessions: For web apps, use
scoped_session - Migration versioning: Use Alembic for all schema changes
- Connection pooling: Configure pool size for production
- Use
declarative_base: Not classical mapping (outdated) - Validate at application level: ORM doesn't replace validation
- Profile queries: Use
echo=Truein development to see SQL
Common Pitfalls to Avoid
- N+1 Query Problem: Loading related objects in a loop triggers one query per item
- Forgetting to commit: Changes aren't saved without
session.commit() - Detached instances: Objects are detached after session closes
- Using == for None: Use
is None, not== None - Not handling exceptions: Always rollback on errors
- Mixing sessions: Don't use objects from different sessions
Master SQLAlchemy with Expert Mentorship
Our Full Stack Python program covers SQLAlchemy from basics to advanced ORM patterns. Learn database modeling, query optimization, and integration with Flask and FastAPI.
Explore Full Stack Python Program