What is SQLAlchemy?

SQLAlchemy is Python's most popular SQL toolkit and Object-Relational Mapping (ORM) library. It provides a full suite of well-known enterprise-level persistence patterns, designed for efficient and high-performing database access.

Think of SQLAlchemy as a translator between Python objects and database tables. Instead of writing SQL queries, you work with Python classes and objects, and SQLAlchemy handles the translation behind the scenes.

Why Use an ORM (SQLAlchemy)?

ORMs provide several advantages for web development:

  • Python, Not SQL: Write database operations in Python you know, not SQL you're learning
  • Database Agnostic: Switch from SQLite to PostgreSQL with minimal code changes
  • Type Safety: Python IDE autocomplete and type checking for database operations
  • Relationship Management: Automatically handle foreign keys and joins
  • Prevents SQL Injection: Parameterized queries by default
  • Migration Tools: Track and version database schema changes (with Alembic)
  • Less Boilerplate: No manual cursor management or connection handling
  • Powerful Query API: Complex queries using method chaining

ORM vs Raw SQL: When to Use Each

Use ORM (SQLAlchemy) when:

  • Standard CRUD operations: Create, Read, Update, Delete are ORM's bread and butter
  • Working with relationships: ORM excels at managing related data
  • Rapid development: Faster to write and maintain than raw SQL
  • Database independence: Code works across different databases
  • Team with varying SQL skills: Python developers can contribute

Use raw SQL when:

  • Complex analytics queries: Window functions, complex aggregations
  • Performance critical: Hand-optimized SQL can be faster
  • Bulk operations: Inserting 100,000 rows is faster with raw SQL
  • Database-specific features: PostgreSQL full-text search, JSON operators
  • Reporting queries: Complex joins and subqueries sometimes clearer in SQL

Best approach: Use ORM for 80% of operations, raw SQL for performance-critical 20%!

Getting Started: Setup and Connection

# Install SQLAlchemy
pip install sqlalchemy

# Basic setup
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base

# Create database engine
# SQLite (file-based, good for development)
engine = create_engine('sqlite:///myapp.db', echo=True)

# PostgreSQL (production)
engine = create_engine('postgresql://user:password@localhost/dbname')

# MySQL
engine = create_engine('mysql+pymysql://user:password@localhost/dbname')

# Create session factory
Session = sessionmaker(bind=engine)

# Create base class for models
Base = declarative_base()

# Session usage pattern
session = Session()
try:
    # Perform database operations
    session.commit()
except:
    session.rollback()
    raise
finally:
    session.close()

# Or use context manager (recommended)
with Session() as session:
    # Operations here
    session.commit()  # Auto-commits and closes

Defining Models: Python Classes as Tables

from sqlalchemy import Column, Integer, String, Boolean, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from datetime import datetime

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True, nullable=False)
    email = Column(String(100), unique=True, nullable=False)
    password_hash = Column(String(255), nullable=False)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, default=datetime.utcnow)

    # Relationship (one-to-many)
    posts = relationship('Post', back_populates='author')

    def __repr__(self):
        return f"<User(username='{self.username}')>"

class Post(Base):
    __tablename__ = 'posts'

    id = Column(Integer, primary_key=True)
    title = Column(String(200), nullable=False)
    content = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey('users.id'))
    created_at = Column(DateTime, default=datetime.utcnow)

    # Relationships
    author = relationship('User', back_populates='posts')
    comments = relationship('Comment', back_populates='post')

    def __repr__(self):
        return f"<Post(title='{self.title}')>"

class Comment(Base):
    __tablename__ = 'comments'

    id = Column(Integer, primary_key=True)
    content = Column(String, nullable=False)
    post_id = Column(Integer, ForeignKey('posts.id'))
    user_id = Column(Integer, ForeignKey('users.id'))
    created_at = Column(DateTime, default=datetime.utcnow)

    # Relationships
    post = relationship('Post', back_populates='comments')
    author = relationship('User')

# Create all tables
Base.metadata.create_all(engine)

CRUD Operations with SQLAlchemy

# CREATE - Add new records
with Session() as session:
    # Single record
    new_user = User(
        username='alice',
        email='alice@example.com',
        password_hash='hashed_password'
    )
    session.add(new_user)
    session.commit()

    # Access the ID after commit
    print(f"Created user with ID: {new_user.id}")

    # Multiple records
    users = [
        User(username='bob', email='bob@example.com', password_hash='hash'),
        User(username='charlie', email='charlie@example.com', password_hash='hash')
    ]
    session.add_all(users)
    session.commit()

# READ - Query records
with Session() as session:
    # Get all users
    all_users = session.query(User).all()

    # Get first result
    user = session.query(User).first()

    # Get by primary key
    user = session.query(User).get(1)

    # Filter
    active_users = session.query(User).filter(User.is_active == True).all()

    # Multiple filters
    user = session.query(User).filter(
        User.username == 'alice',
        User.is_active == True
    ).first()

    # Or use filter_by (simpler for equality)
    user = session.query(User).filter_by(username='alice').first()

    # LIKE pattern
    users = session.query(User).filter(User.email.like('%@gmail.com')).all()

    # IN clause
    users = session.query(User).filter(User.id.in_([1, 2, 3])).all()

    # Ordering
    users = session.query(User).order_by(User.created_at.desc()).all()

    # Limit and offset (pagination)
    users = session.query(User).limit(10).offset(20).all()

# UPDATE - Modify records
with Session() as session:
    user = session.query(User).filter_by(username='alice').first()
    user.email = 'newemail@example.com'
    session.commit()

    # Bulk update
    session.query(User).filter(User.is_active == False).update({
        'is_active': True
    })
    session.commit()

# DELETE - Remove records
with Session() as session:
    user = session.query(User).filter_by(username='alice').first()
    session.delete(user)
    session.commit()

    # Bulk delete
    session.query(User).filter(User.is_active == False).delete()
    session.commit()

Working with Relationships

with Session() as session:
    # Create user with posts (one-to-many)
    user = User(username='alice', email='alice@example.com', password_hash='hash')

    post1 = Post(title='First Post', content='Hello World!', author=user)
    post2 = Post(title='Second Post', content='Learning SQLAlchemy', author=user)

    session.add(user)  # Also adds posts (cascade)
    session.commit()

    # Access related data (lazy loading by default)
    user = session.query(User).filter_by(username='alice').first()
    print(user.posts)  # Triggers another query to get posts

    # Eager loading (load relationships immediately)
    from sqlalchemy.orm import joinedload

    user = session.query(User).options(
        joinedload(User.posts)
    ).filter_by(username='alice').first()
    print(user.posts)  # No additional query!

    # Access relationship in reverse
    post = session.query(Post).first()
    print(post.author.username)  # Access user from post

    # Count related items
    user = session.query(User).filter_by(username='alice').first()
    post_count = len(user.posts)
    # Or better:
    from sqlalchemy import func
    post_count = session.query(func.count(Post.id)).filter(
        Post.user_id == user.id
    ).scalar()

# Many-to-many relationships
from sqlalchemy import Table

# Association table
post_tags = Table('post_tags', Base.metadata,
    Column('post_id', Integer, ForeignKey('posts.id')),
    Column('tag_id', Integer, ForeignKey('tags.id'))
)

class Tag(Base):
    __tablename__ = 'tags'
    id = Column(Integer, primary_key=True)
    name = Column(String(50), unique=True)

    posts = relationship('Post', secondary=post_tags, back_populates='tags')

# Update Post model
class Post(Base):
    # ... existing columns ...
    tags = relationship('Tag', secondary=post_tags, back_populates='posts')

# Usage
with Session() as session:
    post = Post(title='Python Tips', content='...')
    tag1 = Tag(name='python')
    tag2 = Tag(name='tutorial')

    post.tags.append(tag1)
    post.tags.append(tag2)

    session.add(post)
    session.commit()

    # Query posts by tag
    posts = session.query(Post).join(Post.tags).filter(
        Tag.name == 'python'
    ).all()

Advanced Queries

from sqlalchemy import func, and_, or_

with Session() as session:
    # Joins
    results = session.query(User, Post).join(Post).all()
    for user, post in results:
        print(f"{user.username}: {post.title}")

    # Left join
    results = session.query(User).outerjoin(Post).all()

    # Aggregation
    post_counts = session.query(
        User.username,
        func.count(Post.id).label('post_count')
    ).join(Post).group_by(User.username).all()

    # Complex filters (AND)
    users = session.query(User).filter(
        and_(
            User.is_active == True,
            User.created_at > '2024-01-01'
        )
    ).all()

    # OR conditions
    users = session.query(User).filter(
        or_(
            User.username == 'alice',
            User.email.like('%@gmail.com')
        )
    ).all()

    # Subqueries
    from sqlalchemy import select
    subq = select(func.count(Post.id)).where(
        Post.user_id == User.id
    ).scalar_subquery()

    users_with_count = session.query(User, subq.label('post_count')).all()

    # Window functions
    from sqlalchemy import over
    query = session.query(
        Post.title,
        func.row_number().over(
            partition_by=Post.user_id,
            order_by=Post.created_at.desc()
        ).label('post_rank')
    )

    # Select specific columns
    usernames = session.query(User.username).all()

    # Distinct
    unique_emails = session.query(User.email).distinct().all()

    # Count
    user_count = session.query(User).count()

    # Exists
    has_posts = session.query(User).filter(
        User.posts.any()
    ).all()

Raw SQL in SQLAlchemy

Sometimes you need raw SQL for complex queries or performance:

from sqlalchemy import text

with Session() as session:
    # Execute raw SQL
    result = session.execute(
        text("SELECT * FROM users WHERE username = :username"),
        {"username": "alice"}
    )
    users = result.fetchall()

    # Return as model instances
    result = session.execute(
        text("SELECT * FROM users WHERE is_active = TRUE")
    )
    users = [User(**dict(row)) for row in result]

    # Insert with raw SQL
    session.execute(
        text("INSERT INTO users (username, email) VALUES (:user, :email)"),
        {"user": "bob", "email": "bob@example.com"}
    )
    session.commit()

    # Mix ORM and raw SQL
    user_ids = [1, 2, 3]
    result = session.execute(
        text("SELECT * FROM posts WHERE user_id IN :ids"),
        {"ids": tuple(user_ids)}
    )

# Using SQLAlchemy Core (lower level, still not raw SQL)
from sqlalchemy import select

with engine.connect() as conn:
    stmt = select(User).where(User.username == 'alice')
    result = conn.execute(stmt)
    users = result.fetchall()

Database Migrations with Alembic

Alembic is SQLAlchemy's migration tool, similar to Django migrations.

# Install Alembic
pip install alembic

# Initialize Alembic
alembic init alembic

# Configure alembic.ini
sqlalchemy.url = postgresql://user:password@localhost/mydb

# Edit alembic/env.py to use your Base
from myapp.models import Base
target_metadata = Base.metadata

# Create a migration
alembic revision --autogenerate -m "Create users table"

# Review the migration file in alembic/versions/
# Example: alembic/versions/001_create_users_table.py
def upgrade():
    op.create_table(
        'users',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('username', sa.String(50), nullable=False),
        sa.PrimaryKeyConstraint('id')
    )

def downgrade():
    op.drop_table('users')

# Apply migrations
alembic upgrade head

# Rollback
alembic downgrade -1  # One version back
alembic downgrade base  # All the way back

# Check current version
alembic current

# View history
alembic history

SQLAlchemy with Flask

# Using Flask-SQLAlchemy extension
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://user:pass@localhost/db'
db = SQLAlchemy(app)

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True, nullable=False)

    def __repr__(self):
        return f'<User {self.username}>'

# Create tables
with app.app_context():
    db.create_all()

# Use in routes
@app.route('/users')
def users():
    all_users = User.query.all()
    return {'users': [u.username for u in all_users]}

@app.route('/user/')
def user(username):
    user = User.query.filter_by(username=username).first_or_404()
    return {'username': user.username}

SQLAlchemy with FastAPI

from fastapi import FastAPI, Depends
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session

DATABASE_URL = "postgresql://user:password@localhost/dbname"
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)

app = FastAPI()

# Dependency
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# Use in endpoints
@app.get("/users")
def get_users(db: Session = Depends(get_db)):
    users = db.query(User).all()
    return users

@app.post("/users")
def create_user(username: str, email: str, db: Session = Depends(get_db)):
    user = User(username=username, email=email, password_hash="hash")
    db.add(user)
    db.commit()
    db.refresh(user)
    return user

Best Practices

  • Use sessions properly: Always commit or rollback, close sessions
  • Eager load relationships: Use joinedload() to avoid N+1 queries
  • Use indexes: Add index=True to frequently queried columns
  • Avoid lazy loading in loops: Causes N+1 query problem
  • Use scoped sessions: For web apps, use scoped_session
  • Migration versioning: Use Alembic for all schema changes
  • Connection pooling: Configure pool size for production
  • Use declarative_base: Not classical mapping (outdated)
  • Validate at application level: ORM doesn't replace validation
  • Profile queries: Use echo=True in development to see SQL

Common Pitfalls to Avoid

  • N+1 Query Problem: Loading related objects in a loop triggers one query per item
  • Forgetting to commit: Changes aren't saved without session.commit()
  • Detached instances: Objects are detached after session closes
  • Using == for None: Use is None, not == None
  • Not handling exceptions: Always rollback on errors
  • Mixing sessions: Don't use objects from different sessions

Master SQLAlchemy with Expert Mentorship

Our Full Stack Python program covers SQLAlchemy from basics to advanced ORM patterns. Learn database modeling, query optimization, and integration with Flask and FastAPI.

Explore Full Stack Python Program

Related Articles