Inside SQLite B-Tree Storage: How Tables and Indexes Are Stored
Understanding How SQLite Organizes Data on Disk
At first glance, SQLite feels simple. You create tables, insert data, and run queries. Everything “just works.” But under the surface, SQLite uses a carefully designed storage engine built around one core structure: the B-tree.
Understanding how SQLite stores tables and indexes internally helps you write faster queries, design better schemas, and troubleshoot performance issues with confidence.
In this blog, we explore how SQLite organizes data on disk, how B-trees work, and how tables and indexes are actually stored.
What Is a B-Tree
A B-tree (balanced tree) is a data structure optimized for:
Fast lookups
Efficient inserts and deletes
Minimal disk reads
Instead of scanning rows sequentially, SQLite uses B-trees to locate data quickly, even in very large datasets.
Think of a B-tree like a well-organized filing system:
The root page points to branches
Branches point to more pages
Leaf pages contain actual data
This structure allows SQLite to find records in logarithmic time.
SQLite Database File Structure
A SQLite database file is divided into fixed-size pages.
Each page:
Typically 4 KB in size
Stores part of a B-tree
Can be a root, internal node, or leaf node
The database file is essentially a collection of B-trees.
Key B-trees include:
One B-tree per table
One B-tree per index
This means every table and every index is stored separately, but using the same structure.
How Tables Are Stored (Table B-Trees)
Tables in SQLite are stored as B-trees keyed by rowid.
Key Concepts
Each row has a unique
rowidThe B-tree is ordered by
rowidLeaf nodes store the actual row data
Example Table
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT,
email TEXT
);Internally:
idbecomes the rowidRows are stored in ascending order of
id
Leaf Node Structure
Each leaf page contains:
Rowid
Record payload (column values)
This means when you query:
SELECT * FROM users WHERE id = 100;SQLite can jump directly to the correct page instead of scanning the entire table.
WITHOUT ROWID Tables
SQLite also supports tables without rowid.
CREATE TABLE products (
sku TEXT PRIMARY KEY,
name TEXT
) WITHOUT ROWID;In this case:
The primary key becomes the B-tree key
No hidden rowid is used
Storage is more compact for certain schemas
This is useful when the primary key is not an integer.
How Indexes Are Stored (Index B-Trees)
Indexes are also stored as B-trees, but with a different structure.
Key Differences
The key is the indexed column value
The value points to the rowid in the table
Example Index
CREATE INDEX idx_users_email ON users(email);Internally:
B-tree sorted by
emailEach entry stores:
email value
corresponding rowid
When you run:
SELECT * FROM users WHERE email = ‘[email protected]’;SQLite:
Searches the index B-tree
Finds the rowid
Fetches the row from the table B-tree
This is why indexes improve query performance dramatically.
Internal vs Leaf Pages
Each B-tree consists of:
Internal Pages
Store keys and pointers
Guide the search process
Do not contain full row data
Leaf Pages
Store actual records (tables)
Store key + rowid pairs (indexes)
Traversal Example
To find a row:
Start at root page
Follow pointers down the tree
Reach leaf page
Retrieve data
This minimizes disk reads and improves efficiency.
Page Splitting and Growth
As data grows, pages fill up.
When a page is full:
SQLite splits the page
Moves half the data to a new page
Updates parent nodes
This keeps the B-tree balanced.
Balanced trees ensure:
Consistent performance
No long chains
Fast lookups even at scale
How This Affects Performance
Understanding B-trees explains many performance behaviors.
Sequential Inserts Are Faster
INSERT INTO users (id, name) VALUES (1, ‘A’);
INSERT INTO users (id, name) VALUES (2, ‘B’);Appends to the end of the tree
Minimal page splits
Random Inserts Are Slower
INSERT INTO users (id, name) VALUES (1000, ‘X’);
INSERT INTO users (id, name) VALUES (10, ‘Y’);Causes page splits
More disk activity
Disk I O and Page Access
SQLite reads and writes entire pages, not individual rows.
This means:
Accessing one row loads its entire page
Nearby rows are often already in memory
This is why:
Range queries are efficient
Clustering data improves performance
Indexes vs Table Scans
Without an index:
SELECT * FROM users WHERE email = ‘[email protected]’;SQLite must scan every row.
With an index:
SQLite jumps directly to matching entries
Only relevant rows are accessed
This reduces disk reads significantly.
B-Trees and Concurrency
SQLite uses page-level locking internally.
Reads can happen concurrently
Writes modify specific pages
When using WAL mode:
PRAGMA journal_mode = WAL;Readers and writers can operate together
Changes are written sequentially
This improves concurrency behavior.
If you want deeper insight into how this impacts multi-user environments, see
Optimizing SQLite for Multi User Applications.
Vacuum and Fragmentation
Over time:
Deleted rows leave gaps
Pages become fragmented
Running:
VACUUM;Rebuilds the database
Compacts pages
Improves B-tree structure
This keeps performance consistent.
Real World Insight
Imagine a SaaS analytics system:
Millions of rows stored in SQLite
Indexes on key columns
Frequent range queries
Understanding B-trees helps:
Choose correct indexes
Avoid unnecessary scans
Optimize insert patterns
This connects directly with large dataset handling strategies discussed in
Handling Large Datasets in SQLite.
Conclusion
SQLite’s performance and reliability come from its B-tree storage engine. Every table and index is built on this structure, allowing fast lookups, efficient writes, and predictable behavior.
By understanding how SQLite organizes data on disk, you gain the ability to:
Design better schemas
Write faster queries
Avoid performance pitfalls
Debug issues with confidence
SQLite may be lightweight, but its internals are deeply engineered for efficiency.
Subscribe Now
If you want practical, real-world SQLite architecture tutorials, subscribe to SQLite Forum.
Upcoming topics include:
Building offline-first sync systems with SQLite
SQLite replication strategies
Distributed data ownership patterns
SQLite B-tree storage internals
Subscribe to receive new articles directly.


