Create a Database of 3000 Lines with 0 Dependencies | Blog

01. Complex systems are built from simple ideas
Complex software such as databases, compilers, and browsers are treated like black boxes. You use it every day as a usersbut you may not understand them as a programmeralthough they are nothing but code. Why?
- They have little in common with the day-to-day work of programmers.
- Their code bases are so large, so debilitating.
But that doesn’t mean you can’t learn these things. People make small, toy versions of these things for learning purposes, such as C in 4 Functions, Translators in the Making, Web Browser Engineering. It turns out that if you minimize the uninteresting details and focus on the essentials, you can recreate seemingly complex software and learn its main ideas.
For compilers, some of the main ideas include:
- Parsing objects with recursion.
- Represent a program as a tree and simulate it.
- Represents a program as a list of instructions with gotos.
- Stack machine or register machine.
None of these ideas are complicated. So a compiler is a practical and useful exercise for programmers.
02. What are the core ideas of databases?
I built a small database of 3000 lines
from scratch Go to learn the core ideas of databases. It’s not complicated if you approach it one way.
2.1 Loss of power atomicity
A database stores data on disk, but why not just use files? Why is a filesystem not a database? Because databases eliminate the common source of data corruption: partially written files due to crashes or power outages.
Video games warn you not to turn off the power while saving, because a partially written save file will destroy all your progress. Building a database will teach you how to eliminate this concern. The solution is a add-only log with checksum for each object.
- Append-only means that writes do not destroy existing data.
- The checksum means that partial writes are detected and discarded, which increases the atomic power loss.
╔═══╤════╦═══╤════╦══╌╌╌══╦═══╤════╗
║sum│data║sum│data║more...║sum│data║
╚═══╧════╩═══╧════╩══╌╌╌══╩═══╧════╝
╰───┬────╯
possible partial write
These are the first 2 ideas. But they are not enough to create a database.
2.2 Index data with data structures
Logs have uses for systems like Kafka, but you can’t just put everything in a log and forget about it. You need to find things in it, so you need data structures like B + trees, LSM-trees, hashtables. That’s where I started, I built a B+tree of 366 lines.
But if you just put a data structure on disk, you have a partial write problem. Is it possible to apply log-based ideas to data structures? There are 2 ideas:
The first idea is to create an append-only data structure, like a log. There are ways to create a B+tree append-only, also called a
copy-on-write B+tree. A copy-on-write tree does not overwrite existing nodes, it creates new nodes from leaf to root for the entire path. New nodes can be added like a log.
d d D*
/ \ / \ / \
b e ──► b e + B* e
/ \ / \ / \
a c a c a C*
original updated
Another idea is to use a log and a main data structure:
- Before updating the main data structure, save the intended updates to the log. Each log table contains a “write this data to that position” list.
- Then apply the updates to the main data structure.
- When the database starts, it always uses the last log table to fix a possible partial write to the main data structure, regardless of its state.
I chose the copy-on-write B+ tree because it does not require a log, which reduces the LoC. Transferring the B+ tree to disk involves interfacing with the filesystem, including fsync()
. At 601 lines, I have an append-only KV store on disk.
2.3 Recycle and reuse unused space
I need to take care of the append-only result because the storage is not infinite. I just added a free recycle list of unused B+tree nodes. At 731 lines, I have a functional KV store.
first_item
↓
list_head ─► ( next | xxxxx )
↓
( next | xxxxxxxx )
↓
list_tail ─► ( NULL | xxxx )
↑
last_item
2.4 Relational DB in KV
An SQL query can be arbitrarily complex, but it boils down to 2 data structure operations: point query and range query. So I started with B+tree data structures. Databases are built on data structures, not some theory about relationships.
I added a layer on top of KV to store records with primary keys and columns. Each record is encoded as a KV pair. LoC went up to 1107. But still only KV. So I added 2 more parts:
- Range question, 1294 lines.
- Secondary index, 1438 lines.
I’m not bothering with SQL at this point because SQL is just a user interface. I’m building APIs for databases called a storage engine.
2.5 Concurrent control
I can ignore concurrency and make everything serialized. But using a copy-on-write B+ tree means that readers get snapshot isolation for free, because updates don’t break old versions. So the readers will not be blocked by the writer.
I went one step further to allow write transactions to run concurrently, and then merge the updates at the end of the transaction. This requires identifying and rejecting conflicting updates. In 1702 lines, I have a transactional interface.
2.6 SQL-like query language
My user interface is just a set of APIs to operate the database. The next step is the query language. I chose an SQL-like syntax. I’m only implementing simple queries with a point/range query, so no query planner.
SQL parsing is not a database topic, but a compiler topic. This is done by using recursion in a certain way, which is hardly an idea if you know it.
A query language is not just about data, it can perform calculations, such as
SELECT a + b FROM c
. So I need a translator.
Then I need a lot of glue code between the query language and the database. The lines of code increased from 1702 to 2795, most of them uninteresting.
03. Database of 3000 lines, incrementally
This is my minimalist attempt to create a database from scratch. Each step adds an important feature while minimizing uninteresting details.
366 | B+tree data structure. |
601 | Just add KV. |
731 | Practical KV with free list. |
1107 | KV tables. |
1294 | Range questions. |
1438 | Second indexes. |
1461 | Transaction interfaces. |
1702 | Concurrency control. |
2795 | Like SQL query language. |
The lessons are like a book
I made it a BOOK so you can follow my steps. The first half of the book is free online and covers the most interesting parts of the storage engine, which can be used as a KV on its own.
3000 lines is not much, but it is an effective exercise. The first edition only took me 2 months of free time. The book is updated every time I learn something new, so this is the second edition.
https://build-your-own.org/byo_logo_256.png
2025-01-16 16:59:00