Skeema: Streamlining Database Schema Migration Efforts
Written on
Chapter 1: Introduction to Skeema
In this article, we delve deeper into tools for synchronizing database schemas, expanding upon two earlier discussions I’ve had on this subject. I’d like to introduce Skeema, a library created for database schema migration that focuses on differences, developed using Go.
Background
Previously, at my last job, a senior architect devised a tool akin to Skeema, which also synchronizes table structures by assessing differences. However, that tool utilized YAML for defining table structures. During that time, I sought an open-source alternative but found none that met my needs.
Overview
Skeema allows for database schema migration by focusing solely on the final desired table structure. This tool simplifies the synchronization of database schemas across various environments. Skeema also supports linting of SQL statements, making it easier to integrate into Continuous Integration/Continuous Deployment (CI/CD) processes to enhance schema quality. By default, it restricts certain risky database operations like table and column deletions.
However, it is worth noting that Skeema currently only supports MySQL and MariaDB.
Installation
Installing Skeema is a straightforward process. For MacOS users, it can be installed via Homebrew:
brew install skeema/tap/skeema
Alternatively, you can use go get, provided your Go version is v1.21 or higher:
$ go install github.com/skeema/[email protected]
For other operating systems, binary files can be downloaded from Skeema's GitHub page, along with installation instructions.
Usage
Using Skeema can be broken down into five primary steps:
- Generate the initial table creation SQL with skeema init.
- Modify the SQL file as needed.
- Use skeema diff to confirm differences before submission.
- Run skeema lint to ensure the SQL adheres to linter rules.
- Finally, execute skeema push to apply the SQL changes and update the database schema.
Let’s explore these steps in detail.
Initialization
To start, use the command below to initialize your directory with the existing database structure, generating the initial table creation statements:
$ skeema init -h 127.0.0.1 -uroot -ppassword --schema blog -d .
The --schema flag specifies the target database name. If omitted, it will generate SQL for all databases. The -d/--dir flag designates the target directory.
After executing, you’ll see a .skeema configuration file in the specified directory, which might look like this:
default-character-set=utf8mb4
default-collation=utf8mb4_0900_ai_ci
generator=skeema:1.11.1-community
schema=blog
[production]
flavor=mysql:8.3
host=127.0.0.1
port=3306
user=root
To avoid entering your password for subsequent commands, you can include a password configuration in the file.
Generating Differences
Next, let's modify a table. For instance, adding an author_id field to the comments table in the article database. The modified creation statement will be:
CREATE TABLE comments (
...author_id varchar(255) NOT NULL,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Use the command below to view differences:
$ skeema diff
This will output the SQL command needed to add the author_id column:
ALTER TABLE comments ADD COLUMN author_id int NOT NULL AFTER post_id;
Reviewing these differences allows you to confirm that the SQL aligns with your expectations.
Linter
Skeema includes a built-in linter tool to assess table structure definitions, helping to catch common errors. Run:
$ skeema lint
Skeema’s linter offers various rule options, which you can explore in the official documentation. Some noteworthy rules include:
- Ensuring the default character set is set to utf8mb4.
- Validating that tables and columns utilize the recommended storage engine.
- Advising on appropriate data types for primary keys.
Let’s create a sample table with potential issues to test the linter. For example:
CREATE TABLE comments (
id CHAR(32) NOT NULL,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
After running skeema lint, you might receive feedback about incorrect data types and duplicate indexes, highlighting the importance of this tool.
Applying Changes
The final step involves applying the changes using the skeema push command. This command executes the updates in your database.
Multi-environment Configuration
Skeema allows for distinct configurations across various environments, such as production and development, as demonstrated in the earlier configuration example. To add a new environment, execute:
$ skeema add-environment dev --host 127.0.0.1 -uroot -ppassword
Safe Operations
By default, Skeema restricts potentially harmful operations, like deleting tables or columns, protecting your data integrity. For development environments where such operations might be necessary, you can configure the tool to allow unsafe actions.
Conclusion
Skeema is a powerful, user-friendly tool for managing and synchronizing database schemas based on differences. Whether you’re dealing with architectural changes during feature development or need to sync schemas across environments, Skeema offers effective solutions.
If, like me, you’ve struggled with managing and synchronizing database table structures, consider giving Skeema a try—you might find it just what you need.
Discover how to automate MySQL schema management with Skeema in this informative video.
Learn how to automate schema migration flows using GitHub Actions, Skeema, and gh-ost in this comprehensive guide.