Focused Diffs: New --skinny and --include-cols Options for DOLT_DIFF() table function

FEATURE RELEASESQL
4 min read

Working with database diffs can be overwhelming when you're dealing with wide tables containing dozens of columns. You often find yourself scrolling through endless output just to identify what actually changed between commits. At DoltHub, we've been addressing this challenge through our work on Dolt, our flagship SQL database with Git-style version control and branching that serves as a drop-in MySQL replacement.

Today, we're excited to announce that the --skinny option from the dolt diff CLI command is now available in the DOLT_DIFF() table function, along with a brand new --include-cols option. While the --skinny flag has been helping CLI users focus on changed columns for some time, bringing it to the SQL table function unlocks new possibilities for programmatic analysis and integration with existing SQL workflows.

Example: Using the Skinny Option

Let's see this in action with a concrete example. Imagine we have a table with multiple columns, but we only update two of them:

-- Create a test table with many columns
CREATE TABLE test (
  pk BIGINT NOT NULL PRIMARY KEY,
  c1 BIGINT,
  c2 BIGINT,
  c3 BIGINT,
  c4 BIGINT,
  c5 BIGINT,
  c6 BIGINT
);

-- Insert initial data
INSERT INTO test VALUES (0, 1, 2, 3, 4, 5, NULL);
INSERT INTO test VALUES (1, 10, 20, 30, 40, 50, NULL);

-- Commit the initial state
CALL dolt_add('test');
CALL dolt_commit('-m', 'Initial data');

UPDATE test SET c6=600 WHERE pk=0;
UPDATE test SET c2=200 WHERE pk=1;

-- Commit the changes
CALL dolt_add('test');
CALL dolt_commit('-m', 'Updated some columns');

The Problem: Information Overload in Wide Table Diffs

Without --skinny (shows all columns):

SELECT * FROM dolt_diff('HEAD~1', 'HEAD', 'test');
+-------+-------+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+
| to_pk | to_c1 | to_c2 | to_c3 | to_c4 | to_c5 | to_c6 | to_commit | to_commit_date          | from_pk | from_c1 | from_c2 | from_c3 | from_c4 | from_c5 | from_c6 | from_commit | from_commit_date       | diff_type |
+-------+-------+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+
| 0     | 1     | 2     | 3     | 4     | 5     | 600   | HEAD      | 2025-09-15 21:41:16.724 | 0       | 1       | 2       | 3       | 4       | 5       | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
| 1     | 10    | 200   | 30    | 40    | 50    | NULL  | HEAD      | 2025-09-15 21:41:16.724 | 1       | 10      | 20      | 30      | 40      | 50      | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
+-------+-------+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+

When working with tables containing many columns, such as with the above, diff output can become unwieldy. This is a smaller case too, consider a user table with 20+ columns where only a few fields actually changed between commits. The standard DOLT_DIFF() output shows every column with to_* and from_* versions, even those that remained unchanged, this creates significant visual noise that obscures the actual changes.

Enter --skinny: Show Only What Matters

The new --skinny flag saves the day for DOLT_DIFF() by showing only the columns that actually changed, plus the primary key columns for row identification. This dramatically reduces visual clutter while maintaining all the essential information.

With --skinny (shows only changed columns, primary keys and commit metadata):

SELECT * FROM dolt_diff('--skinny', 'HEAD~1', 'HEAD', 'test');
+-------+-------+-------+-----------+-------------------------+---------+---------+---------+-------------+------------------------+-----------+
| to_pk | to_c2 | to_c6 | to_commit | to_commit_date          | from_pk | from_c2 | from_c6 | from_commit | from_commit_date       | diff_type |
+-------+-------+-------+-----------+-------------------------+---------+---------+---------+-------------+------------------------+-----------+
| 0     | 2     | 600   | HEAD      | 2025-09-15 21:41:16.724 | 0       | 2       | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
| 1     | 200   | NULL  | HEAD      | 2025-09-15 21:41:16.724 | 1       | 20      | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
+-------+-------+-------+-----------+-------------------------+---------+---------+---------+-------------+------------------------+-----------+

Notice how the skinny output excludes c1, c3, c4, and c5 columns since they didn't change, making it much easier to focus on the actual modifications.

Adding Context with --include-cols

Sometimes you want the focused view of --skinny but need additional columns for context, even if they didn't change. The --include-cols option lets you explicitly specify columns to include in the diff output.

-- Include specific columns for context
SELECT * FROM dolt_diff('--skinny', '--include-cols=c4,c5', 'HEAD~1', 'HEAD', 'test');
+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+
| to_pk | to_c2 | to_c4 | to_c5 | to_c6 | to_commit | to_commit_date          | from_pk | from_c2 | from_c4 | from_c5 | from_c6 | from_commit | from_commit_date       | diff_type |
+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+
| 0     | 2     | 4     | 5     | 600   | HEAD      | 2025-09-15 21:41:16.724 | 0       | 2       | 4       | 5       | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
| 1     | 200   | 40    | 50    | NULL  | HEAD      | 2025-09-15 21:41:16.724 | 1       | 20      | 40      | 50      | NULL    | HEAD~1      | 2025-09-15 21:41:14.34 | modified  |
+-------+-------+-------+-------+-------+-----------+-------------------------+---------+---------+---------+---------+---------+-------------+------------------------+-----------+

Flexible Column Specification

The --include-cols option supports multiple syntax formats for convenience, making use of our common CLI parser:

-- Comma-separated list
SELECT * FROM dolt_diff('--skinny', '--include-cols=c1,c2,c3', 'HEAD~1', 'HEAD', 'test');

-- Space-separated arguments at the end
SELECT * FROM dolt_diff('--skinny', 'HEAD~1', 'HEAD', 'test', '--include-cols', 'c1', 'c2', 'c3');

Both approaches produce identical results, giving you flexibility in how you structure your queries.

Why SQL Table Functions Matter

While the --skinny flag has been available in the dolt diff CLI command for some time, bringing it to the DOLT_DIFF() table function opens up powerful new use cases. Dolt's version control operations are exposed as procedures, functions, or system tables in SQL, making them accessible to any application that can connect to a MySQL-compatible database.

CLI Parity and New Features

The dolt diff CLI command continues to support --skinny and now also supports the new --include-cols option, maintaining consistency across interfaces:

# CLI usage with skinny flag (existing functionality)
dolt diff --skinny HEAD~1 HEAD test

# CLI usage with new include-cols option
dolt diff --skinny --include-cols=c1,c2 HEAD~1 HEAD test

This unified approach ensures that whether you're working in SQL or on the command line, you have access to the same powerful filtering capabilities.

As we've discussed in our work on agentic workflows and database testing, focused tooling enables more efficient analysis and reduces cognitive overhead when working with complex data changes.

Conclusion

The --skinny and --include-cols options represent a significant step forward in making database diff analysis more manageable and focused. By showing only what matters while allowing explicit inclusion of context, these features help you understand your data changes more efficiently.

These features are available in Dolt v1.59.5 and later. Update to the latest release to start using them. Whether you're tracking schema evolution or simply trying to understand what changed between commits, these new options provide the focused view you need to work more effectively with version-controlled databases.

Ready to try focused diffs in your own database? Have questions or want to share your experience? Join us on Discord to discuss with the community.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.