RFC 001 - The History Database

  • Feature Name: The History Database
  • Start Date: 2021-09-16
  • RFC PR: realismlabs/dataland-engineering-handbook#3
  • Author: Aarash Heydari

Summary

Introduce the history database, enabling analytical SQL queries against past points in time of a database. For each tenant database in db-server, a copy of the history of that database is persisted in a separate "history database". The primary APIs of the history database execute read-only SQL queries with an extra parameter t_query, having time-machine semantics: "Execute the SQL query against the past state of the base database at time t_query".

Motivation

A number of features require, in some way, the persistence of past states of a database.

  1. Changelog / "Diff" view of a database
  • Figma Link
  • The goal is to provide an experience similar to Git's commit history, but for edits to a database.
  • There are primarily two modes of changes:
    1. Schema-level changes, including adding/removing/changing columns or tables.
    • The user should be able to easily query the history of schema changes.
    2. Row-level changes.
    • The user should be able to easily query the history of changes to a particular row.
  • Solely leveraging the transaction log within db-server for the above would be doable (by dynamically replaying the log to resolve the state of an individual object at a given time), but this approach falls apart if we want to run any kind of analytical query over the past, or query related data.
  1. "Consistent view of the world" for async workers
  • Workers consume and act upon DB changes/events by listening to a FIFO queue.

  • Workers want to be able to run SQL queries against their source DB in response to events. Their predictable use-cases include the following:

    • What was the pre-/post-image of a row before/after a transaction?
    • What was the state of some other row related to the updated row (potentially from a different table)?
    • Arbitrary analytical queries, such as averaging over a column or doing complex joins.
  • But, reading directly from the source DB suffers from a "reading from the future" problem.

    The 'reading from the future' problem

    Suppose we have 3 events:

    1. Insert Row 1 into Table T1 with {a: 1, b: 2}
    2. Update Row 1 with {b: 5}
    3. Drop Table T1

    A worker may be responding to event (2) here and be interested in the pre-image of Row 1 before event (2). However, if the workers query db-server directly, they will find that the table T1 has already been dropped, such that the data that the worker needs no longer exists.

    To ensure workers can always query their relevant data, they require the ability to query a "frozen in time" past version of the DB corresponding to the timestamp of the event that they are responding to.

  3. Providing time-machine query-ability to databases in general is powerful and may be useful as a standalone feature.

Guide-level explanation

What does it mean to query a frozen-in-time version of a database?

  1. If a row has a particular cell value changed at t_x, then querying the row with query timestamp t_x-1 should yield the old value, and querying with t_x+1 should yield the new value.

  2. If a table is renamed from my_table to my_table_2 at time t_x, then loading my_table with timestamp t_x-1 should succeed, but loading my_table_2 at that time should fail, and vice versa for t_x+1. Dropping or creating a new table behaves similarly to the renaming case.

  3. Similarly, if a column is renamed (or dropped & another one is created) then loading the table with a query timestamp before the rename should yield the column under the old name, and a query timestamp after the rename should yield the column under the new name.

Relationship with Tenant Databases

Each history database mirrors one db-server tenant database, according to the following mapping:

Tables

The schema of a table can change over time. Suppose my_table is created at time t_schema1, then row insertions and deletions happen. A column is dropped at time t_schema2 and then a new column is added at t_schema3.

Depending on t_query, the columns which should be returned from SELECT * FROM my_table will differ. To support all previous schemas of a table, the history table schema for a given table must be a superset of all past schemas of the table. This table shall be called the supertable of the base table. The base table and supertable share the same table_uuid.

To provide the illusion of the desired schema at time t_query, subqueries which select the desired subset of the supertable schema are used.

Schema History

Each time the schema of a base table changes, a new row is added to the Schema History table, which has the following structure:

-- Each time a schema change to any table within db-server occurs,
--  a new row is added to this table.
create table _history_catalog.schema_history (
  -- The UUID of the table / supertable which were altered.
  table_uuid uuid not null,

  -- The user-assigned name of the table.
  table_name text not null,

  -- The logical timestamp of this schema change.
  logical_timestamp bigint not null unique,

  -- A struct-ified representation of a schema, including column UUIDs/names/data-types.
  -- Null for a table which has been dropped (or renamed).
  schema_descriptor bytea
);

Where schema_descriptor is actually the TableDescriptor proto message from rtdp.proto. It resembles the following:

{
  // Identifier for both the base table and the supertable.
  "table_uuid": "xxxxx-xxxxxxxxx-xxxx",
  // User-assigned name for the base table.
  "table_name": "my_table_name",

  // The schema of the table at this point in time.
  "column_descriptors": [
        // Identifier for both the base column and the supercolumn.
        "column_uuid": "xxxxx-xxxxxxxxx-xxxx",
        // User-assigned name for the base column.
        "column_name": "my_column_name",
        "data_type": "<enum>", // the SQL type of the column
        "is_nullable": true,

        /// The position of the column in the total ordering across all of the table's columns,
        /// specified as a lexicographic key.
        /// The synthetic key and ordinal key columns should always come first and second.
        "column_ordinal_key": "<key>",
  ]
}

To fetch the relevant schema for a given table at a given timestamp, the following SQL can be used:

SELECT table_uuid, schema_descriptor from _history_catalog.schema_history
WHERE table_name = $1 AND logical_timestamp < $2
ORDER BY logical_timestamp DESC LIMIT 1

Columns

All columns which ever existed in the base table are associated with a "supercolumn" in the supertable. Supercolumns are never dropped, because even if the base column is dropped, queries under past timestamps must still see the column and its values.

Supercolumns are created whenever base columns are created or have their data type changed. They are NOT created when base columns are renamed.

The supercolumns are named according to the pattern <column_uuid>-<logical_timestamp of column creation>.

Including the logical timestamp in the supercolumn name assists in cases of data-type changes to the base table schema. Data-type changes require the old version and new version of the column to exist side-by-side as separate supercolumns.

All supercolumns are always nullable. This is to support dropping of a column while maintaining the supercolumn. After a base column is dropped, any subsequent row insertions should write null to the corresponding supercolumn.

Rows

  1. All row mutations (altering a cell value, deleting a row, inserting a new row) against a given base table cause a new row to be inserted into the corresponding history table.
  2. All rows written to the history database are augmented with the following two reserved columns:
  • logical_timestamp: The logical timestamp of the base database event which triggered the history change. This is used to facilitate querying a frozen point-in-time by ignoring all items with a logical_timestamp greater than the query_timestamp of the request.
  • is_deleted: When a base database row is deleted, a history row is inserted with this boolean set to true. This is used in conjunction with logical_timestamp to filter out rows that have already been deleted at a given t_query.

The following invariant should always be true:

The Reconstruction Invariant

The following query against the supertable of my_table should always match the state of the base my_table as it appeared at time $t_query.

WITH ranked_by_age_within_key AS (
  SELECT
    *,
    ROW_NUMBER() OVER (partition by synthetic_key
      order by logical_timestamp desc) AS rank_in_key
  FROM <table_uuid>
  WHERE logical_timestamp < $t_query
)
SELECT
  -- Subset of the supertable's columns based on the schema from time $t_query
  <column_uuid1> as <column_name1>,
  <column_uuid2> as <column_name2>,
  ...
FROM ranked_by_age_within_key
-- Only accept the newest version of every base row, and only rows which have not been deleted.
WHERE rank_in_key = 1 and is_deleted = false

...where ROW_NUMBER() and rank_in_key ensure that only the newest version of each row is selected, even though many versions of that row may exist in the supertable with different logical_timestamps.

NOTE: SELECT DISTINCT ON or an alternative window-function formulation may lead to a more performant version of the above query. See https://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group

Tying it together

To load the table table_uuid_1 from the history database at time t_query:

  1. First, verify that t_query is "in the past" from the perspective of the history database. Throw an error if t_query is greater than the last logical timestamp that the history database has processed. (Each database tracks the last processed logical_timestamp.)

  2. Resolve the schema of the table from the POV of t_query using a query like:

SELECT table_uuid, schema_descriptor from _history_catalog.schema_history
WHERE table_name = $1 AND logical_timestamp < $2
ORDER BY logical_timestamp DESC LIMIT 1
  3. The schema_descriptor will specify which columns of the supertable were relevant at the point in time t_query, leading to a query resembling the following:
WITH frozen_in_time AS (
  WITH ranked_by_age_within_key AS (
    SELECT
      *,
      ROW_NUMBER() OVER (partition by synthetic_key
        order by logical_timestamp desc) AS rank_in_key
    FROM <table_uuid>
    WHERE logical_timestamp < $t_query
  )
  SELECT
    -- Subset of the supertable's columns based on the schema from time $t_query
    <column_uuid1> as <column_name1>,
    <column_uuid2> as <column_name2>,
    ...
  FROM ranked_by_age_within_key
  -- Only accept the newest version of every base row, and only rows which have not been deleted.
  WHERE rank_in_key = 1 and is_deleted = false
)
SELECT <arbitrary customer SQL> FROM frozen_in_time

APIs

The initial APIs query the past state of a row, or the past state of a table/schema.

service HistoryService {
  // Fetch an individual row at a given point in time.
  rpc FetchRow(FetchRowRequest) returns (FetchRowResponse);

  // Load an entire table at a given point in time.
  rpc LoadTable(LoadTableRequest) returns (LoadTableResponse);

  // Get the catalog/schemas at a given point in time.
  rpc GetCatalog(GetCatalogRequest) returns (GetCatalogResponse);
}

message FetchRowRequest {
  // Table Identifier
  bytes table_uuid = 1;

  // The `primary_key` used to uniquely identify a row from the source/tenant DB.
  sint64 synthetic_key = 2;

  // Load the row as it appeared at this point in time.
  uint64 query_timestamp = 3;
}

message FetchRowResponse {
  // An ordered list of the column UUIDs in the response.
  repeated bytes column_uuid_mapping = 1;

  // An ordered list of the values within the row, corresponding to the column ordering above.
  repeated Scalar values = 2;
}

message LoadTableRequest {
  // Table Identifier
  bytes table_uuid = 1;

  // Load the table as it appeared at this point in time.
  uint64 query_timestamp = 2;
}

message LoadTableResponse {
  // An ordered list of the column UUIDs in the response.
  repeated bytes column_uuid_mapping = 1;

  // An ordered list of the values within the table, corresponding to the column ordering above.
  repeated Scalar values = 2;
}

message GetCatalogRequest {
  // View the catalog schemas as they appeared at this point in time.
  uint64 query_timestamp = 1;
}

message GetCatalogResponse {
  // The name/uuid/columns of all tables.
  repeated TableDescriptor table_descriptors = 1;
}

Reference-level explanation

Each workspace's history database includes the following pieces of state:

  1. _history_catalog.schema_history -- A table which tracks all changes to schemas of all tables within the database.
  2. _supertables.XXXXXX -- A schema which includes one supertable (identified by the table_uuid) for each table that has ever existed in the base database.
  3. _history_catalog.last_logical_timestamp -- A table with 1 row which always stores (and atomically updates) the most recent logical timestamp which completed processing. This assists in rejecting queries that try to read the future.

CreateDatabase handler (New workspace creation)

db-server runs tenant-db-migrations to bootstrap some resources when a new workspace is created. The history database will do the same.

  1. Initialize the _history_catalog.schema_history table, the _history_catalog.last_logical_timestamp table, and the _supertables schema.
-- One catalog schema per workspace, holding schema history and metadata
create schema _history_catalog;

-- 1 supertable per base table
create schema _supertables;

-- Each time a schema change to any table within the base database occurs,
--  a new row is added to this table.
create table _history_catalog.schema_history (
  -- The UUID of the table / supertable which were altered.
  table_uuid uuid not null,

  -- The user-assigned name of the table.
  table_name text not null,

  -- The logical timestamp of this schema change.
  logical_timestamp bigint not null unique,

  -- A struct-ified representation of a schema, including column UUIDs/names/data-types.
  -- Null for a table which has been dropped (or renamed).
  schema_descriptor bytea
);

-- This table stores a single row which is atomically updated every time
--  a new event from the base database is processed.
create table _history_catalog.last_logical_timestamp (
  logical_timestamp bigint not null unique
);
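
As a minimal sketch of how this single-row table could be used (assuming it is seeded with one row at bootstrap time and that each event carries its logical timestamp as a $event_logical_timestamp parameter), the watermark is advanced while applying each event and consulted before serving a query:

-- Hypothetical sketch: advance the watermark as part of applying each event.
UPDATE _history_catalog.last_logical_timestamp
SET logical_timestamp = $event_logical_timestamp
WHERE logical_timestamp < $event_logical_timestamp;

-- Before serving a query, reject t_query values that are still "in the future".
SELECT logical_timestamp >= $t_query AS is_servable
FROM _history_catalog.last_logical_timestamp;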

Supertables - Row Mutations

Consider the following example of how inserting, updating, and deleting rows in the base table affects the supertable.

Table State 1:

| synthetic_key | column1 |
| --- | --- |
| id1 | value1 |

Supertable state 1:

| synthetic_key | <column1_uuid>-t1 | is_deleted | logical_timestamp |
| --- | --- | --- | --- |
| id1 | value1 | f | 1 |

Table State 2: insert a row

| synthetic_key | column1 |
| --- | --- |
| id1 | value1 |
| id2 | value2 |

Supertable state 2:

| synthetic_key | <column1_uuid>-t1 | is_deleted | logical_timestamp |
| --- | --- | --- | --- |
| id1 | value1 | f | 1 |
| id2 | value2 | f | 2 |

Table State 3: edit value2

| synthetic_key | column1 |
| --- | --- |
| id1 | value1 |
| id2 | value2_prime |

Supertable state 3: There are now two rows corresponding to synthetic key 2 with 2 different logical timestamps!

| synthetic_key | <column1_uuid>-t1 | is_deleted | logical_timestamp |
| --- | --- | --- | --- |
| id1 | value1 | f | 1 |
| id2 | value2 | f | 2 |
| id2 | value2_prime | f | 3 |

Table State 4: Delete the row of id1

| synthetic_key | column1 |
| --- | --- |
| id2 | value2_prime |

Supertable State 4: The deletion causes the insertion of a new row with is_deleted = true

| synthetic_key | <column1_uuid>-t1 | is_deleted | logical_timestamp |
| --- | --- | --- | --- |
| id1 | value1 | f | 1 |
| id2 | value2 | f | 2 |
| id2 | value2_prime | f | 3 |
| id1 | null | t | 4 |

RowMutation - Insert handler

Straightforwardly insert a new row into the supertable using the synthetic key from the base table's item, copying the data over, with is_deleted=false and logical_timestamp set. The new row will be the first and only entry corresponding to the given synthetic key.
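
A minimal sketch of the resulting write, using placeholder names ($synthetic_key, $column1_value, $event_logical_timestamp, and the <column1_uuid>-t1 supercolumn are illustrative):

-- Hypothetical sketch: handle an Insert row mutation.
-- $synthetic_key, $column1_value, and $event_logical_timestamp come from the event.
INSERT INTO _supertables."<table_uuid>"
  (synthetic_key, "<column1_uuid>-t1", is_deleted, logical_timestamp)
VALUES
  ($synthetic_key, $column1_value, false, $event_logical_timestamp);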

RowMutation - Update handler

The row update doesn't fully specify the post-update attributes of the row. Therefore, this handler must query the most recent state of the row identified by the synthetic key (before the update), copy over the values from that latest version, apply the update, and write the new fully-specified item back to the supertable. There will now be more than one row in the supertable corresponding to this synthetic key, and either one could be served to the user depending on t_query.
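
A minimal sketch, again with placeholder names, assuming the event only updates column1 and the table has one other column whose supercolumn is <column2_uuid>-t2:

-- Hypothetical sketch: handle an Update row mutation.
WITH latest AS (
  -- The newest existing version of the row, prior to this update.
  SELECT *
  FROM _supertables."<table_uuid>"
  WHERE synthetic_key = $synthetic_key
  ORDER BY logical_timestamp DESC
  LIMIT 1
)
INSERT INTO _supertables."<table_uuid>"
  (synthetic_key, "<column1_uuid>-t1", "<column2_uuid>-t2", is_deleted, logical_timestamp)
SELECT
  synthetic_key,
  $new_column1_value,   -- value supplied by the update event
  "<column2_uuid>-t2",  -- untouched columns are copied from the latest version
  false,
  $event_logical_timestamp
FROM latest;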

RowMutation - Delete handler

Similarly to the update case, deletions will insert a new row to the supertable having is_deleted=true. If a query requests a point in time after the deletion, then the newest row corresponding to the synthetic key will be this 'deleted' row, which would be filtered out of the results set.

Unlike the Update case, it's not essential to copy over the row's previous values into the new deletion row, because this row only exists for the purpose of being filtered out based on its timestamp - the values are never read. It is acceptable to use null for all values other than the synthetic key and the reserved Dataland fields.
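
A minimal sketch of the deletion marker, with placeholder parameters:

-- Hypothetical sketch: handle a Delete row mutation.
-- All supercolumns are left null; only the reserved columns carry information.
INSERT INTO _supertables."<table_uuid>"
  (synthetic_key, is_deleted, logical_timestamp)
VALUES
  ($synthetic_key, true, $event_logical_timestamp);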

Supertables and Schema History - Schema changes

Table State 1:

| synthetic_key | column1 (BIGINT) |
| --- | --- |
| id1 | 1 |

Supertable State 1:

| synthetic_key | <column1_uuid>-t1 (BIGINT) | is_deleted | logical_timestamp |
| --- | --- | --- | --- |
| id1 | 1 | f | t1 |

Schema History State 1:

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| <table_uuid_1> | my_table | t1 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }} |

Table State 2: Add column2 with default value 2 and type TEXT.

| synthetic_key | column1 (BIGINT) | column2 (TEXT) |
| --- | --- | --- |
| id1 | 1 | 2 |

Supertable State 2: Add the column.

| synthetic_key | <column1_uuid>-t1 (BIGINT) | <column2_uuid>-t2 (TEXT) | is_deleted | logical_timestamp |
| --- | --- | --- | --- | --- |
| id1 | 1 | 2 | f | t1 |

Schema History State 2: Add a new schema including the new column.

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| <table_uuid_1> | my_table | t1 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }} |
| <table_uuid_1> | my_table | t2 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }, "column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |

Table State 3: drop column1

| synthetic_key | column2 (TEXT) |
| --- | --- |
| id1 | 2 |

Supertable State 3: No-op against the supertable.

Schema History State 3: Add a new schema which doesn't include the dropped column1.

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| <table_uuid_1> | my_table | t1 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }} |
| <table_uuid_1> | my_table | t2 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }, "column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |
| <table_uuid_1> | my_table | t3 | {"column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |

Table State 4: Change the type of column2 to BIGINT.

| synthetic_key | column2 (BIGINT) |
| --- | --- |
| id1 | 2 |

Supertable State 4: Add a new supertable column corresponding to the new type, preserving the ability to query against the "old" type.

| synthetic_key | <column1_uuid>-t1 (BIGINT) | <column2_uuid>-t2 (TEXT) | <column2_uuid>-t4 (BIGINT) | is_deleted | logical_timestamp |
| --- | --- | --- | --- | --- | --- |
| id1 | 1 | 2 | 2 | f | t1 |

Schema History State 4: Add a new schema which changes the type and supercolumn_name of column2.

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| <table_uuid_1> | my_table | t1 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }} |
| <table_uuid_1> | my_table | t2 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }, "column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |
| <table_uuid_1> | my_table | t3 | {"column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |
| <table_uuid_1> | my_table | t4 | {"column2": { "type": "BIGINT", "supercolumn_name": "<column2_uuid>-t4" }} |

Table State 5: Rename column2 to mycolumn2.

| synthetic_key | mycolumn2 (BIGINT) |
| --- | --- |
| id1 | 2 |

Supertable State 5: No-op against the supertable.

Schema History State 5: Add a new schema which changes the symbol column2 without changing the supercolumn_name.

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| <table_uuid_1> | my_table | t1 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }} |
| <table_uuid_1> | my_table | t2 | {"column1": { "type": "BIGINT", "supercolumn_name": "<column1_uuid>-t1" }, "column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |
| <table_uuid_1> | my_table | t3 | {"column2": { "type": "TEXT", "supercolumn_name": "<column2_uuid>-t2" }} |
| <table_uuid_1> | my_table | t4 | {"column2": { "type": "BIGINT", "supercolumn_name": "<column2_uuid>-t4" }} |
| <table_uuid_1> | my_table | t5 | {"mycolumn2": { "type": "BIGINT", "supercolumn_name": "<column2_uuid>-t4" }} |


From this point, dropping the base table would be a no-op against the supertable, and would insert one final entry into the schema history with a null schema_descriptor, indicating that the table no longer exists:

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| ... | ... | ... | ... |
| <table_uuid_1> | my_table | t6 | null |

AddColumn handler

  • Supertable: Add a new column of matching type into the supertable, with the name <new_column_uuid>-<logical_timestamp>.
  • Schema History: Find the most recent schema history for the table. Insert a new row which copies all values, only modifying the schema_descriptor by appending a key-value mapping for the new column.
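
A minimal sketch of both steps above, with placeholder names; the $updated_schema_descriptor value is assumed to be the re-serialized TableDescriptor after appending the new column:

-- Hypothetical sketch: AddColumn for a TEXT column added at logical time t2.
ALTER TABLE _supertables."<table_uuid>"
  ADD COLUMN "<new_column_uuid>-t2" TEXT;  -- supercolumns are always nullable

INSERT INTO _history_catalog.schema_history
  (table_uuid, table_name, logical_timestamp, schema_descriptor)
VALUES
  ($table_uuid, $table_name, $event_logical_timestamp, $updated_schema_descriptor);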

DropColumn handler

  • Supertable: No-op.
  • Schema History: Find the most recent schema history for the table. Insert a new row which copies all values, only modifying the schema_descriptor by removing a key-value mapping for the dropped column.

RenameColumn handler

  • Supertable: No-op, because the supercolumn remains the same.
  • Schema History: Find the most recent schema history for the table. Insert a new row which copies all values, only modifying the schema_descriptor by editing the key of the key-value mapping for the renamed column.

ChangeColumnDataType handler

  • Supertable: Add a new column corresponding to the new data type with default=Null. Loop over the rows which currently exist in the base table and update those rows in the supertable by setting the value of the new column based on the old value for that row and the provided casting function.
  • Schema History: Find the most recent schema history for the table. Insert a new row which copies all values, only modifying the schema_descriptor by editing the type and supercolumn_name for the altered column.
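
A minimal sketch of the supertable side, following the column2 TEXT-to-BIGINT example above and assuming the casting function can be expressed as a SQL CAST; a set-based backfill is shown here, though the handler could equally loop row-by-row:

-- Hypothetical sketch: column2 changes from TEXT to BIGINT at logical time t4.
ALTER TABLE _supertables."<table_uuid>"
  ADD COLUMN "<column2_uuid>-t4" BIGINT;  -- defaults to null

-- Backfill the new supercolumn for the newest, non-deleted version of each row,
-- i.e. the rows which currently exist in the base table.
UPDATE _supertables."<table_uuid>"
SET "<column2_uuid>-t4" = CAST("<column2_uuid>-t2" AS BIGINT)
WHERE (synthetic_key, logical_timestamp) IN (
  SELECT synthetic_key, MAX(logical_timestamp)
  FROM _supertables."<table_uuid>"
  GROUP BY synthetic_key
)
AND is_deleted = false;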

ChangeColumnIsNullable handler

  • Supertable: No-op for the supertable. The supertable always allows all columns to be nullable, so that it can behave correctly if a column is dropped -- when new insertions occur, they should write null to the no-longer-existing column. Because the supertable's data comes from the base table, the values sent to a NOT NULL column of the base table should always produce non-null values in the supertable.

  • Schema History: In order for GetCatalog to return a correct table_descriptor, we still need to insert a new row into the schema_history table.

ReorderColumn handler

  • Supertable: Obvious no-op.
  • Schema History: Because the table_descriptor of the base table has changed, we must insert a new entry into the schema_history table to record the new column order.

CreateTable handler

  • Supertable: Create a new supertable, using the naming scheme <column_uuid>-<logical_timestamp> for each column, with three reserved columns: synthetic_key, is_deleted, and logical_timestamp.
  • Schema History: Insert a new row into the schema_history table corresponding to the initial schema of the table.
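
A minimal DDL sketch for a new base table with a single BIGINT column created at logical time t1 (placeholder names and parameters throughout):

-- Hypothetical sketch: create the supertable and record the initial schema.
CREATE TABLE _supertables."<table_uuid>" (
  -- Reserved columns
  synthetic_key       bigint  NOT NULL,
  is_deleted          boolean NOT NULL,
  logical_timestamp   bigint  NOT NULL,
  -- One nullable supercolumn per base column
  "<column1_uuid>-t1" bigint
);

INSERT INTO _history_catalog.schema_history
  (table_uuid, table_name, logical_timestamp, schema_descriptor)
VALUES
  ($table_uuid, $table_name, $event_logical_timestamp, $initial_schema_descriptor);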

DropTable handler

Supertable: No-op, because all of the data must remain queryable for query timestamps before the drop time.

Schema History: Newer-timestamped attempts to query the table should fail. To facilitate this, a new row is inserted into schema_history marking the table name/uuid as deleted, by setting the schema_descriptor to null.

RenameTable handler

Supertable: No-op because none of the underlying data changes. The supertable is actually named using the table_uuid.

The schema_history needs to mark the symbol name of the old table as now-invalid, and the new name as valid. Therefore, two insertions are done: One which looks similar to dropping the old table (marking the schema_descriptor as null), and another which looks similar to creating a new table with the new name.

| table_uuid | table_name | logical_timestamp | schema_descriptor |
| --- | --- | --- | --- |
| ... | ... | ... | ... |
| <table_uuid_1> | old_table_name | t_rename | null |
| <table_uuid_1> | new_table_name | t_rename | <current_table_schema> |

Compaction

A general piece of SQL wisdom is to avoid "infinitely-growing tables", because even the simplest queries eventually become intractable if they don't perfectly leverage indices. Supertables do grow without bound, because even deletes in the base table cause insertions in the supertable. This section describes a compaction scheme which 'forgets' history before a new "beginning of time" t_purge, while still honoring the Reconstruction Invariant for any timestamp after t_purge.

Can't simply drop all rows with logical_timestamp less than t_purge

Imagine the majority of the rows in the base table change every day, but one row was inserted at the beginning of time and never altered. We can never drop this particular row, because doing so would violate the reconstruction invariant: The row which was inserted at the beginning of time still exists in the base table.

Rather than blindly dropping all data before the purge timestamp, the following query will delete the maximum number of rows while preserving the Reconstruction Invariant for any time after t_purge.

WITH to_be_deleted AS (
  WITH ranked_by_age_within_key AS (
    SELECT synthetic_key, logical_timestamp, is_deleted,
      ROW_NUMBER() OVER (partition by synthetic_key
        order by logical_timestamp desc) AS rank_in_key
    FROM <table_uuid>

     -- Only consider candidates from before the purge timestamp
    WHERE logical_timestamp < $purge_timestamp
  )
  SELECT synthetic_key, logical_timestamp
  FROM ranked_by_age_within_key
  WHERE
    -- "delete everything except the newest version of the item as of the purge time"
    rank_in_key > 1
    -- If the item is fully deleted by this time, it's safe to drop it completely.
    OR is_deleted = true
)
DELETE FROM <table_uuid>
WHERE (synthetic_key, logical_timestamp) IN (SELECT synthetic_key, logical_timestamp FROM to_be_deleted);

Rationale and alternatives

Why store semi-redundant copies of data every time the data is updated?

It's true that the storage cost of this scheme is high and would grow unbounded until compaction. However, storing semi-redundant copies of the data is the only way to offer a fully consistent view of the past with rich SQL query-ability.

A transaction-log-only approach, or one that stores incremental changes rather than full copies, could confer the ability to do point queries for the past state of an individual row (by dynamically replaying the transaction log for a given synthetic key). But running a SQL query against a consistent past point-in-time would be completely infeasible.

Because Workers require the richer feature-set of analytical SQL queries against the past state of the world, we are forced to swallow the cost of semi-redundant duplicated data. However, truncation and exporting old data to cold storage could alleviate this problem.

Postgres, DuckDB, Snowflake, or another DB?

Based on the following analysis, the decision was made to use DuckDB.

  • Postgres

    • Pros
      • Well-understood with good async Rust library
      • Generally great feature set
    • Cons
      • Row-oriented storage layout makes queries like ROW_NUMBER() OVER (partition by ...) slow
      • Shared physical resources within a Postgres cluster
  • DuckDB

    • Pros
      • Columnar format and vectorized queries should be conducive to our query access pattern.
      • Embedded in-process runtime -- low overhead / no network calls / very high performance
      • "Full SQL support", uses PostgresSQL parser, with great Parquet integration
    • Cons
      • Relatively young technology (2018) & <200 users of the duckdb-rs crate
      • Shares the physical disk resources of the history-server docker container (but 1 separate storage file per tenant database)
  • Snowflake

    • Pros
      • Infinite S3 distributed storage warehouse, scalable compute with EC2
      • Columnar format
    • Cons
      • Monetarily expensive! $$$
        • Pay for compute upon every insertion or query -- the warehouse would rarely have a chance to shut down.
      • Substantial external dependencies
      • Network traffic latency
      • Not designed for individual insertions; strongly prefers bulk insertion. There is often a delay between a write and its readability.

Future possibilities

Rich & Queryable Database Audit Logs

db-server currently stores a transaction log of all base tables. But a bare transaction log entry conveys only a limited amount of information: it does not store the pre-image of a change, nor does it allow for running analytical queries at a prior point in time.

Suppose a user browses the transaction log and finds a transaction they didn't expect. Using the history database, they can run arbitrary queries on the state of the database at the time of the unexpected transaction.

Export parquet to S3 - Backups & Cold Tier

As mentioned above, infinitely-growing SQL tables will always eventually become a CPU/memory/performance issue. To address this, one could imagine moving truncated data to S3 rather than fully destroying it. It could then be loaded on demand as needed. DuckDB has great support for Parquet integration, which makes exporting and loading data easy.

This provides value in two ways:

  1. Solve the infinite data growth problem by moving pruned parts of the history database to cold tier. This enables compaction without sacrificing queryability of old data.

  2. Materialize point-in-time backups for all tenant DBs. The history database can be used to generate arbitrary point-in-time backups in parquet format, which the user can keep for compliance purposes or ransomware protection.