问题:
如何获得 TiDB 中事务的开始时间戳与提交时间戳?
我们的方案:
在 TiCDC 中添加了 log 语句,然后将 TiDB 作为上游,MySQL 作为下游,TiCDC 中的日志保存了事务的开始时间戳与提交时间戳。
我们的疑问:
1、上面获取事务的开始时间戳与提交时间戳的方法是否正确?
2、如何获取只读事务的开始时间戳与提交时间戳?
目前我们无法获取只读事务的开始时间戳与提交时间戳。根据 TiCDC 文档,只有发生数据变更时,上下游才会进行同步。
更详细的英文描述如下:
Our Problem
How to obtain the start-timestamp and commit-timestamp of transactions in TiDB?
Our Solution
Environment
- DB version: TiDB v7.3
- Driver: go-sql-driver/mysql
To obtain the start and commit timestamps of transactions
We run the optimistic transaction model of TiDB.
We use TiCDC to connect TiDB as the upstream and MySQL as the downstream. We modify TiCDC slightly to log the start and commit timestamps into a file.
Modification code: lines 214-220 of cdc/sinkv2/eventsink/txn/mysql/mysql.go
func convert2RowChanges(
...
) ... {
log.Info("CDC:RowChangeEvent",
zap.Int64("row-id", row.RowID),
zap.String("table-name", row.Table.Table),
zap.Uint64("start-ts", row.StartTs),
zap.Uint64("commit-ts", row.CommitTs),
zap.Any("columns", row.Columns),
zap.Any("pre-columns", row.PreColumns))
...
The log is as follows from which it is easy to extract the timestamps start-ts
and commit-ts
:
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=4] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":4,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3607,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=7] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":7,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3605,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=11] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":11,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3629,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=12] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":12,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3626,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=21] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":21,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3556,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=22] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":22,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3645,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=23] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":23,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3656,\"default\":null}]"] [pre-columns=null]
[2023/10/07 21:07:22.983 +08:00] [INFO] [mysql.go:214] [CDC:RowChangeEvent] [row-id=31] [table-name=txn0] [start-ts=444775541403025430] [commit-ts=444775541494775834] [columns="[{\"name\":\"k\",\"type\":3,\"charset\":\"binary\",\"flag\":27,\"value\":31,\"default\":null},{\"name\":\"v\",\"type\":3,\"charset\":\"binary\",\"flag\":1,\"value\":3586,\"default\":null}]"] [pre-columns=null]
Our Questions
-
Have we obtain the start and commit timestamps of TiDB successfully?
-
What about read-only transactions?
Currently we are not able to obtain the start and commit timestamps of read-only transactoins.
According to the docs on TiCDC (in Chinese),
it seems that the synchronization between the upstream and the downstream
is performed only when data is changed.
Thanks.
Motivation
We are working on a lightweight white-box checking algorithm of snapshot isolation (SI): given an execution of a database, to check whether it satisfies SI.
The SI checking problem is NP-hard for general executions.
So it is desirable to make use of the knowledge of how SI is actually implemented in databases.
The insight is that most databases, especially distributed databases, implement SI following the generic protocol using start-timestamps and commit-timestamps.
With these timestamps of transactions in an execution, the SI checking problem becomes solvable in polynomial time. Therefore, we want to obtain these timestamps when generating executions.
It is crucial for us to really understand the meaning and roles of the start-timestamps and commit-timestamps in the database under testing.
We must be very sure that we have obtained the right timestamps in the right way.
That is why we ask for help here.
Background
We are digging into the implementation of snapshot isolation of TiDB, especially into the use of timestamps in transactions.
Consider the classic description of start-timestamp and commit-timestamp in implementing Snapshot Isolation,
quoted from the paper; Section 4.2:
For start-timestamp: A transaction executing with Snapshot Isolation always reads data from a snapshot of the (committed) data as of the time the transaction started, called its Start-Timestamp. This time may be any time before the transaction’s first Read.
For commit-timestamp: When the transaction
T1
is ready to commit, it gets a
Commit-Timestamp, which is larger than any existing Start-Timestamp or Commit-Timestamp.
WhenT1
commits, its changes become visible to all
transactions whose Start-Timestamps are larger thanT1
’s Commit-Timestamp.
For conflict detection:
The transactionT1
successfully commits only if no other transactionT2
with a
Commit-Timestamp inT1
’s execution interval [Start-Timestamp, Commit-Timestamp] wrote data thatT1
also wrote. Otherwise,T1
will abort.
This feature, called First-committer-wins prevents lost updates.