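【TiDB Environment】Production
【TiDB Version】v7.5.3
【Reproduction Path】After enabling automatic deletion (TTL), the CDC task that replicates to Kafka gets stuck.
【Problem: Symptoms and Impact】
This business cluster has a heavy write load, roughly 150 to 200 million writes per day. The business only needs to keep the most recent 14 days of data, so automatic deletion (TTL) was enabled. Starting last night, however, the CDC task gets stuck once the auto-delete job starts.
I retried several times today: whenever auto-delete is turned on, CDC gets stuck.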
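For readers unfamiliar with the switch in question: TiDB exposes TTL at two levels, and this post does not say which one was toggled during the retries. The statements below are a sketch of both, with the table name `feature_trace` taken from the DDL at the end of this post. Treat the choice between them as an assumption.

```sql
-- Cluster-wide switch: stops or resumes scheduling of all TTL jobs.
SET GLOBAL tidb_ttl_job_enable = OFF;
SET GLOBAL tidb_ttl_job_enable = ON;

-- Per-table switch: only affects TTL on this one table.
ALTER TABLE `feature_trace` TTL_ENABLE = 'OFF';
ALTER TABLE `feature_trace` TTL_ENABLE = 'ON';
```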
The CDC task was created as follows:
ctl:v7.5.3 cdc changefeed create --server=http://xxxxxxx:8322 \
  --sink-uri="kafka://xxxxx:9092/tidb-naspam?partition-num=6&max-message-bytes=10485760&compression=lz4&replication-factor=1" \
  --config=/home/tidb/cdctoml/cdc-tidb-naspam/cdc-naspam-kafka.toml \
  --changefeed-id=cdc-naspam-kafka6 --sort-engine="unified"
The CDC task configuration file is as follows:
case-sensitive = true
[filter]
ignore-txn-start-ts = [1, 2]
rules = ['xxxxx_trace.*','aiwriting.*','spam_xxxx.*']
[mounter]
worker-num = 32
[sink]
dispatchers = [
{matcher = ['aiwriting.*'],partition = "index-value"},
{matcher = ['spam_trace.*'],partition = "index-value"},
]
protocol = "canal-json"
The variables related to the auto-delete (TTL) job are as follows:
MySQL [(none)]> show variables like '%ttl%';
+-----------------------------------------+-------------+
| Variable_name                           | Value       |
+-----------------------------------------+-------------+
| log_throttle_queries_not_using_indexes  | 0           |
| tidb_mpp_store_fail_ttl                 | 60s         |
| tidb_ttl_delete_batch_size              | 200         |
| tidb_ttl_delete_rate_limit              | 0           |
| tidb_ttl_delete_worker_count            | 2           |
| tidb_ttl_job_enable                     | ON          |
| tidb_ttl_job_schedule_window_end_time   | 17:23 +0800 |
| tidb_ttl_job_schedule_window_start_time | 21:33 +0800 |
| tidb_ttl_running_tasks                  | -1          |
| tidb_ttl_scan_batch_size                | 200         |
| tidb_ttl_scan_worker_count              | 2           |
+-----------------------------------------+-------------+
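To put rough numbers on the delete load implied by these settings, the sketch below does two things. The first query reads recent TTL job results for the table; it assumes the `mysql.tidb_ttl_job_history` system table is available in this version with the columns named here. The second is a back-of-envelope estimate that assumes a steady state in which TTL removes about one day's worth of writes (150 to 200 million rows) per day, spread evenly over the schedule window. That works out to roughly 2,100 to 2,800 deleted rows per second, each of which TiCDC also has to replicate as a delete event. Note that `tidb_ttl_delete_rate_limit = 0` means TTL deletes are not rate-limited.

```sql
-- Recent TTL jobs for the table: rows found expired vs. rows actually deleted.
SELECT job_id, create_time, finish_time,
       expired_rows, deleted_rows, error_delete_rows, status
FROM mysql.tidb_ttl_job_history
WHERE table_name = 'feature_trace'
ORDER BY create_time DESC
LIMIT 10;

-- Schedule window 21:33 -> 17:23 the next day = 19 h 50 min = 71400 s.
-- Assumption: one day's writes are deleted evenly across that window.
SELECT 150000000 / 71400 AS low_rows_per_sec,
       200000000 / 71400 AS high_rows_per_sec;
```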
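Related monitoring is as follows. For the previous week everything ran normally: CPU load on the CDC nodes was somewhat high, but the CDC task never got stuck or lagged. The replication stall started last night.
The figure below shows the CDC status after re-creating the task several times today and then turning auto-delete on.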
cdc.tar.gz (9.0 MB)
Create Table: CREATE TABLE `feature_trace` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  `cid` bigint(20) unsigned NOT NULL DEFAULT '0' COMMENT 'data id',
  `trace` json DEFAULT NULL COMMENT 'feature computation trace',
  `cost` bigint(20) unsigned NOT NULL DEFAULT '0' COMMENT 'computation time cost',
  `ext_pack` json DEFAULT NULL COMMENT 'extended json data',
  `create_time` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'creation time',
  `create_at` timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'data write time',
  `version_id` bigint(20) NOT NULL DEFAULT '0' COMMENT 'version ID',
  PRIMARY KEY (`id`) /*T![clustered_index] NONCLUSTERED */,
  KEY `idx_uid` (`uid`),
  KEY `idx_cid` (`cid`),
  KEY `idx_strategy` (`strategy_id`),
  KEY `idx_cmd` (`command_no`),
  KEY `idx_un` (`uniq_id`,`attach_id`,`ext_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=25764511301 COMMENT='feature trace table' /*T![ttl] TTL=`create_at` + INTERVAL 14 DAY */ /*T![ttl] TTL_ENABLE='ON' */ /*T![ttl] TTL_JOB_INTERVAL='1h' */