TiCDC failure

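【TiDB environment】Production
【TiDB version】v5.4.1
【Steps to reproduce】
Last night I created a table, added columns to a table, wrote data, created a view, and so on; after that, CDC kept failing.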

Table creation (for safety, I changed the column names, table name, and comments; nothing else was changed):

CREATE TABLE dim.`1111_vest` (
  `id` int(11) NOT NULL COMMENT 'primary key',
  `1111_id` int(11) DEFAULT NULL COMMENT 'id',
  `1111_name` varchar(48) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `1111_name` varchar(100) DEFAULT NULL COMMENT ')',
  `qy_wx_id` varchar(48) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `1111_phone` varchar(24) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `1111_dep` varchar(100) DEFAULT NULL COMMENT ')',
  `owner_name` varchar(100) DEFAULT NULL COMMENT '',
  `owner_job_id` varchar(100) DEFAULT NULL COMMENT '',
  `owner_dep` varchar(100) DEFAULT NULL COMMENT '',
  `consult_id` int(11) DEFAULT NULL COMMENT '',
  `owner_222_id` int(11) DEFAULT NULL COMMENT '',
  `owner_333_id` int(11) DEFAULT NULL COMMENT 'id',
  `owner_333_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `owner_333_222_id` int(11) DEFAULT NULL COMMENT 'id,',
  `owner_333_222_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `owner_333_center_id` int(11) DEFAULT NULL COMMENT 'id,',
  `owner_333_center_name` varchar(200) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '',
  `dddd_type` int(11) NOT NULL DEFAULT '0' COMMENT ' ',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

Adding columns:

alter table dim.dim_2222_user add column `555_333_center_name` varchar(200) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_333_center_id` int(11) DEFAULT NULL COMMENT ',id' after consult_id;
alter table dim.dim_2222_user add column `555_333_111_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_333_111_id` int(11) DEFAULT NULL COMMENT ',id' after consult_id;
alter table dim.dim_2222_user add column `555_333_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_333_id` int(11) DEFAULT NULL COMMENT 'id,' after consult_id;
alter table dim.dim_2222_user add column `555_111_name` varchar(45) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_job_number` varchar(32) COLLATE utf8mb4_general_ci DEFAULT '' COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_name` varchar(45) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `555_111_id` int(11) DEFAULT NULL COMMENT ' id' after consult_id;
alter table dim.dim_2222_user add column `555_consult_id` int(11) DEFAULT NULL COMMENT 'id' after consult_id;
alter table dim.dim_2222_user add column `000_333_center_name` varchar(200) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `000_333_center_id` int(11) DEFAULT NULL COMMENT 'id,id' after consult_id;
alter table dim.dim_2222_user add column `000_333_111_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `000_333_111_id` int(11) DEFAULT NULL COMMENT 'id,' after consult_id;
alter table dim.dim_2222_user add column `000_333_name` varchar(50) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '' after consult_id;
alter table dim.dim_2222_user add column `000_333_id` int(11) DEFAULT NULL COMMENT 'id,' after consult_id;
alter table dim.dim_2222_user add column `000_111_id` int(11) DEFAULT NULL COMMENT ' corresponding id' after consult_id;

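Then CDC stopped replicating: the three CDC instances went down one after another and replication halted.
So I stopped the writes to the tables that were created and altered last night,
and narrowed the problem down to the replication of dim.*. I then deleted and re-created the CDC replication task,

but it reported an error every time it was started.

This went on until the afternoon, when I started it again and it worked :dotted_line_face:

In the meantime I had stopped, deleted, and re-created the CDC replication task many times.

I am completely baffled.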

【Attachments: screenshots/logs/monitoring】

Errors found in the log:

[2024/06/27 23:43:27.459 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
[2024/06/27 23:43:27.459 +08:00] [INFO] [client.go:666] ["[pd] exit tso dispatcher"] [dc-location=global]
[2024/06/27 23:43:27.459 +08:00] [INFO] [capture.go:323] ["the processor routine has exited"] [error="[CDC:ErrPDEtcdAPIError]etcd api call error: context canceled"] [errorVerbose="[CDC:ErrPDEtcdAPIError]etcd api call error: context canceled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/errors.WrapError\n\tgithub.com/pingcap/tiflow/pkg/errors/helper.go:30\ngithub.com/pingcap/tiflow/cdc/capture.(*Capture).runEtcdWorker\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:476\ngithub.com/pingcap/tiflow/cdc/capture.(*Capture).run.func3\n\tgithub.com/pingcap/tiflow/cdc/capture/capture.go:322\nruntime.goexit\n\truntime/asm_amd64.s:1371"]
[2024/06/27 23:43:27.459 +08:00] [INFO] [acquirer.go:72] ["TimeAcquirer exit"]
[2024/06/27 23:43:27.459 +08:00] [INFO] [client.go:234] ["WatchWithChan exited"] [role=processor]
[2024/06/27 23:43:27.459 +08:00] [INFO] [capture.go:299] ["the owner routine has exited"] []
[2024/06/27 23:43:27.459 +08:00] [ERROR] [http_status.go:74] ["http server error"] [error="[CDC:ErrServeHTTP]serve http error: mux: server closed"] [errorVerbose="[CDC:ErrServeHTTP]serve http error: mux: server closed\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/errors.WrapError\n\tgithub.com/pingcap/tiflow/pkg/errors/helper.go:30\ngithub.com/pingcap/tiflow/cdc.(*Server).startStatusHTTP.func1\n\tgithub.com/pingcap/tiflow/cdc/http_status.go:74\nruntime.goexit\n\truntime/asm_amd64.s:1371"]
[2024/06/27 23:43:27.460 +08:00] [INFO] [capture.go:257] ["capture recovered"] [capture-id=99bbe3c0-fe7d-47f4-8619-684e223a4d7a]
[2024/06/27 23:43:27.460 +08:00] [INFO] [capture.go:234] ["the capture routine has exited"]
[2024/06/27 23:43:27.460 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
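
For a log as large as the attached cdc.log, it can help to count the ERROR lines by source location before reading individual stack traces. The following is a minimal Python sketch, assuming every line follows the `[time] [LEVEL] [file:line] ["message"]` layout seen in the excerpt above. The names `LOG_LINE`, `summarize_errors`, and `SAMPLE` are hypothetical and are not part of TiCDC.

```python
import re
from collections import Counter

# Leading fields of a log line in the layout shown in the excerpt:
# [timestamp] [LEVEL] [file.go:line] ["message"] [optional fields...]
LOG_LINE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] '
    r'\["(?P<msg>(?:[^"\\]|\\.)*)"\]'
)


def summarize_errors(lines, levels=("ERROR", "FATAL", "PANIC")):
    """Count log lines at the given levels, keyed by (source, message)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("level") in levels:
            counts[(m.group("src"), m.group("msg"))] += 1
    return counts


# Three lines copied from the excerpt in this post (two ERROR, one INFO).
SAMPLE = [
    r'[2024/06/27 23:43:27.459 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]',
    r'[2024/06/27 23:43:27.459 +08:00] [INFO] [client.go:666] ["[pd] exit tso dispatcher"] [dc-location=global]',
    r'[2024/06/27 23:43:27.460 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]',
]

for (src, msg), n in summarize_errors(SAMPLE).most_common():
    print(n, src, msg)
```

To scan a whole file, an open file object can be passed in place of `SAMPLE`, since the function only iterates over lines. Lines that do not match the assumed layout are skipped silently.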

One note: our CDC task replicates all tables in the dim database.

https://github.com/pingcap/tidb/issues/31335

This was fixed in 6.x. Please upgrade to 6.x or later.


Turn off this SQL.

cdc.log (3.1 MB)
Log from the test environment