v7.1 好像ddl卡住了?BUG吗

tidb上的日志就我上面发的那些, 还需要看什么的日志?

tidb、tikv、pd都看下,有可能卡住了。

1 个赞

抽查了下,好像只有1个tidb server报这些异常,一直在刷这些日志信息。
刚开始这个异常是在另一个tidb server上,我重启了后,异常跳到这个tidb server上了。

[2025/04/01 11:36:29.267 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.290 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.312 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.335 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.358 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.380 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.403 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.426 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.448 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.471 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.493 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.515 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.538 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.561 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.583 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.598 +08:00] [INFO] [coprocessor.go:1288] ["[TIME_COP_PROCESS] resp_time:913.781633ms txnStartTS:18446744073709551615 region_id:6625142 store_addr:192.168.241.71:20160 kv_process_ms:870 kv_wait_ms:0 kv_read_ms:0 processed_versions:28436 total_versions:28437 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:56871 rocksdb_cache_hit_count:18 rocksdb_read_count:1121 rocksdb_read_byte:15157605"]
[2025/04/01 11:36:29.598 +08:00] [INFO] [region_request.go:1467] ["throwing pseudo region error due to no replica available"] [req-ts=18446744073709551615] [req-type=Cop] [region="{ region id: 6626649, ver: 145666, confVer: 286085 }"] [region-is-valid=unknown] [retry-times=0] [replica-read-type=leader] [replica-selector-state=nil] [stale-read=false] [replica-status=] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=]
[2025/04/01 11:36:29.606 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.628 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.651 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.674 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.696 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.719 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.742 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.764 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.803 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.826 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.848 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.870 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.893 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.916 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.938 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.961 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:29.984 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.008 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.031 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.053 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.075 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.099 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.120 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.143 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.165 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.187 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.210 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.233 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]
[2025/04/01 11:36:30.256 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7aa68da4-4a9b-464d-8ac8-79eed371e10f"] ["ddl job id"=2444] [ver=3328]

我也遇到了,怎么解决呢

1 个赞

192.168.241.85, port 4000,这个节点重启过不

1 个赞

kill了重跑

1 个赞

如果测试环境可以重启下,看看。

1 个赞

没有,我重启下试试

重启这个tidb server就好了~ 那个ddl 没了~

这个问题只能通过重启tidb server解决吗有没有其他办法?
这是个BUG吗 在哪个版本中得到了修复?
这俩问题谁了解呀

你用的版本已经比较新了,基本不太可能是已知 BUG。
报错的地方在这里

简单说就是某些 tidb-server 没有拿到最新的 ddl version 。这个 是从 PD 获取的。
目前的解决方案只能是重启这个有问题的 tidb 实例。如果想查 root cause,需要去分析那个异常的 tidb 实例的日志。

1 个赞

tidb server负责DDL操作,如果不能完成,或者是队列过多,或者tidb server异常

1 个赞

通过重启tidb server后,再次执行ddl 还是会卡住~我这问题好像是必现了
查看了下日志,好像有很多这种错误,不知道我这卡DDL跟这有什么关系?
还有auto analyze 这个能关了吗

[2025/04/03 09:56:10.208 +08:00] [INFO] [server.go:855] [kill] [conn=991431833788874755] [query=true]
[2025/04/03 09:56:10.308 +08:00] [WARN] [expensivequery.go:77] ["auto analyze timeout, kill it"] [costTime=12h0m31.144393171s] [maxAutoAnalyzeTime=12h0m0s] [processInfo="{id:991431833788874755, user:, host:
, db:<nil>, command:Sleep, time:43231, state:autocommit, info:analyze table `bsppr`.`xpost`}"]
[2025/04/03 09:56:10.308 +08:00] [INFO] [server.go:855] [kill] [conn=991431833788874755] [query=true]
[2025/04/03 09:56:10.385 +08:00] [WARN] [coprocessor.go:431] ["buildCopTasks takes too much time"] [elapsed=1.192837679s] ["range len"=1] ["task len"=8508]
[2025/04/03 09:56:10.386 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[tikv:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.389 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2025/04/02 21:55:39.284 +08:00] ["end tim
e"=2025/04/03 09:56:10.386 +08:00] [cost=12h0m31.10162098s] ["sample rate reason"=]
[2025/04/03 09:56:10.389 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.393 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_fid_url"] ["start time"=2025/04/03 09:55:40.040 +08:00]
 ["end time"=2025/04/03 09:56:10.389 +08:00] [cost=30.349265636s] ["sample rate reason"=]
[2025/04/03 09:56:10.393 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.396 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index postid"] ["start time"=2025/04/03 09:55:41.434 +08:00] ["en
d time"=2025/04/03 09:56:10.393 +08:00] [cost=28.958920873s] ["sample rate reason"=]
[2025/04/03 09:56:10.396 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.399 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_url"] ["start time"=2025/04/03 09:55:43.087 +08:0
0] ["end time"=2025/04/03 09:56:10.396 +08:00] [cost=27.308783492s] ["sample rate reason"=]
[2025/04/03 09:56:10.399 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.402 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_entryid"] ["start time"=2025/04/03 09:55:44.918 +
08:00] ["end time"=2025/04/03 09:56:10.399 +08:00] [cost=25.480690474s] ["sample rate reason"=]
[2025/04/03 09:56:10.402 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.406 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_type"] ["start time"=2025/04/03 09:55:46.431 +08:
00] ["end time"=2025/04/03 09:56:10.402 +08:00] [cost=23.970762222s] ["sample rate reason"=]
[2025/04/03 09:56:10.406 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.408 +08:00] [WARN] [expensivequery.go:77] ["auto analyze timeout, kill it"] [costTime=12h0m31.244495254s] [maxAutoAnalyzeTime=12h0m0s] [processInfo="{id:991431833788874755, user:, host:
, db:<nil>, command:Sleep, time:43231, state:autocommit, info:analyze table `bsppr`.`xpost`}"]
[2025/04/03 09:56:10.408 +08:00] [INFO] [server.go:855] [kill] [conn=991431833788874755] [query=true]
[2025/04/03 09:56:10.409 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_sourcetype"] ["start time"=2025/04/03 09:55:47.99
5 +08:00] ["end time"=2025/04/03 09:56:10.406 +08:00] [cost=22.411129064s] ["sample rate reason"=]
[2025/04/03 09:56:10.409 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.413 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_pt"] ["start time"=2025/04/03 09:55:49.459 +08:00
] ["end time"=2025/04/03 09:56:10.409 +08:00] [cost=20.950650292s] ["sample rate reason"=]
[2025/04/03 09:56:10.413 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.417 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_source"] ["start time"=2025/04/03 09:55:51.200 +0
8:00] ["end time"=2025/04/03 09:56:10.413 +08:00] [cost=19.213033367s] ["sample rate reason"=]
[2025/04/03 09:56:10.417 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.420 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_fid"] ["start time"=2025/04/03 09:55:52.666 +08:00] ["e
nd time"=2025/04/03 09:56:10.417 +08:00] [cost=17.750374293s] ["sample rate reason"=]
[2025/04/03 09:56:10.420 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.423 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_xpost_hidden"] ["start time"=2025/04/03 09:55:54.034 +0
8:00] ["end time"=2025/04/03 09:56:10.420 +08:00] [cost=16.38608797s] ["sample rate reason"=]
[2025/04/03 09:56:10.423 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.427 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_tidb_in_time_posttime"] ["start time"=2025/04/03 09:55:
55.606 +08:00] ["end time"=2025/04/03 09:56:10.423 +08:00] [cost=14.817462958s] ["sample rate reason"=]
[2025/04/03 09:56:10.427 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.430 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_search_by_includetime"] ["start time"=2025/04/03 09:55:
57.166 +08:00] ["end time"=2025/04/03 09:56:10.427 +08:00] [cost=13.261317422s] ["sample rate reason"=]
[2025/04/03 09:56:10.430 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.434 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_search_by_posttime"] ["start time"=2025/04/03 09:55:58.
488 +08:00] ["end time"=2025/04/03 09:56:10.430 +08:00] [cost=11.941732149s] ["sample rate reason"=]
[2025/04/03 09:56:10.434 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.437 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_url_parent_url"] ["start time"=2025/04/03 09:55:59.734 
+08:00] ["end time"=2025/04/03 09:56:10.434 +08:00] [cost=10.69930269s] ["sample rate reason"=]
[2025/04/03 09:56:10.437 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.441 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_update_time"] ["start time"=2025/04/03 09:56:01.535 +08
:00] ["end time"=2025/04/03 09:56:10.437 +08:00] [cost=8.901658118s] ["sample rate reason"=]
[2025/04/03 09:56:10.441 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.444 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_facetid_update_time"] ["start time"=2025/04/03 09:56:04
.542 +08:00] ["end time"=2025/04/03 09:56:10.441 +08:00] [cost=5.898334456s] ["sample rate reason"=]
[2025/04/03 09:56:10.444 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.448 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_search_personal_by_posttime"] ["start time"=2025/04/03 
09:56:06.703 +08:00] ["end time"=2025/04/03 09:56:10.444 +08:00] [cost=3.74148036s] ["sample rate reason"=]
[2025/04/03 09:56:10.448 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.451 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_personal_domain"] ["start time"=2025/04/03 09:56:07.854
 +08:00] ["end time"=2025/04/03 09:56:10.448 +08:00] [cost=2.593572507s] ["sample rate reason"=]
[2025/04/03 09:56:10.451 +08:00] [ERROR] [analyze.go:321] ["analyze failed"] [error="[executor:1317]Query execution was interrupted"]
[2025/04/03 09:56:10.455 +08:00] [INFO] [analyze.go:600] ["analyze table `bsppr`.`xpost` has failed"] [partition=] ["job info"="auto analyze index idx_search_personal_by_includetime"] ["start time"=2025/04/
03 09:56:09.188 +08:00] ["end time"=2025/04/03 09:56:10.451 +08:00] [cost=1.262966601s] ["sample rate reason"=]
[2025/04/03 09:56:10.455 +08:00] [INFO] [tidb.go:285] ["rollbackTxn called due to ddl/autocommit failure"]
[2025/04/03 09:56:10.455 +08:00] [WARN] [session.go:2283] ["run statement failed"] [schemaVersion=3372] [error="[executor:1317]Query execution was interrupted"] [session="{\n  \"currDBName\": \"\",\n  \"id\
": 991431833788874755,\n  \"status\": 2,\n  \"strictMode\": true,\n  \"user\": null\n}"]
[2025/04/03 09:56:10.455 +08:00] [ERROR] [update.go:1307] ["[stats] auto analyze failed"] [sql="analyze table `bsppr`.`xpost`"] [cost_time=12h0m31.291076475s] [error="[executor:1317]Query execution was inte
rrupted"]
[2025/04/03 09:56:10.457 +08:00] [INFO] [update.go:1181] ["[stats] auto analyze triggered"] [sql="analyze table `bsppr`.`xpost`"] [reason="too many modifications(23304517196/7.347925096e+09>0.5)"]
[2025/04/03 09:56:28.555 +08:00] [INFO] [region_request.go:1467] ["send request meet region error without retry"] [req-ts=457086176714293285] [req-type=Prewrite] [region="{ region id: 24404086, ver: 113236,
 confVer: 203776 }"] [region-is-valid=false] [retry-times=0] [replica-read-type=leader] [replica-selector-state=accessKnownLeader] [stale-read=false] [replica-status="peer: 32192895, store: 2, isEpochStale:
 false, attempts: 0, replica-epoch: 1, store-epoch: 1, store-state: resolved, store-liveness-state: reachable; peer: 32192956, store: 2949399, isEpochStale: false, attempts: 0, replica-epoch: 1, store-epoch
: 1, store-state: resolved, store-liveness-state: reachable; peer: 32193037, store: 6124709, isEpochStale: false, attempts: 1, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: 
reachable"] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=region_not_found:1]

这是卡住的节点的日志吗?正常来说 analyze 不影响 DDL 。日志依然还是报 ”syncer check all versions, someone is not synced“?

1 个赞

是的,卡住DDL后,有个tidb节点就一直在刷这些日志
我上面发的auto analyze的错误日志,是前面找到的错误日志,不知道跟这有没关系

[2025/04/03 10:31:28.711 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.733 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.756 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.777 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.800 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.822 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.845 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.849 +08:00] [INFO] [coprocessor.go:1288] ["[TIME_COP_PROCESS] resp_time:1.159124276s txnStartTS:18446744073709551615 region_id:5221603 store_addr:192.168.241.83:20160 kv_process_ms:1114 kv_wait_ms:0 kv_read_ms:0 processed_versions:22934 total_versions:22935 rocksdb_delete_skipped_count:0 rocksdb_key_skipped_count:51331 rocksdb_cache_hit_count:19 rocksdb_read_count:1378 rocksdb_read_byte:17432054"]
[2025/04/03 10:31:28.849 +08:00] [INFO] [region_request.go:1467] ["throwing pseudo region error due to no replica available"] [req-ts=18446744073709551615] [req-type=Cop] [region="{ region id: 5222439, ver: 136509, confVer: 216731 }"] [region-is-valid=unknown] [retry-times=0] [replica-read-type=leader] [replica-selector-state=nil] [stale-read=false] [replica-status=] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=]
[2025/04/03 10:31:28.868 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.890 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.912 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]
[2025/04/03 10:31:28.935 +08:00] [INFO] [syncer.go:362] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.241.85, port 4000, id 7437fbe2-6533-4d58-80ec-9028800c06c2"] ["ddl job id"=2462] [ver=3374]

可以搜一下 关键字

all_schema_by_job_versions

如果方便的话可以拿一下 192.168.241.85, port 4000 的 tidb.log

1 个赞

没有过滤到关键词:all_schema_by_job_versions

tidb-85.log (19.5 MB)
日志太大了,我截取了一部分日志,辛苦大佬了

[2025/04/03 10:09:52.531 +08:00] [INFO] [session.go:4353] [“old running transaction block DDL”] [“table ID”=2005] [jobID=2462] [“connection ID”=4378508189508792259] [“elapsed time”=17h31m21.864749904s]
有涉及 [“table ID”=2005] 的表的长事务存在,所以阻塞了 DDL [jobID=2462] 执行。

过滤下日志根据 connection ID 看看这些会话怎么处理吧

1 个赞

grep “old running transaction” tidb-85.log|awk -F “connection ID"=” ‘{print $2}’|awk -F “]” ‘{print $1}’|sort -n|uniq -c

3 4378508189505259527
5 4378508189505260273
1 4378508189505263845
3 4378508189505263853
1 4378508189505264353
4 4378508189505265945
2 4378508189505269031
8 4378508189505294529
2 4378508189505416665
3 4378508189505485209
2 4378508189505679265
1 4378508189505683483
2 4378508189505683553
3 4378508189505687255
2 4378508189505687277
2 4378508189505687291
2 4378508189508767587
3 4378508189508767591
5 4378508189508792257
8 4378508189508792259
3 4378508189508792261
1 4378508189508792263
2 4378508189508792265

1 个赞