Upgrading TiDB v6 to v7: tidb-server node fails to start with error Table 'mysql.tidb_runaway_watch' doesn't exist

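【TiDB environment】Test
【TiDB version】Upgrading from v6.1.5 to v7.5.4
【Steps to reproduce】
Upgraded with tiup. No DDL statements and no large queries were running before the upgrade.
Take a full backup with BR before upgrading!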
tiup update --self && tiup update cluster
tiup cluster check tidb-test --cluster
tiup cluster upgrade tidb-test v7.5.4
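
The BR full backup mentioned in the steps is not shown as a command in this post. A minimal sketch of what it might look like is below. The PD address and storage URI are placeholders (assumptions, not values from this cluster), and the function only prints the command unless RUN=1 is set, so it can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch only: PD_ADDR and BACKUP_URI are hypothetical placeholders,
# not values taken from this thread.
PD_ADDR="${PD_ADDR:-127.0.0.1:2379}"
BACKUP_URI="${BACKUP_URI:-local:///data/backup/full-before-upgrade}"

# Print the BR full-backup command by default; run it only when RUN=1.
br_backup() {
  if [ "${RUN:-0}" = "1" ]; then
    tiup br backup full --pd "$PD_ADDR" --storage "$BACKUP_URI"
  else
    echo "tiup br backup full --pd $PD_ADDR --storage $BACKUP_URI"
  fi
}
```

Calling `br_backup` prints the command for review, and `RUN=1 br_backup` executes it. The backup would need to finish before `tiup cluster upgrade` is started.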

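The tidb-server node failed to restart; all other components upgraded successfully. The old v6.1.5 version does not have this system table.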

{"code": 1, "error": "failed to restart: 10.xxxx tidb-4000.service, please check the instance's log(/data/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s", "errorVerbose": "timed out waiting for port 4000 to be started after 2m

[Error]: (1146, "Table 'mysql.tidb_runaway_watch' doesn't exist")

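【Troubleshooting】
Following an earlier community thread about Table 'mysql.tidb_runaway_watch' doesn't exist, I added the two parameters below, but the tidb-server node still failed to start: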

    performance.cross-join: true
    status.record-db-qps: false

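With help from Jun and Zhen of the official technical team, tidb-server could be started with the old v6.1.5 binary. (Packages of earlier versions are kept under .tiup/storage/cluster/packages/ in the tiup directory. When I migrated tiup in the past, I noticed this packages directory was the largest one, at several GB.)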

ll -h /root/.tiup/storage/cluster/packages/tidb-v6.1.5-linux-amd64.tar.gz
cp /root/.tiup/storage/cluster/packages/tidb-v6.1.5-linux-amd64.tar.gz /tmp

cd /tmp
tar -zxf tidb-v6.1.5-linux-amd64.tar.gz

mv /data/tidb-deploy/tidb-4000/bin/tidb-server /data/tidb-deploy/tidb-4000/bin/tidb-server_bak754
cp ./tidb-server /data/tidb-deploy/tidb-4000/bin/
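
The backup-and-replace step above can be wrapped in a small helper so that an existing backup is never overwritten by accident. This is only a sketch: `swap_binary` is a hypothetical helper name, not something tiup provides.

```shell
#!/usr/bin/env bash
# swap_binary BIN_PATH NEW_BINARY BACKUP_SUFFIX
# Keeps the current binary as BIN_PATH+BACKUP_SUFFIX, then installs
# NEW_BINARY in its place. Hypothetical helper, not part of tiup.
swap_binary() {
  local bin_path="$1" new_binary="$2" suffix="$3"
  [ -f "$bin_path" ] || { echo "missing: $bin_path" >&2; return 1; }
  [ -f "$new_binary" ] || { echo "missing: $new_binary" >&2; return 1; }
  # Refuse to clobber a backup made by an earlier run.
  if [ -e "${bin_path}${suffix}" ]; then
    echo "backup already exists: ${bin_path}${suffix}" >&2
    return 1
  fi
  mv "$bin_path" "${bin_path}${suffix}" && cp -p "$new_binary" "$bin_path"
}
```

With the paths from this post, the call would be `swap_binary /data/tidb-deploy/tidb-4000/bin/tidb-server /tmp/tidb-server _bak754`.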

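Then I added the two settings below via edit-config and started tidb-server with the v7.5.4 binary again; it started normally. After resuming the upgrade, tidb-server came up normally, and display shows the cluster version is now v7.5.4.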

server_configs:
  tidb:
    tmp-dir: /data/tidb-deploy/tidb-4000/dcl_tmp
    tmp-storage-path: /data/tidb-deploy/tidb-4000/oom_tmp

tiup cluster replay gxCmk7CbmjL
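
Whether TiDB creates the two configured directories by itself is not established in this thread. If missing directories are suspected, a check along these lines could be run on each TiDB host before restarting. `check_dir` is a hypothetical helper, and it should be run as the same OS user that runs tidb-server.

```shell
#!/usr/bin/env bash
# check_dir PATH: create PATH if absent, then report whether the
# current user can write to it. Hypothetical helper for illustration.
check_dir() {
  local dir="$1"
  mkdir -p "$dir" || return 1
  if [ -w "$dir" ]; then
    echo "ok: $dir"
  else
    echo "not writable: $dir" >&2
    return 1
  fi
}
```

For the settings above, that would be `check_dir /data/tidb-deploy/tidb-4000/dcl_tmp` and `check_dir /data/tidb-deploy/tidb-4000/oom_tmp`.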


【Attachments: screenshots/logs/monitoring】
less /data/tidb-deploy/tidb-4000/log/tidb.log

[2024/11/08 15:45:59.293 +08:00] [ERROR] [runaway.go:145] ["try to get new runaway watch"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:45:59.293 +08:00] [WARN] [runaway.go:172] ["get runaway watch record failed"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:00.293 +08:00] [ERROR] [runaway.go:145] ["try to get new runaway watch"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:00.293 +08:00] [WARN] [runaway.go:172] ["get runaway watch record failed"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:01.293 +08:00] [ERROR] [runaway.go:145] ["try to get new runaway watch"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:01.293 +08:00] [WARN] [runaway.go:172] ["get runaway watch record failed"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:02.293 +08:00] [ERROR] [runaway.go:145] ["try to get new runaway watch"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:02.293 +08:00] [WARN] [runaway.go:172] ["get runaway watch record failed"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:03.293 +08:00] [ERROR] [runaway.go:145] ["try to get new runaway watch"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
[2024/11/08 15:46:03.293 +08:00] [WARN] [runaway.go:172] ["get runaway watch record failed"] [error="[schema:1146]Table 'mysql.tidb_runaway_watch' doesn't exist"]
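
The same pair of messages repeats every second, so a quick summary of the log can be easier to read than paging through it. A minimal sketch, assuming the log format shown above (the function names are made up for illustration):

```shell
#!/usr/bin/env bash
# Count log lines that report the missing system table.
count_runaway_errors() {
  grep -c "Table 'mysql.tidb_runaway_watch' doesn't exist" "$1"
}

# Group those lines by source location, e.g. "[runaway.go:145] 5".
summarize_runaway_errors() {
  grep "tidb_runaway_watch" "$1" \
    | grep -o '\[[a-z_]*\.go:[0-9]*]' \
    | sort | uniq -c | awk '{print $2, $1}'
}
```

For example, `count_runaway_errors /data/tidb-deploy/tidb-4000/log/tidb.log` prints the number of matching lines.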

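The upgrade probably failed because these tmp directories did not exist. Problems like this are usually hard to spot. Thanks to the original poster for sharing this troubleshooting experience.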

I ran into this problem too and solved it by creating the table manually.

Noting this down; we have an upgrade planned.

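I had done a v6.1.5 -> v7.5.4 upgrade once before without hitting this problem. I still recommend trying the upgrade first in a newly built, standalone test environment. The cluster that failed this time was a business test environment (limited impact, but it is best to take a full backup before upgrading), and several business teams kept asking when it would be available again :joy: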

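I checked another TiDB cluster that had already been upgraded to 7.5.4 successfully: its tidb-server nodes do not have these two directories.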

With multiple tidb-server nodes, this approach can be used.

Test and verify thoroughly, and take backups :face_with_peeking_eye:
