After a forced TiFlash scale-in, regions stuck in the Down state cannot be cleaned up, and the newly scaled-out TiFlash nodes cannot replicate data

[TiDB Environment] PoC
[TiDB Version] 6.5.3
[Reproduction Steps] Ran scale-in --force on the TiFlash nodes,
then ran scale-out and found that the TiFlash tables do not replicate.

[Problem: Symptoms and Impact] The scaled-in nodes no longer show up in tiup display, but pd-ctl store still lists them as Offline.
Many regions are in the Down state.
[Resources] 4 x 96-core Kunpeng (aarch64), 512 GB RAM, NVMe SSD
[Attachments: screenshots/logs/monitoring]
Cluster status:
Starting component cluster: /root/.tiup/components/cluster/v1.12.4/tiup-cluster display erptidb
Cluster type: tidb
Cluster name: erptidb
Cluster version: v6.5.3
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.255.119:2379/dashboard
Grafana URL: http://192.168.255.121:3000
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir


192.168.255.121:9093 alertmanager 192.168.255.121 9093/9094 linux/aarch64 Up /nvme00/tidb/alertmanager-9093 /deploy/tidb/alertmanager-9093
192.168.255.121:3000 grafana 192.168.255.121 3000 linux/aarch64 Up - /deploy/tidb/grafana-3000
192.168.255.119:2379 pd 192.168.255.119 2379/2380 linux/aarch64 Up|L|UI /nvme00/tidb/pd-2379 /deploy/tidb/pd-2379
192.168.255.121:2379 pd 192.168.255.121 2379/2380 linux/aarch64 Up /nvme00/tidb/pd-2379 /deploy/tidb/pd-2379
192.168.255.121:9090 prometheus 192.168.255.121 9090/12020 linux/aarch64 Up /nvme00/tidb/prometheus-9090 /deploy/tidb/prometheus-9090
192.168.255.119:4000 tidb 192.168.255.119 4000/10080 linux/aarch64 Up - /deploy/tidb/tidb-4000
192.168.255.120:4000 tidb 192.168.255.120 4000/10080 linux/aarch64 Up - /deploy/tidb-4000
192.168.255.120:4001 tidb 192.168.255.120 4001/10081 linux/aarch64 Up - /deploy/tidb-4001
192.168.255.121:4000 tidb 192.168.255.121 4000/10080 linux/aarch64 Up - /deploy/tidb/tidb-4000
192.168.255.121:9003 tiflash 192.168.255.121 9003/8125/3932/20172/20294/8236 linux/aarch64 Up /nvme02/tiflash/data/tiflash-9003 /deploy/tidb/tiflash-9003
192.168.255.119:20160 tikv 192.168.255.119 20160/20180 linux/aarch64 Up /nvme00/tidb/tikv/data/tikv-20160 /deploy/tidb/tikv-20160
192.168.255.119:20161 tikv 192.168.255.119 20161/20181 linux/aarch64 Up /nvme01/tidb/data/tikv-20161 /deploy/tidb/tikv-20161
192.168.255.119:50160 tikv 192.168.255.119 50160/50180 linux/aarch64 Up /nvme03/tidb/data/tikv-50160 /deploy/tidb/tikv-50160
192.168.255.120:20160 tikv 192.168.255.120 20160/20180 linux/aarch64 Up /nvme00/tidb/data/tikv-20160 /deploy/tidb/tikv-20160
192.168.255.120:20161 tikv 192.168.255.120 20161/20181 linux/aarch64 Up /nvme01/tidb/data/tikv-20161 /deploy/tidb/tikv-20161
192.168.255.120:40160 tikv 192.168.255.120 40160/40180 linux/aarch64 Up /nvme00/tidb/data/tikv-40161 /deploy/tidb/tikv-40160
192.168.255.120:40161 tikv 192.168.255.120 40161/40181 linux/aarch64 Up /nvme01/tidb/data/tikv-40162 /deploy/tidb/tikv-40161
192.168.255.121:20160 tikv 192.168.255.121 20160/20180 linux/aarch64 Up /nvme00/tidb/tikv/data/tikv-20160 /deploy/tidb/tikv-20160
192.168.255.121:20161 tikv 192.168.255.121 20161/20181 linux/aarch64 Up /nvme01/tidb/data/tikv-20161 /deploy/tidb/tikv-20161
192.168.255.122:30160 tikv 192.168.255.122 30160/30180 linux/aarch64 Up /nvme00/tidb/data/tikv-30161 /deploy/tidb/tikv-30160
192.168.255.122:30161 tikv 192.168.255.122 30161/30181 linux/aarch64 Up /nvme01/tidb/data/tikv-30162 /deploy/tidb/tikv-30161
192.168.255.122:30162 tikv 192.168.255.122 30162/30182 linux/aarch64 Up /nvme02/tidb/data/tikv-30162 /deploy/tidb/tikv-30162
192.168.255.122:40160 tikv 192.168.255.122 40160/40180 linux/aarch64 Up /nvme00/tidb/data/tikv-40161 /deploy/tidb/tikv-40160
192.168.255.122:40161 tikv 192.168.255.122 40161/40181 linux/aarch64 Up /nvme01/tidb/data/tikv-40162 /deploy/tidb/tikv-40161
192.168.255.122:40162 tikv 192.168.255.122 40162/40182 linux/aarch64 Up /nvme02/tidb/data/tikv-40162 /deploy/tidb/tikv-40162

  1. pd-ctl output:
    {
      "count": 19,
      "stores": [
        {
          "store": {
            "id": 91,
            "address": "192.168.255.119:3931",
            "labels": [
              {
                "key": "engine",
                "value": "tiflash"
              }
            ],
            "version": "v6.5.3",
            "peer_address": "192.168.255.119:20171",
            "status_address": "192.168.255.119:20293",
            "git_hash": "e63e24991079fff1e5afe03e859f743cbb6cf4a7",
            "start_timestamp": 1694990902,
            "deploy_path": "/deploy/tidb/tiflash-9001/bin/tiflash",
            "last_heartbeat": 1695016186000543038,
            "state_name": "Offline"
          },
          "status": {
            "capacity": "1.718TiB",
            "available": "1.07TiB",
            "used_size": "55.47GiB",
            "leader_count": 0,
            "leader_weight": 1,
            "leader_score": 0,
            "leader_size": 0,
            "region_count": 5088,
            "region_weight": 1,
            "region_score": 1402721.0927894693,
            "region_size": 1152681,
            "learner_count": 5088,
            "slow_score": 1,
            "start_ts": "2023-09-18T06:48:22+08:00",
            "last_heartbeat_ts": "2023-09-18T13:49:46.000543038+08:00",
            "uptime": "7h1m24.000543038s"
          }
        },
        {
          "store": {
            "id": 92,
            "address": "192.168.255.121:3931",
            "labels": [
              {
                "key": "engine",
                "value": "tiflash"
              }
            ],
            "version": "v6.5.3",
            "peer_address": "192.168.255.121:20171",
            "status_address": "192.168.255.121:20293",
            "git_hash": "e63e24991079fff1e5afe03e859f743cbb6cf4a7",
            "start_timestamp": 1694990955,
            "deploy_path": "/deploy/tidb/tiflash-9001/bin/tiflash",
            "last_heartbeat": 1695016328659292069,
            "state_name": "Offline"
          },
          "status": {
            "capacity": "1.718TiB",
            "available": "821.3GiB",
            "used_size": "50.54GiB",
            "leader_count": 0,
            "leader_weight": 1,
            "leader_score": 0,
            "leader_size": 0,
            "region_count": 4361,
            "region_weight": 1,
            "region_score": 1402379.9771483315,
            "region_size": 1111245,
            "learner_count": 4361,
            "slow_score": 1,
            "start_ts": "2023-09-18T06:49:15+08:00",
            "last_heartbeat_ts": "2023-09-18T13:52:08.659292069+08:00",
            "uptime": "7h2m53.659292069s"
          }

  2. tikv_regions_peers output:


    tiflash_error.log (1.9 KB)
    tiflash_stderr.log (574 bytes)

The TiFlash scale-in --force operation can leave some regions in the Down state: it takes the TiFlash nodes offline immediately without waiting for data migration to finish, so some region replicas may be lost, leading to inconsistent data and failed replication.

To resolve this, you can try the following steps (a command sketch follows the list):

  • Use pd-ctl to find the stores that are in the Offline state and note their IDs.
  • Use pd-ctl to force the offline stores into the Tombstone state so they no longer take part in scheduling and data migration. The command format is: store remove-tombstone <store_id>.
  • Use pd-ctl to find the regions that are in the Down state and note their IDs.
  • Use pd-ctl to remove the Down regions so that they are recreated from the other replicas. The command format is: region remove <region_id>.
  • Use TiDB Dashboard or the TiFlash-Summary Grafana panel to check the replication status of the TiFlash nodes and tables and confirm that everything is healthy.
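
For reference, a rough sketch of the inspection side of these steps, assuming pd-ctl can reach the PD endpoint shown in the display output above (jq is only used for filtering and is optional); note that store remove-tombstone only purges stores that have already reached the Tombstone state:

    export PD=http://192.168.255.119:2379

    # List stores and pick out the ones still in the Offline state
    pd-ctl -u $PD store | jq '.stores[] | select(.store.state_name == "Offline") | .store.id'

    # List regions that still have a down peer
    pd-ctl -u $PD region check down-peer

    # Once the offline stores have turned into Tombstone, purge them from PD
    pd-ctl -u $PD store remove-tombstone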

Your earlier post already had the solution posted.

It didn't work. Now trying ti-tiger's approach.

What operations did you run?

Just use pd-ctl to run unsafe remove-failed-stores; once the store state changes to Tombstone, run store remove-tombstone. Your TiFlash nodes are all unusable already anyway.
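
Sketched concretely, and assuming the failed TiFlash store IDs 91 and 92 from the pd-ctl output above, the suggested sequence would look roughly like this (unsafe recovery is a last-resort operation; only run it when the stores are permanently gone):

    export PD=http://192.168.255.119:2379

    # Declare the dead TiFlash stores as failed (IDs taken from the pd-ctl store output)
    pd-ctl -u $PD unsafe remove-failed-stores 91,92

    # After recovery finishes and the stores turn Tombstone, purge them from PD
    pd-ctl -u $PD store remove-tombstone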

[root@clickhouse1 data]# pd-ctl unsafe remove-failed-stores 91
Failed! [500] "[PD:unsaferecovery:ErrUnsafeRecoveryIsRunning]unsafe recovery is running"
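
The 500 error indicates that an unsafe recovery task is already running; one way to see how far it has progressed (assuming the same PD endpoint as above) is:

    pd-ctl -u http://192.168.255.119:2379 unsafe remove-failed-stores show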

Trying it now. Is this process usually very slow?

  1. Some tables reported TiFlash error 9012 (timeout).
  2. Changed the TiFlash replica count of those tables from 1 to 0 and then back to 1; it had no effect (see the SQL sketch after this list).
  3. Scaled in all three TiFlash nodes.
  4. Scaled out three new TiFlash nodes with new ports and new data directories.
  5. Found that the previously scaled-in nodes were stuck in the Offline state.
  6. The new TiFlash nodes do not replicate any data; every TiFlash table shows a progress of 0, and TiFlash CPU and disk usage stay close to zero.
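
For step 2, a hypothetical sketch of the replica reset and the progress check, assuming a table test.t1 (substitute the real schema and table names) and the TiDB endpoint from the display output above:

    mysql -h 192.168.255.119 -P 4000 -u root -p -e "
      ALTER TABLE test.t1 SET TIFLASH REPLICA 0;
      ALTER TABLE test.t1 SET TIFLASH REPLICA 1;
      SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
      FROM information_schema.tiflash_replica;"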

Some nodes are still in the Leaving state, and one is still going through unsafe remove.

What I meant was: which steps from the document linked in the previous post did you actually follow? Didn't you say it didn't work?

The problem in each post is different.

The Down stores have all been dealt with, but TiFlash still does not replicate data. What should I do next?
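
A couple of checks worth doing at this point, sketched under the assumption that the PD endpoint from the display output above is reachable: confirm that the new TiFlash stores are registered as Up, and that placement rules are enabled and the TiFlash rules still exist in PD.

    export PD=http://192.168.255.119:2379

    # The new TiFlash stores should be registered and Up
    pd-ctl -u $PD store | jq '.stores[].store | select(.labels[]?.value == "tiflash") | {id, address, state_name}'

    # Placement rules must be enabled and contain the tiflash rule group
    pd-ctl -u $PD config show replication
    pd-ctl -u $PD config placement-rules show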

2023-09-19 13:40:52 (UTC+08:00)

TiFlash 192.168.255.121:3932

[RateLimiter.cpp:715] ["limiter 0 write 0 read 0 NOT need to tune."] [source=IOLimitTuner] [thread_id=187]

2023-09-19 13:40:57 (UTC+08:00)

TiFlash 192.168.255.121:3932

[RateLimiter.cpp:715] ["limiter 0 write 0 read 0 NOT need to tune."] [source=IOLimitTuner] [thread_id=187]

2023-09-19 13:41:02 (UTC+08:00)

TiFlash 192.168.255.121:3932

[RateLimiter.cpp:715] ["limiter 0 write 0 read 0 NOT need to tune."] [source=IOLimitTuner] [thread_id=187]

2023-09-19 13:41:07 (UTC+08:00)

TiFlash 192.168.255.121:3932

[RateLimiter.cpp:715] ["limiter 0 write 0 read 0 NOT need to tune."] [source=IOLimitTuner] [thread_id=187]

TiFlash keeps printing logs like this and never syncs any table data into TiFlash. How should I handle this?

[2023/09/19 15:04:32.552 +08:00] [WARN] [TiDBSchemaSyncer.h:225] ["apply diff meets exception : DB::TiFlashException: miss table in TiKV : 600140 \n stack is \n 0x15395a4\tDB::TiFlashException::TiFlashException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::TiFlashError const&) [tiflash+22255012]\n \tdbms/src/Common/TiFlashException.h:250\n 0x60286cc\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyDiff(DB::SchemaDiff const&) [tiflash+100828876]\n \tdbms/src/TiDB/Schema/SchemaBuilder.cpp:521\n 0x5f811d0\tDB::TiDBSchemaSyncer<false, false>::syncSchemas(DB::Context&) [tiflash+100143568]\n \tdbms/src/TiDB/Schema/TiDBSchemaSyncer.h:128\n 0x605d750\tstd::__1::__function::__func<DB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, std::__1::allocator<DB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0>, bool ()>::operator()() [tiflash+101046096]\n \t/usr/local/bin/../include/c++/v1/__functional/function.h:345\n 0x5cc872c\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)::$_1> >(void*) [tiflash+97290028]\n \t/usr/local/bin/../include/c++/v1/thread:291\n 0xfffd104487ac\t<unknown symbol> [libpthread.so.0+34732]\n 0xfffd101360fc\t<unknown symbol> [libc.so.6+876796]"] [source=SchemaSyncer] [thread_id=110]

It looks like this error is the cause.

https://github.com/pingcap/tiflash/blob/v6.5.3/dbms/src/TiDB/Schema/SchemaBuilder.cpp#L403C28-L403C28

The code location is linked above. The 600140 in the error message is a table_id.

select * from INFORMATION_SCHEMA.TIFLASH_REPLICA where table_id=600140

Run the SQL above and see whether it returns anything.

There is no such table.

select * from INFORMATION_SCHEMA.TABLES where tidb_table_id=600140

Not sure why TiFlash is looking for this table's schema. Check what table it is. If this also returns nothing, that would be frustrating. :sweat_smile:

Is there anything in the tiflash_tikv.log file?

No such table either…

OK, at least I think the errors in the two files you uploaded can be ruled out. I looked through them, and they don't seem related to the replication problem.

This is giving me a headache. Any other ideas… I'm completely out of options.

The fallback plan is to reinstall. :joy: