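In a test environment I tried using pd-recover to repair the PD metadata, and the first test succeeded.
Get the cluster ID from the PD log: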
cat pd.log |grep "init cluster id"
[2023/11/01 07:35:58.449 +00:00] [INFO] [server.go:351] ["init cluster id"] [cluster-id=7275302868813208333]
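If you want to script this step, the ID can be pulled straight out of that log line. A minimal sketch, assuming the log format shown above; `extract_cluster_id` is a hypothetical helper name, and the log path depends on your deployment:

```shell
# Hypothetical helper: print the cluster ID from the most recent
# "init cluster id" line of a PD log file (format assumed as shown above).
extract_cluster_id() {
    grep 'init cluster id' "$1" \
        | sed -n 's/.*\[cluster-id=\([0-9][0-9]*\)\].*/\1/p' \
        | tail -n 1
}
```

Usage would be something like `extract_cluster_id pd.log`. Run it before wiping the PD data so the log with the original ID is still around.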
Stop pd, tikv, and tidb, then delete the PD data:
rm -rf /mnt/locals/tidb-pd/volume0/*
When the new pd and tikv are started, pd re-initializes itself with a new cluster ID and tikv reports an error:
["failed to bootstrap node id: \"[src/server/node.rs:236]: cluster ID mismatch, local 7275302868813208333 != remote 7296392347708071649, you are trying to connect to another cluster, please reconnect to the correct PD\""]
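The TiKV error also carries the ID the stores were bootstrapped with, which helps if the old PD log is no longer available. A sketch, assuming the message format shown above; `mismatch_ids` is a hypothetical helper that reads log text on stdin:

```shell
# Hypothetical helper: read TiKV log text on stdin and print
# "<local-id> <remote-id>" for each "cluster ID mismatch" line.
#   local  = ID stored on the TiKV node (the one to restore)
#   remote = ID of the freshly initialized PD
mismatch_ids() {
    sed -n 's/.*cluster ID mismatch, local \([0-9][0-9]*\) != remote \([0-9][0-9]*\).*/\1 \2/p'
}
```

For example, `mismatch_ids < tikv.log` (the log file name is an assumption) prints one pair per matching line; the first number is the value to pass to pd-recover.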
Use pd-recover to restore the original cluster-id:
wget https://download.pingcap.org/tidb-community-toolkit-v5.2.2-linux-amd64.tar.gz
tar zxf tidb-community-toolkit-v5.2.2-linux-amd64.tar.gz
cd tidb-community-toolkit-v5.2.2-linux-amd64
./bin/pd-recover -endpoints http://127.0.0.1:2379 -cluster-id 7275302868813208333 -alloc-id 10000
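The -alloc-id value needs to be larger than any ID the old PD had already handed out. A sketch for finding the largest allocated ID, assuming the old PD logs contain "idAllocator allocates a new id" lines with an [alloc-id=N] field (check your own logs, the format is an assumption here); `max_alloc_id` is a hypothetical helper:

```shell
# Hypothetical helper: read PD log text on stdin and print the largest
# value found in "idAllocator allocates a new id" lines (empty if none).
max_alloc_id() {
    grep 'idAllocator allocates a new id' \
        | sed -n 's/.*\[alloc-id=\([0-9][0-9]*\)\].*/\1/p' \
        | sort -n \
        | tail -n 1
}
```

Something like `cat pd*.log | max_alloc_id` gives the largest logged value; pick an -alloc-id comfortably above it.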
After that, restart pd and tikv; both start normally. Check the safe-point stored in pd:
./pd-ctl service-gc-safepoint
{
"service_gc_safe_points": [],
"gc_safe_point": 0
}
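To watch this value over time, it may help to extract it and decode it. gc_safe_point is a TSO, whose upper bits are a physical timestamp in milliseconds and whose lower 18 bits are a logical counter. A sketch, assuming the JSON layout shown above; both helper names are hypothetical:

```shell
# Hypothetical helper: read the service-gc-safepoint JSON on stdin
# and print the numeric gc_safe_point value.
gc_safe_point_from_json() {
    sed -n 's/.*"gc_safe_point": *\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: convert a TSO to its physical part
# (Unix time in milliseconds) by dropping the 18 logical bits.
tso_to_ms() {
    echo $(( $1 >> 18 ))
}
```

For example, `./pd-ctl service-gc-safepoint | gc_safe_point_from_json` prints the raw value, and `tso_to_ms` turns a non-zero value into a timestamp you can compare with the current time. The 0 above simply decodes to 0, meaning nothing has been recorded in the re-initialized PD yet.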
Connect to tidb and run a select query; error 9006 is no longer reported:
mysql -h 127.0.0.1 -P 4000 -u root
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 403
Server version: 5.7.25-TiDB-v5.2.2 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatible
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> select * from mysql.tidb;
+--------------------------+---------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+
| VARIABLE_NAME | VARIABLE_VALUE | COMMENT |
+--------------------------+---------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+
| bootstrapped | True | Bootstrap flag. Do not delete. |
| tidb_server_version | 72 | Bootstrap version. Do not delete. |
| system_tz | UTC | TiDB Global System Timezone. |
| new_collation_enabled | False | If the new collations are enabled. Do not edit it. |
| tikv_gc_leader_uuid | 629bd7ea8d80002 | Current GC worker leader UUID. (DO NOT EDIT) |
| tikv_gc_leader_desc | host:tidb-default-tidb-0, pid:1, start at 2023-09-07 11:20:55.101415326 +0000 UTC m=+42.568797401 | Host name and pid of current GC leader. (DO NOT EDIT) |
| tikv_gc_enable | true | Current GC enable status |
| tikv_gc_run_interval | 10m0s | GC run interval, at least 10m, in Go format. |
| tikv_gc_life_time | 10m0s | All versions within life time will not be collected by GC, at least 10m, in Go format. |
| tikv_gc_auto_concurrency | true | Let TiDB pick the concurrency automatically. If set false, tikv_gc_concurrency will be used |
| tikv_gc_scan_lock_mode | legacy | Mode of scanning locks, "physical" or "legacy" |
| tikv_gc_mode | distributed | Mode of GC, "central" or "distributed" |
+--------------------------+---------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+
12 rows in set (0.002 sec)
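One question remains, though: after pd came back up I waited through several 10-minute GC intervals, and the safe-point is still empty. When does it get updated?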