[TiDB version]
v4.0.11
[Problem description]
One TiKV node fails to start.
- Cluster topology
global:
  user: tidb
  ssh_port: 22
  ssh_type: builtin
  deploy_dir: /tidb-deploy
  data_dir: /tidb-data
  os: linux
  arch: amd64
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: /tidb-deploy/monitor-9100
  data_dir: /tidb-data/monitor-9100
  log_dir: /tidb-deploy/monitor-9100/log
server_configs:
  tidb:
    alter-primary-key: false
    binlog.enable: true
    binlog.ignore-error: true
    enable-telemetry: false
    log.enable-slow-log: true
    log.file.max-backups: 7
    log.file.max-days: 7
    log.slow-threshold: 200
    prepared-plan-cache.enabled: true
    tikv-client.copr-cache.enable: true
  tikv: {}
  pd: {}
  tiflash: {}
  tiflash-learner: {}
  pump: {}
  drainer: {}
  cdc: {}
tidb_servers:
- host: 172.16.12.171
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  arch: amd64
  os: linux
- host: 172.16.12.213
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  arch: amd64
  os: linux
- host: 172.16.12.208
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /tidb-deploy/tidb-4000
  arch: amd64
  os: linux
tikv_servers:
- host: 172.16.12.190
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  arch: amd64
  os: linux
- host: 172.16.12.176
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  arch: amd64
  os: linux
- host: 172.16.12.138
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /tidb-deploy/tikv-20160
  data_dir: /tidb-data/tikv-20160
  arch: amd64
  os: linux
tiflash_servers: []
pd_servers:
- host: 172.16.12.128
  ssh_port: 22
  name: pd-172.16.12.128-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /tidb-deploy/pd-2379
  data_dir: /tidb-data/pd-2379
  arch: amd64
  os: linux
- host: 172.16.12.217
  ssh_port: 22
  name: pd-172.16.12.217-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /tidb-deploy/pd-2379
  data_dir: /tidb-data/pd-2379
  arch: amd64
  os: linux
- host: 172.16.12.150
  ssh_port: 22
  name: pd-172.16.12.150-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /tidb-deploy/pd-2379
  data_dir: /tidb-data/pd-2379
  arch: amd64
  os: linux
pump_servers:
- host: 172.16.12.123
  ssh_port: 22
  port: 8250
  deploy_dir: /tidb-deploy/pump-8250
  data_dir: /tidb-data/pump-8250
  arch: amd64
  os: linux
- host: 172.16.12.142
  ssh_port: 22
  port: 8250
  deploy_dir: /tidb-deploy/pump-8250
  data_dir: /tidb-data/pump-8250
  arch: amd64
  os: linux
- host: 172.16.12.161
  ssh_port: 22
  port: 8250
  deploy_dir: /tidb-deploy/pump-8250
  data_dir: /tidb-data/pump-8250
  arch: amd64
  os: linux
drainer_servers:
- host: 172.16.12.216
  ssh_port: 22
  port: 8249
  deploy_dir: /tidb-deploy/drainer-8249
  data_dir: /tidb-data/drainer-8249
  config:
    syncer.db-type: tidb
    syncer.to.host: 172.16.12.171
    syncer.to.password:
    syncer.to.port: 4000
    syncer.to.user: root
  arch: amd64
  os: linux
monitoring_servers:
- host: 172.16.12.216
  ssh_port: 22
  port: 9090
  deploy_dir: /tidb-deploy/prometheus-9090
  data_dir: /tidb-data/prometheus-9090
  arch: amd64
  os: linux
grafana_servers:
- host: 172.16.12.216
  ssh_port: 22
  port: 3000
  deploy_dir: /tidb-deploy/grafana-3000
  arch: amd64
  os: linux
  username: admin
  password: admin
alertmanager_servers:
- host: 172.16.12.216
  ssh_port: 22
  web_port: 9093
  cluster_port: 9094
  deploy_dir: /tidb-deploy/alertmanager-9093
  data_dir: /tidb-data/alertmanager-9093
  arch: amd64
  os: linux
- Current cluster status
tiup cluster display daddylab-tidb-cluster
Found cluster newer version:
The latest version: v1.3.5
Local installed version: v1.3.4
Update current component: tiup update cluster
Update all components: tiup update --all
Starting component `cluster`: /root/.tiup/components/cluster/v1.3.4/tiup-cluster display daddylab-tidb-cluster
Cluster type: tidb
Cluster name: daddylab-tidb-cluster
Cluster version: v4.0.11
SSH type: builtin
Dashboard URL: http://172.16.12.128:2379/dashboard
ID                   Role          Host           Ports        OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----        -------       ------  --------                      ----------
172.16.12.216:9093   alertmanager  172.16.12.216  9093/9094    linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
172.16.12.216:8249   drainer       172.16.12.216  8249         linux/x86_64  Up      /tidb-data/drainer-8249       /tidb-deploy/drainer-8249
172.16.12.216:3000   grafana       172.16.12.216  3000         linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
172.16.12.128:2379   pd            172.16.12.128  2379/2380    linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
172.16.12.150:2379   pd            172.16.12.150  2379/2380    linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
172.16.12.217:2379   pd            172.16.12.217  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
172.16.12.216:9090   prometheus    172.16.12.216  9090         linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
172.16.12.123:8250   pump          172.16.12.123  8250         linux/x86_64  Up      /tidb-data/pump-8250          /tidb-deploy/pump-8250
172.16.12.142:8250   pump          172.16.12.142  8250         linux/x86_64  Up      /tidb-data/pump-8250          /tidb-deploy/pump-8250
172.16.12.161:8250   pump          172.16.12.161  8250         linux/x86_64  Up      /tidb-data/pump-8250          /tidb-deploy/pump-8250
172.16.12.171:4000   tidb          172.16.12.171  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
172.16.12.208:4000   tidb          172.16.12.208  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
172.16.12.213:4000   tidb          172.16.12.213  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
172.16.12.138:20160  tikv          172.16.12.138  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
172.16.12.176:20160  tikv          172.16.12.176  20160/20180  linux/x86_64  Down    /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
172.16.12.190:20160  tikv          172.16.12.190  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 16
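For anyone who wants to pick the unhealthy node out of a listing like the one above without reading all 16 rows, here is a minimal Python sketch. It is not part of tiup; the function name `not_up_nodes` is made up for illustration, and it assumes the eight-column layout printed by tiup-cluster v1.3.4 in this post, with the status in the sixth whitespace-separated field. A status that itself contains a space would be truncated to its first word.

```python
import re

# A data row starts with "<ipv4>:<port>"; headers, separators and the
# "Total nodes" summary do not, so they are skipped.
ROW = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}:\d+\s")


def not_up_nodes(display_output):
    """Return (id, role, status) for every row whose status does not start with 'Up'.

    Assumes the column order ID, Role, Host, Ports, OS/Arch, Status, Data Dir,
    Deploy Dir, as shown in the output pasted in this post.
    """
    result = []
    for line in display_output.splitlines():
        if not ROW.match(line):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete row
        node_id, role, status = fields[0], fields[1], fields[5]
        # "Up", "Up|L" and "Up|UI" all count as healthy here.
        if not status.startswith("Up"):
            result.append((node_id, role, status))
    return result


sample = """\
172.16.12.150:2379 pd 172.16.12.150 2379/2380 linux/x86_64 Up|L /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.16.12.138:20160 tikv 172.16.12.138 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
172.16.12.176:20160 tikv 172.16.12.176 20160/20180 linux/x86_64 Down /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
Total nodes: 16
"""

if __name__ == "__main__":
    print(not_up_nodes(sample))
    # [('172.16.12.176:20160', 'tikv', 'Down')]
```

Fed the full listing from this post, it should report only 172.16.12.176:20160.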
- Affected TiKV node
172.16.12.176:20160
Error message
[2021/03/12 17:03:18.465 +08:00] [FATAL] [server.rs:303] ["panic_mark_file /tidb-data/tikv-20160/panic_mark_file exists, there must be something wrong with the db."]
Repair attempts so far and their results
- Deleted
/tidb-data/tikv-20160/panic_mark_file
-------- the node still fails to start
- Ran on the Down node
[root@db-cluster-tikv2 ~]# ./tikv-ctl --db /tidb-data/tikv-20160/db/ bad-regions
all regions are healthy
- Ran on the Up nodes
[root@db-cluster-tikv1 ~]# ./tikv-ctl --db /tidb-data/tikv-20160/db/ bad-regions
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RocksDb("IO error: While lock file: /tidb-data/tikv-20160/db/LOCK: Resource temporarily unavailable")', src/libcore/result.rs:1188:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
[root@db-cluster-tikv3 ~]# ./tikv-ctl --db /tidb-data/tikv-20160/db/ bad-regions
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RocksDb("IO error: While lock file: /tidb-data/tikv-20160/db/LOCK: Resource temporarily unavailable")', src/libcore/result.rs:1188:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
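A note on the lock error from the two Up nodes: my understanding is that RocksDB keeps an exclusive lock on the LOCK file for as long as a process has the database open, so `tikv-ctl --db` cannot open a data directory that a running TiKV is still using. If so, those two panics would only mean the nodes are running, not that their data is damaged. The Python sketch below illustrates the general mechanism and nothing more. It uses `flock` on a temporary file as a stand-in (I believe RocksDB itself uses `fcntl` record locks), and the helper names `try_lock` and `demo` are hypothetical.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path and try to take an exclusive, non-blocking flock on it.

    Returns (fd, None) on success, or (None, errno) if the lock is held elsewhere.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        return None, exc.errno
    return fd, None


def demo():
    """Return (holder_got_lock, second_attempt_was_refused, lock_free_after_release)."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "LOCK")

        # Stand-in for the running server that already has the database open.
        holder_fd, holder_err = try_lock(lock_path)

        # Stand-in for an offline tool opening the same directory.
        tool_fd, tool_err = try_lock(lock_path)
        refused = tool_fd is None and tool_err in (errno.EAGAIN, errno.EWOULDBLOCK)

        # Once the holder goes away, the same attempt succeeds.
        os.close(holder_fd)
        later_fd, later_err = try_lock(lock_path)
        if later_fd is not None:
            os.close(later_fd)

        return holder_err is None, refused, later_err is None


if __name__ == "__main__":
    print(demo())
```

On Linux the refused attempt fails with EAGAIN, whose message text is "Resource temporarily unavailable", the same wording as in the panics above.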
Please help!