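【TiDB environment】 Production, five PD nodes + five TiKV nodes
【TiDB version】
【Reproduction path】 A few hours after the cluster was rebuilt, the error below started to appear, and some writes failed at the same time
【Problem encountered: symptoms and impact】
Error log: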
[2024/08/07 17:29:17.107 +08:00] [INFO] [mod.rs:2261] ["get snapshot failed"] [err="Error(Txn(Error(Engine(Error(Request(message: "peer is not leader for region 10, leader may Some(id: 11 store_id: 1)" not_leader { region_id: 10 leader { id: 11 store_id: 1 } }))))))"] [tag=raw_put] [thread_id=39]
[2024/08/07 17:29:17.113 +08:00] [INFO] [mod.rs:2261] ["get snapshot failed"] [err="Error(Txn(Error(Engine(Error(Request(message: "peer is not leader for region 10, leader may Some(id: 11 store_id: 1)" not_leader { region_id: 10 leader { id: 11 store_id: 1 } }))))))"] [tag=raw_put] [thread_id=39]
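For anyone triaging a larger log file, here is a minimal Python sketch that pulls the region id and the suggested leader out of `not_leader` entries like the two above. The regular expression is based only on the format seen in these two lines, and the helper name `parse_not_leader` is made up for illustration; other TiKV versions may print the payload differently.

```python
import re

# Matches the not_leader payload exactly as it appears in the log lines above.
# Assumption: the field order (region_id, leader id, store_id) is stable.
NOT_LEADER = re.compile(
    r"not_leader \{ region_id: (\d+) leader \{ id: (\d+) store_id: (\d+) \} \}"
)


def parse_not_leader(line):
    """Return (region_id, leader_peer_id, leader_store_id), or None if absent."""
    match = NOT_LEADER.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


sample = (
    "[INFO] [mod.rs:2261] [get snapshot failed] "
    "not_leader { region_id: 10 leader { id: 11 store_id: 1 } } [tag=raw_put]"
)
print(parse_not_leader(sample))  # (10, 11, 1)
```

Running this over every line of the TiKV log and counting the resulting tuples would show whether the errors are confined to region 10 or spread across regions.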
Client read/write errors:
context deadline exceeded500
epoch_not_match:<> 500
loadRegion from PD failed, key: "6C2F636F6D2E77616E6773752E7777772F6B762D66756C6C2D6C696E6B2D636865636B2F6B762D66756C6C2D6C696E6B2D636865636B2D6579", err: rpc error: code = DeadlineExceeded desc = context deadline exceeded500
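The key in the `loadRegion` error looks like hex-encoded bytes. A small Python sketch to turn it back into readable text, assuming the client simply prints the raw key as uppercase hex (the helper name `decode_key` is hypothetical):

```python
def decode_key(hex_key):
    """Decode a hex-encoded key; non-UTF-8 bytes are shown as \\xNN escapes."""
    raw = bytes.fromhex(hex_key)
    return raw.decode("utf-8", errors="backslashreplace")


# Key copied verbatim from the loadRegion error above.
key = (
    "6C2F636F6D2E77616E6773752E7777772F6B762D66756C6C2D6C696E6B2D"
    "636865636B2F6B762D66756C6C2D6C696E6B2D636865636B2D6579"
)
print(decode_key(key))
```

Knowing the readable key makes it easier to tell which application path issued the failing request.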
The configuration file is as follows:
PD configuration:
pd_servers:
  - host: 2.2.2.2.105
    client_port: 5985
    peer_port: 5986
    ignore_exporter: true
    data_dir: "/cache70/pd-data"
    config:
      schedule.max-merge-region-size: 20
      schedule.max-merge-region-keys: 200000
      schedule.leader-schedule-limit: 20
      schedule.hot-region-schedule-limit: 40
      schedule.hot-region-cache-hits-threshold: 1
      log.file.max-size: 300
      log.file.max-days: 7
      log.file.max-backups: 20
      replication.max-replicas: 5
tikv_servers:
  - host: 2.2.2.2.105
    port: 5186
    status_port: 5185
    deploy_dir: "/usr/local/kvser/tikv-5186"
    log_dir: "/usr/local/kvser/tikv-5186/log"
    ignore_exporter: true
    data_dir: "/cache70/tikv-data/tikv-5186"
    config:
      server:
        forward-max-connections-per-address: 500
        grpc-raft-conn-num: 5
      raftstore:
        capacity: 2000GB
        apply-pool-size: 15
        store-pool-size: 15
        snap-generator-pool-size: 10
        apply-max-batch-size: 1024
      rocksdb.defaultcf:
        block-cache-size: 400MB
      cdc:
        old-value-cache-memory-quota: 50MB
        sink-memory-quota: 50MB
      storage:
        api-version: 2
        enable-ttl: true
      security:
        cert-allowed-cn: ["pd.com","client.com","tikv.com"]
      log.file:
        filename: "error.log"
        max-days: 7
        max-backups: 20