v4.0.3: leader regions not balanced after restarting single-host dual-instance TiKV

Management tool:

tiup

TiDB version:

v4.0.4

Problem:

tiup cluster reload clustername

After the TiKV instances are reloaded, the leader regions are not balanced; one node drops straight to 0.

Grafana monitoring chart:

topology:

TiDB Version: v4.0.4
ID                  Role          Host          Ports                            OS/Arch       Status  Data Dir                       Deploy Dir
--                  ----          ----          -----                            -------       ------  --------                       ----------
10.59.105.70:9093   alertmanager  10.59.105.70  9093/9094                        linux/x86_64  Up      /data/alertmanager/data        /data/alertmanager
10.59.105.70:3000   grafana       10.59.105.70  3000                             linux/x86_64  Up      -                              /data/grafana
10.59.105.60:2379   pd            10.59.105.60  2379/2380                        linux/x86_64  Up|L    /data/pd/data                  /data/pd
10.59.105.61:2379   pd            10.59.105.61  2379/2380                        linux/x86_64  Up      /data/pd/data                  /data/pd
10.59.105.62:2379   pd            10.59.105.62  2379/2380                        linux/x86_64  Up      /data/pd/data                  /data/pd
10.59.105.70:9090   prometheus    10.59.105.70  9090                             linux/x86_64  Up      /data/prometheus/data          /data/prometheus
10.59.105.60:4000   tidb          10.59.105.60  4000/10080                       linux/x86_64  Up      -                              /data/tidb
10.59.105.61:4000   tidb          10.59.105.61  4000/10080                       linux/x86_64  Up      -                              /data/tidb
10.59.105.62:4000   tidb          10.59.105.62  4000/10080                       linux/x86_64  Up      -                              /data/tidb
10.59.105.70:9000   tiflash       10.59.105.70  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /data/tiflash/data             /data/tiflash
10.59.105.50:20160  tikv          10.59.105.50  20160/20180                      linux/x86_64  Up      /data/tikv/20160/deploy/store  /data/tikv/20160/deploy
10.59.105.50:20161  tikv          10.59.105.50  20161/20181                      linux/x86_64  Up      /data/tikv/20161/deploy/store  /data/tikv/20161/deploy
10.59.105.51:20160  tikv          10.59.105.51  20160/20180                      linux/x86_64  Up      /data/tikv/20160/deploy/store  /data/tikv/20160/deploy
10.59.105.51:20161  tikv          10.59.105.51  20161/20181                      linux/x86_64  Up      /data/tikv/20161/deploy/store  /data/tikv/20161/deploy
10.59.105.52:20160  tikv          10.59.105.52  20160/20180                      linux/x86_64  Up      /data/tikv/20160/deploy/store  /data/tikv/20160/deploy
10.59.105.52:20161  tikv          10.59.105.52  20161/20181                      linux/x86_64  Up      /data/tikv/20161/deploy/store  /data/tikv/20161/deploy

It seems to be a problem with reload; after I used restart instead, the leader regions started to balance:

Is this a bug?

Another issue: after adding labels, the leader regions are also not very well balanced during the stress test; see the chart above (15:30 - 15:50).

With the labels removed, the leader regions are well balanced.

Did you adjust any configuration parameters before the reload? Normally, the leader regions should start balancing immediately after a restart.

Yes, I adjusted the TiKV configuration.

Please share the output of tiup cluster edit-config cluster-name, and briefly describe how the labels are intended to be used in this cluster, so we can check whether it is a label configuration problem. If the leader imbalance is still reproducible right now, you can also run scheduler show in pd-ctl and check the result.
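A minimal sketch of how this information can be pulled, assuming pd-ctl is invoked through tiup (on some tiup versions the ctl component needs an explicit version, e.g. tiup ctl:v4.0.4 pd); the PD address is the one from the topology above:

tiup cluster edit-config clustername
tiup ctl pd -u http://10.59.105.60:2379 scheduler show
tiup ctl pd -u http://10.59.105.60:2379 store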

global:
  user: tidb
  ssh_port: 22
  deploy_dir: deploy
  data_dir: data
  os: linux
  arch: amd64
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: deploy/monitor-9100
  data_dir: data/monitor-9100
  log_dir: deploy/monitor-9100/log
server_configs:
  tidb:
    binlog.enable: false
    binlog.ignore-error: false
    log.slow-threshold: 300
    performance.committer-concurrency: 16384
    prepared-plan-cache.enabled: false
    tikv-client.grpc-connection-count: 128
  tikv:
    log-level: warn
    raftdb.bytes-per-sync: 512MB
    raftdb.defaultcf.compression-per-level:
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    raftdb.wal-bytes-per-sync: 256MB
    raftdb.writable-file-max-buffer-size: 512MB
    raftstore.apply-pool-size: 4
    raftstore.hibernate-regions: true
    raftstore.messages-per-tick: 40960
    raftstore.raft-base-tick-interval: 2s
    raftstore.raft-entry-max-size: 32MB
    raftstore.raft-max-inflight-msgs: 8192
    raftstore.store-pool-size: 4
    raftstore.sync-log: false
    readpool.coprocessor.use-unified-pool: true
    readpool.storage.use-unified-pool: false
    readpool.unified.max-thread-count: 13
    rocksdb.bytes-per-sync: 512MB
    rocksdb.defaultcf.compression-per-level:
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    rocksdb.defaultcf.level0-slowdown-writes-trigger: 64
    rocksdb.defaultcf.level0-stop-writes-trigger: 64
    rocksdb.defaultcf.max-write-buffer-number: 10
    rocksdb.defaultcf.min-write-buffer-number-to: 1
    rocksdb.lockcf.compression-per-level:
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    rocksdb.lockcf.level0-slowdown-writes-trigger: 64
    rocksdb.lockcf.level0-stop-writes-trigger: 64
    rocksdb.lockcf.max-write-buffer-number: 10
    rocksdb.lockcf.min-write-buffer-number-to: 1
    rocksdb.raftcf.compression-per-level:
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    rocksdb.raftcf.level0-slowdown-writes-trigger: 64
    rocksdb.raftcf.level0-stop-writes-trigger: 64
    rocksdb.raftcf.max-write-buffer-number: 10
    rocksdb.raftcf.min-write-buffer-number-to: 1
    rocksdb.wal-bytes-per-sync: 256MB
    rocksdb.writable-file-max-buffer-size: 512MB
    rocksdb.writecf.compression-per-level:
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    - zstd
    rocksdb.writecf.level0-slowdown-writes-trigger: 64
    rocksdb.writecf.level0-stop-writes-trigger: 64
    rocksdb.writecf.max-write-buffer-number: 10
    rocksdb.writecf.min-write-buffer-number-to: 1
    storage.block-cache.capacity: 24GB
    storage.scheduler-worker-pool-size: 4
  pd:
    replication.enable-placement-rules: true
    replication.location-labels:
    - host
    schedule.leader-schedule-limit: 8
    schedule.region-schedule-limit: 2048
    schedule.replica-schedule-limit: 256
  tiflash:
    logger.level: info
    path_realtime_mode: false
  tiflash-learner: {}
  pump: {}
  drainer: {}
  cdc: {}
tidb_servers:
- host: 10.59.105.60
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb
  log_dir: /data/tidb/log
  arch: amd64
  os: linux
- host: 10.59.105.61
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb
  log_dir: /data/tidb/log
  arch: amd64
  os: linux
- host: 10.59.105.62
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb
  log_dir: /data/tidb/log
  arch: amd64
  os: linux
- host: 10.59.105.70
  ssh_port: 22
  port: 4000
  status_port: 10080
  deploy_dir: /data/tidb
  log_dir: /data/tidb/log
  arch: amd64
  os: linux
tikv_servers:
- host: 10.59.105.50
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tikv/20160/deploy
  data_dir: /data/tikv/20160/deploy/store
  log_dir: /data/tikv/20160/deploy/log
  numa_node: "0"
  config:
    server.labels:
      host: h50
  arch: amd64
  os: linux
- host: 10.59.105.50
  ssh_port: 22
  port: 20161
  status_port: 20181
  deploy_dir: /data/tikv/20161/deploy
  data_dir: /data/tikv/20161/deploy/store
  log_dir: /data/tikv/20161/deploy/log
  numa_node: "1"
  config:
    server.labels:
      host: h50
  arch: amd64
  os: linux
- host: 10.59.105.51
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tikv/20160/deploy
  data_dir: /data/tikv/20160/deploy/store
  log_dir: /data/tikv/20160/deploy/log
  numa_node: "0"
  config:
    server.labels:
      host: h51
  arch: amd64
  os: linux
- host: 10.59.105.51
  ssh_port: 22
  port: 20161
  status_port: 20181
  deploy_dir: /data/tikv/20161/deploy
  data_dir: /data/tikv/20161/deploy/store
  log_dir: /data/tikv/20161/deploy/log
  numa_node: "1"
  config:
    server.labels:
      host: h51
  arch: amd64
  os: linux
- host: 10.59.105.52
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /data/tikv/20160/deploy
  data_dir: /data/tikv/20160/deploy/store
  log_dir: /data/tikv/20160/deploy/log
  numa_node: "0"
  config:
    server.labels:
      host: h52
  arch: amd64
  os: linux
- host: 10.59.105.52
  ssh_port: 22
  port: 20161
  status_port: 20181
  deploy_dir: /data/tikv/20161/deploy
  data_dir: /data/tikv/20161/deploy/store
  log_dir: /data/tikv/20161/deploy/log
  numa_node: "1"
  config:
    server.labels:
      host: h52
  arch: amd64
  os: linux
tiflash_servers:
- host: 10.59.105.70
  ssh_port: 22
  tcp_port: 9000
  http_port: 8123
  flash_service_port: 3930
  flash_proxy_port: 20170
  flash_proxy_status_port: 20292
  metrics_port: 8234
  deploy_dir: /data/tiflash
  data_dir: /data/tiflash/data
  log_dir: /data/tiflash/log
  arch: amd64
  os: linux
pd_servers:
- host: 10.59.105.60
  ssh_port: 22
  name: pd-10.59.105.60-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/pd
  data_dir: /data/pd/data
  log_dir: /data/pd/log
  arch: amd64
  os: linux
- host: 10.59.105.61
  ssh_port: 22
  name: pd-10.59.105.61-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/pd
  data_dir: /data/pd/data
  log_dir: /data/pd/log
  arch: amd64
  os: linux
- host: 10.59.105.62
  ssh_port: 22
  name: pd-10.59.105.62-2379
  client_port: 2379
  peer_port: 2380
  deploy_dir: /data/pd
  data_dir: /data/pd/data
  log_dir: /data/pd/log
  arch: amd64
  os: linux
monitoring_servers:
- host: 10.59.105.70
  ssh_port: 22
  port: 9090
  deploy_dir: /data/prometheus
  data_dir: /data/prometheus/data
  log_dir: /data/prometheus/log
  arch: amd64
  os: linux
grafana_servers:
- host: 10.59.105.70
  ssh_port: 22
  port: 3000
  deploy_dir: /data/grafana
  arch: amd64
  os: linux
alertmanager_servers:
- host: 10.59.105.70
  ssh_port: 22
  web_port: 9093
  cluster_port: 9094
  deploy_dir: /data/alertmanager
  data_dir: /data/alertmanager/data
  log_dir: /data/alertmanager/log
  arch: amd64
  os: linux

» scheduler show
[
  "balance-leader-scheduler",
  "balance-hot-region-scheduler",
  "balance-region-scheduler",
  "label-scheduler"
]

If the configuration has a problem, normally it shouldn't be allowed through validation, right?

It looks like a bug in reload.

For the reload issue, please provide the PD log for that time period and the TiKV log of the store whose leader count dropped to 0, so we can check whether it is a scheduling problem.

For the label issue, the edit-config output looks correct: multiple instances per host, with host labels also configured in PD. Could you share the current leader and region distribution of the cluster? A monitoring screenshot is enough.

Please also share the output of pd-ctl store.

pd&&tikv.rar (483.6 KB)

These are the PD and TiKV logs; check the logs from around 15:30 - 16:10 on August 4.

store:

» store
{
  "count": 7,
  "stores": [
    {
      "store": {
        "id": 1,
        "address": "10.59.105.52:20161",
        "labels": [
          {
            "key": "host",
            "value": "h52"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.52:20181",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20161/deploy/bin",
        "last_heartbeat": 1596599959929719664,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.711TiB",
        "used_size": "2.853GiB",
        "leader_count": 258,
        "leader_weight": 1,
        "leader_score": 258,
        "leader_size": 19383,
        "region_count": 845,
        "region_weight": 1,
        "region_score": 67588,
        "region_size": 67588,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:19.929719664+08:00",
        "uptime": "19h58m48.929719664s"
      }
    },
    {
      "store": {
        "id": 2,
        "address": "10.59.105.51:20160",
        "labels": [
          {
            "key": "host",
            "value": "h51"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.51:20180",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20160/deploy/bin",
        "last_heartbeat": 1596599960359769897,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.711TiB",
        "used_size": "2.75GiB",
        "leader_count": 262,
        "leader_weight": 1,
        "leader_score": 262,
        "leader_size": 20580,
        "region_count": 855,
        "region_weight": 1,
        "region_score": 67500,
        "region_size": 67500,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:20.359769897+08:00",
        "uptime": "19h58m49.359769897s"
      }
    },
    {
      "store": {
        "id": 3,
        "address": "10.59.105.52:20160",
        "labels": [
          {
            "key": "host",
            "value": "h52"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.52:20180",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20160/deploy/bin",
        "last_heartbeat": 1596599959934028571,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.711TiB",
        "used_size": "2.673GiB",
        "leader_count": 283,
        "leader_weight": 1,
        "leader_score": 283,
        "leader_size": 22288,
        "region_count": 852,
        "region_weight": 1,
        "region_score": 67516,
        "region_size": 67516,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:19.934028571+08:00",
        "uptime": "19h58m48.934028571s"
      }
    },
    {
      "store": {
        "id": 4,
        "address": "10.59.105.50:20160",
        "labels": [
          {
            "key": "host",
            "value": "h50"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.50:20180",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20160/deploy/bin",
        "last_heartbeat": 1596599960864904130,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.711TiB",
        "used_size": "2.854GiB",
        "leader_count": 289,
        "leader_weight": 1,
        "leader_score": 289,
        "leader_size": 22937,
        "region_count": 864,
        "region_weight": 1,
        "region_score": 67737,
        "region_size": 67737,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:20.86490413+08:00",
        "uptime": "19h58m49.86490413s"
      }
    },
    {
      "store": {
        "id": 5,
        "address": "10.59.105.51:20161",
        "labels": [
          {
            "key": "host",
            "value": "h51"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.51:20181",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20161/deploy/bin",
        "last_heartbeat": 1596599960335747479,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.709TiB",
        "used_size": "2.784GiB",
        "leader_count": 257,
        "leader_weight": 1,
        "leader_score": 257,
        "leader_size": 21084,
        "region_count": 842,
        "region_weight": 1,
        "region_score": 67604,
        "region_size": 67604,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:20.335747479+08:00",
        "uptime": "19h58m49.335747479s"
      }
    },
    {
      "store": {
        "id": 6,
        "address": "10.59.105.50:20161",
        "labels": [
          {
            "key": "host",
            "value": "h50"
          }
        ],
        "version": "4.0.4",
        "status_address": "10.59.105.50:20181",
        "git_hash": "28e3d44b00700137de4fa933066ab83e5f8306cf",
        "start_timestamp": 1596528031,
        "deploy_path": "/data/tikv/20161/deploy/bin",
        "last_heartbeat": 1596599960708386619,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.718TiB",
        "available": "1.712TiB",
        "used_size": "2.686GiB",
        "leader_count": 348,
        "leader_weight": 1,
        "leader_score": 348,
        "leader_size": 28832,
        "region_count": 833,
        "region_weight": 1,
        "region_score": 67367,
        "region_size": 67367,
        "start_ts": "2020-08-04T16:00:31+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:20.708386619+08:00",
        "uptime": "19h58m49.708386619s"
      }
    },
    {
      "store": {
        "id": 59,
        "address": "10.59.105.70:3930",
        "labels": [
          {
            "key": "engine",
            "value": "tiflash"
          }
        ],
        "version": "v4.0.4",
        "peer_address": "10.59.105.70:20170",
        "status_address": "10.59.105.70:20292",
        "git_hash": "bfa9128f59cf800e129152f06b12480ad78adafd",
        "start_timestamp": 1596527306,
        "deploy_path": "/data/tiflash/bin/tiflash",
        "last_heartbeat": 1596599965330986025,
        "state_name": "Up"
      },
      "status": {
        "capacity": "1.719TiB",
        "available": "1.626TiB",
        "used_size": "11.5KiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 0,
        "region_weight": 1,
        "region_score": 0,
        "region_size": 0,
        "start_ts": "2020-08-04T15:48:26+08:00",
        "last_heartbeat_ts": "2020-08-05T11:59:25.330986025+08:00",
        "uptime": "20h10m59.330986025s"
      }
    }
  ]
}

Grafana monitoring: it does not look very balanced.

If multiple TiKV instances are deployed on the same physical disk, you need to add the capacity parameter to the TiKV configuration:

raftstore.capacity = total disk capacity / number of TiKV instances

I don't see it in the configuration above; please check again.
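A rough sketch of where such an override could go in the edit-config topology, assuming the two instances on a host really do share one ~1.7 TiB disk (the 880GB value is only an assumed illustration of that capacity split in two; substitute the real figure, or skip this entirely if each instance has its own disk):

tikv_servers:
- host: 10.59.105.50
  port: 20160
  config:
    server.labels:
      host: h50
    raftstore.capacity: 880GB   # assumed value: disk capacity / instances sharing the disk
- host: 10.59.105.50
  port: 20161
  config:
    server.labels:
      host: h50
    raftstore.capacity: 880GB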

There are two disks (one per instance).

It may be that the leader counts are relatively small, so the differences get amplified. Keep observing; as long as no hotspot problem shows up in the load, we consider this balanced, so focus on whether the load itself is balanced (one way to check hotspots from pd-ctl is sketched below).

For the reload issue, please check whether it can be reproduced consistently.
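For the hotspot side, a hedged way to inspect the current read/write hotspots per store from pd-ctl (same assumptions about the tiup ctl invocation and PD address as in the sketch earlier):

tiup ctl pd -u http://10.59.105.60:2379 hot read
tiup ctl pd -u http://10.59.105.60:2379 hot write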

The leader region counts differ by tens or even a hundred or more; is that normal, even if the counts are relatively small?

Every reload causes the leader regions to become unbalanced; reinstalling the cluster doesn't help either.

This is yesterday's stress test; the hotspot distribution is clearly uneven.

Could you check the store score and leader score around the time of the reload and the restart? They are under the PD -> Statistics - balance panels.

Also the PD operator monitoring panels.

At what point on this chart did the reload start?