The region score has suddenly become very unstable. How is this score calculated?

  • [TiDB version]: v3.0.12
  • [Problem description]:
    The region score has suddenly become very unstable. How is this score calculated?

    {
      "count": 8,
      "stores": [
        {
          "store": {
            "id": 4,
            "address": "172.18.69.174:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "3.018TiB",
            "leader_count": 25863,
            "leader_weight": 1,
            "leader_score": 5340750,
            "leader_size": 5340750,
            "region_count": 79590,
            "region_weight": 1,
            "region_score": 16703192,
            "region_size": 16703192,
            "start_ts": "2020-12-09T21:49:55+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:06.803658673+08:00",
            "uptime": "276h45m11.803658673s"
          }
        },
        {
          "store": {
            "id": 5,
            "address": "172.18.69.175:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "2.98TiB",
            "leader_count": 25254,
            "leader_weight": 1,
            "leader_score": 5341269,
            "leader_size": 5341269,
            "region_count": 79246,
            "region_weight": 1,
            "region_score": 16702847,
            "region_size": 16702847,
            "sending_snap_count": 1,
            "start_ts": "2020-12-09T21:34:20+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:13.728533086+08:00",
            "uptime": "277h0m53.728533086s"
          }
        },
        {
          "store": {
            "id": 6,
            "address": "172.18.69.177:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "3.026TiB",
            "leader_count": 25732,
            "leader_weight": 1,
            "leader_score": 5340486,
            "leader_size": 5340486,
            "region_count": 80076,
            "region_weight": 1,
            "region_score": 16702867,
            "region_size": 16702867,
            "start_ts": "2020-12-09T20:51:48+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:12.755680352+08:00",
            "uptime": "277h43m24.755680352s"
          }
        },
        {
          "store": {
            "id": 1352078,
            "address": "172.18.69.176:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "2.783TiB",
            "leader_count": 26064,
            "leader_weight": 1,
            "leader_score": 5341051,
            "leader_size": 5341051,
            "region_count": 82017,
            "region_weight": 1,
            "region_score": 16702851,
            "region_size": 16702851,
            "start_ts": "2020-12-09T21:17:32+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:12.506612936+08:00",
            "uptime": "277h17m40.506612936s"
          }
        },
        {
          "store": {
            "id": 234966,
            "address": "172.18.69.173:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "2.94TiB",
            "leader_count": 24870,
            "leader_weight": 1,
            "leader_score": 5341070,
            "leader_size": 5341070,
            "region_count": 78363,
            "region_weight": 1,
            "region_score": 16706043,
            "region_size": 16706043,
            "sending_snap_count": 1,
            "start_ts": "2020-12-17T21:57:51+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:15.808589921+08:00",
            "uptime": "84h37m24.808589921s"
          }
        },
        {
          "store": {
            "id": 2528326,
            "address": "172.18.69.172:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "2.762TiB",
            "leader_count": 25713,
            "leader_weight": 1,
            "leader_score": 5340545,
            "leader_size": 5340545,
            "region_count": 78913,
            "region_weight": 1,
            "region_score": 23833463.998023033,
            "region_size": 16429476,
            "sending_snap_count": 1,
            "start_ts": "2020-12-09T22:14:39+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:13.30571938+08:00",
            "uptime": "276h20m34.30571938s"
          }
        },
        {
          "store": {
            "id": 2528327,
            "address": "172.18.69.171:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.93TiB",
            "available": "2.744TiB",
            "leader_count": 25308,
            "leader_weight": 1,
            "leader_score": 5340666,
            "leader_size": 5340666,
            "region_count": 64647,
            "region_weight": 1,
            "region_score": 35206941.723953724,
            "region_size": 13683996,
            "start_ts": "2020-12-09T22:26:17+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:08.311823071+08:00",
            "uptime": "276h8m51.311823071s"
          }
        },
        {
          "store": {
            "id": 2528328,
            "address": "172.18.69.170:20160",
            "version": "3.0.12",
            "state_name": "Up"
          },
          "status": {
            "capacity": "6.931TiB",
            "available": "2.731TiB",
            "leader_count": 25416,
            "leader_weight": 1,
            "leader_score": 5341092,
            "leader_size": 5341092,
            "region_count": 69849,
            "region_weight": 1,
            "region_score": 46035316.95613432,
            "region_size": 14571141,
            "start_ts": "2020-12-20T22:00:50+08:00",
            "last_heartbeat_ts": "2020-12-21T10:35:08.808166628+08:00",
            "uptime": "12h34m18.808166628s"
          }
        }
      ]
    }

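PD computes the region score with a piecewise function: once a store's remaining space drops below a certain threshold, the slope of the corresponding segment becomes much steeper.
Two PD parameters control this: high-space-ratio and low-space-ratio.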


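When a TiKV instance's space usage reaches high-space-ratio, PD computes a noticeably higher region score for it, so regions tend to be scheduled to other nodes and space usage is balanced across instances.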
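
To make the shape of that piecewise function concrete, here is a rough sketch of how such a score could be computed. It is only an illustration of the three stages described above, not PD's actual code: `MAX_SCORE`, the default ratios of 0.6 and 0.8, the linear transition, and the simplification `used = capacity - available` are all assumptions, so the results will not match the pd-ctl output exactly.

```python
MAX_SCORE = 1024 ** 3  # assumed ceiling of the score scale (sizes are in MiB)
TIB = 1024 * 1024      # MiB per TiB


def region_score(region_size, capacity, available,
                 high_space_ratio=0.6, low_space_ratio=0.8):
    """Illustrative piecewise region score; all sizes are in MiB."""
    used = capacity - available  # assumption: nothing else occupies the disk
    # Region size is larger than on-disk size because of compression.
    amplification = region_size / used if region_size and used else 1.0
    high_bound = (1 - high_space_ratio) * capacity  # plenty of space above this
    low_bound = (1 - low_space_ratio) * capacity    # nearly full below this

    if available >= high_bound:
        # Stage 1: enough free space, the score is just the region size.
        return float(region_size)
    if available <= low_bound:
        # Stage 3: almost full, the score is dominated by the free space left.
        return MAX_SCORE - available
    # Stage 2: a straight line joining the end of stage 1 to the start of stage 3.
    x1 = y1 = (capacity - high_bound) * amplification
    x2 = (capacity - low_bound) * amplification
    y2 = MAX_SCORE - low_bound
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (region_size - x1)


# Store 4 from the post: still in the first stage, score equals region size.
print(region_score(16703192, 6.93 * TIB, 3.018 * TIB))
# Store 2528326 from the post: just past the threshold, score rises steeply.
print(region_score(16429476, 6.93 * TIB, 2.762 * TIB))
```

Because the line in the middle stage has to climb from roughly the region size up to something close to `MAX_SCORE` over a fairly narrow range, a small change in available space moves the score a lot, which would explain why the score looks unstable on stores that have just crossed the threshold.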

You can run config show all in pd-ctl to check the current values of high-space-ratio and low-space-ratio, and adjust them to fit your situation.
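
Once you know the two ratios, a quick check like the following shows which stores have already left the first stage. The capacity and available figures are copied from the store output in the question; the ratios 0.6 and 0.8 are assumed here and should be replaced with whatever config show all reports, and the `stage` helper is a hypothetical name used only for this example.

```python
# Assumed values; replace with the ones reported by `config show all`.
HIGH_SPACE_RATIO = 0.6
LOW_SPACE_RATIO = 0.8

# store id -> (capacity in TiB, available in TiB), from the output in the question
STORES = {
    4: (6.93, 3.018),
    5: (6.93, 2.98),
    6: (6.93, 3.026),
    1352078: (6.93, 2.783),
    234966: (6.93, 2.94),
    2528326: (6.93, 2.762),
    2528327: (6.93, 2.744),
    2528328: (6.931, 2.731),
}


def stage(capacity, available, high=HIGH_SPACE_RATIO, low=LOW_SPACE_RATIO):
    """Classify a store by its space usage ratio."""
    used_ratio = 1 - available / capacity
    if used_ratio <= high:
        return "high-space"   # plenty of room left
    if used_ratio >= low:
        return "low-space"    # nearly full
    return "transition"       # between the two ratios


for store_id, (capacity, available) in STORES.items():
    used_ratio = 1 - available / capacity
    print(f"store {store_id}: used {used_ratio:.1%} -> {stage(capacity, available)}")
```

With these assumed ratios, stores 2528326, 2528327 and 2528328 land in the transition stage, and those are exactly the three stores whose region_score differs from region_size in the output. Raising high-space-ratio slightly would move them back into the first stage.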


How is the score handled when the usage is between high-space-ratio and low-space-ratio?