tidb空间大小统计如何理解

【 TiDB 使用环境】Poc
【 TiDB 版本】6.3
【遇到的问题】空间大小统计如何理解
请问下:
tikv_store_status表中的CAPACITY, AVAILABLE, REGION_SIZE是实时真是大小么,压缩前还是压缩后大小?
tikv_region_status表中APPROXIMATE_SIZE是评估大小,这个评估是怎么来的?
tables表中TABLE_ROWS,DATA_LENGTH只来自统计信息么,如果表很大,统计信息很小,那么一定会统计出来的值比较小?
METRICS_SCHEMA.pd_cluster_status监控视图中的storage_capacity,storage_size是真实大小么,这个和tikv_store_status表中的CAPACITY, AVAILABLE有什么关系,谁更准一些呢,是压缩前还是压缩后大小?

mysql> select * from tikv_store_status where store_id=1 \G
*************************** 1. row ***************************
STORE_ID: 1
ADDRESS: 127.0.0.1:20160
STORE_STATE: 0
STORE_STATE_NAME: Up
LABEL: null
VERSION: 6.3.0
CAPACITY: 233.5GiB
AVAILABLE: 87.66GiB
LEADER_COUNT: 2
LEADER_WEIGHT: 1
LEADER_SCORE: 2
LEADER_SIZE: 2
REGION_COUNT: 2
REGION_WEIGHT: 1
REGION_SCORE: 6.726269909837314
REGION_SIZE: 2
START_TS: 2022-10-10 13:07:09
LAST_HEARTBEAT_TS: 2022-10-10 15:09:52
UPTIME: 2h2m43.071582s
1 row in set (0.00 sec)

mysql> select * from tikv_region_status limit 1 \G
*************************** 1. row ***************************
REGION_ID: 71
START_KEY:
END_KEY: 7480000000000000FF4600000000000000F8
TABLE_ID: 4
DB_NAME: mysql
TABLE_NAME: user
IS_INDEX: 1
INDEX_ID: 1
INDEX_NAME: PRIMARY
EPOCH_CONF_VER: 1
EPOCH_VERSION: 68
WRITTEN_BYTES: 635
READ_BYTES: 174287
APPROXIMATE_SIZE: 1
APPROXIMATE_KEYS: 1730
REPLICATIONSTATUS_STATE: NULL
REPLICATIONSTATUS_STATEID: NULL
1 row in set (0.01 sec)

mysql> select * from tables limit 1 \G
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: INFORMATION_SCHEMA
TABLE_NAME: REFERENTIAL_CONSTRAINTS
TABLE_TYPE: SYSTEM VIEW
ENGINE: InnoDB
VERSION: 10
ROW_FORMAT: Compact
TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
DATA_LENGTH: 0
MAX_DATA_LENGTH: 0
INDEX_LENGTH: 0
DATA_FREE: 0
AUTO_INCREMENT: NULL
CREATE_TIME: 1970-01-01 08:00:00
UPDATE_TIME: NULL
CHECK_TIME: NULL
TABLE_COLLATION: utf8mb4_bin
CHECKSUM: NULL
CREATE_OPTIONS:
TABLE_COMMENT:
TIDB_TABLE_ID: 4611686018427387918
TIDB_ROW_ID_SHARDING_INFO: NULL
TIDB_PK_TYPE: NONCLUSTERED
TIDB_PLACEMENT_POLICY_NAME: NULL
1 row in set (0.01 sec)

mysql> select * from METRICS_SCHEMA.pd_cluster_status where time=now() and type like ‘storage_%’;
±---------------------------±---------------±-----------------±-------------+
| time | instance | type | value |
±---------------------------±---------------±-----------------±-------------+
| 2022-10-10 15:12:28.000000 | 127.0.0.1:2379 | storage_capacity | 501371150336 |
| 2022-10-10 15:12:28.000000 | 127.0.0.1:2379 | storage_size | 303885665 |
±---------------------------±---------------±-----------------±-------------+
2 rows in set (0.00 sec)

REGION_SIZE 是根据key数量、Key长度的预估值,key数量也不是一个准确值。METRICS_SCHEMA中数据是来源于prometheus监控中数据,tikv_store_status中CAPACITY, AVAILABLE 应该也是真实的大小,可能和METRICS_SCHEMA有换算单位差异,你按1024换成字节在用1000转成tb对比下。 ables表中TABLE_ROWS,DATA_LENGTH 是根据统计信息估算的。

可以使用 tables表中的 AVG_ROW_LENGTHTABLE_ROWS来进行计算

3 个赞