information_schema.tables 里重复垃圾数据 怎么清理

Bug 反馈
清晰准确地描述您发现的问题,提供任何可能复现问题的步骤有助于研发同学及时处理问题
【 Bug 的影响】
Tidb 5.2.2
tispark 对 yixintui_operate库 完全没法操作


【可能的问题复现步骤】

【看到的非预期行为】
表已经drop 掉了 ,但是 系统视图里 还是有垃圾数据 怎么清理 ,
MySQL [information_schema]> select TABLE_SCHEMA,CREATE_TIME, TIDB_TABLE_ID from information_schema.tables where table_name =‘Material_creative_count’;
±-----------------±--------------------±--------------+
| TABLE_SCHEMA | CREATE_TIME | TIDB_TABLE_ID |
±-----------------±--------------------±--------------+
| yixintui_operate | 2021-11-05 17:47:43 | 29909 |
| yixintui_operate | 2021-11-05 17:47:43 | 29944 |
| yixintui_operate | 2021-11-05 17:47:43 | 37467 |

同样的sql 同一集群 部分tidb里 查看没有数据 ,部分tidb 查看到 2-3条数据 。
【期望看到的行为】

【相关组件及具体版本】

【其他背景信息或者截图】
如集群拓扑,系统和内核版本,应用 app 信息等;如果问题跟 SQL 有关,请提供 SQL 语句和相关表的 Schema 信息;如果节点日志存在关键报错,请提供相关节点的日志内容或文件;如果一些业务敏感信息不便提供,请留下联系方式,我们与您私下沟通。

版本?

tidb5.2.2
Spark version 2.4.3

Material_creative_count 是个view,还是一个table?

table 11月5号前后的操作 。

26514 yixintui_operate material_creative_count truncate table public 1382 26509 0 2021-11-10 16:10:54 2021-11-10 16:10:54 synced truncate table Material_creative_count
26510 yixintui_operate material_creative_count truncate table public 1382 26506 0 2021-11-10 15:10:54 2021-11-10 15:10:54 synced truncate table Material_creative_count
26507 yixintui_operate material_creative_count truncate table public 1382 26504 0 2021-11-10 14:10:54 2021-11-10 14:10:54 synced truncate table Material_creative_count
26505 yixintui_operate material_creative_count truncate table public 1382 26502 0 2021-11-10 13:10:54 2021-11-10 13:10:54 synced truncate table Material_creative_count
26503 yixintui_operate material_creative_count truncate table public 1382 23785 0 2021-11-10 12:10:54 2021-11-10 12:10:54 synced truncate table Material_creative_count
26038 yixintui_operate material_creative_count modify column public 1382 23785 0 2021-11-05 17:47:43 2021-11-05 17:47:43 synced alter table Material_creative_count MODIFY Id BIGINT auto_random COMMENT “鎶ヨ〃ID”
26037 yixintui_operate material_creative_count modify column public 1382 23785 0 2021-11-05 17:47:15 2021-11-05 17:47:15 synced alter table Material_creative_count MODIFY id BIGINT auto_random COMMENT “鎶ヨ〃ID”
26036 yixintui_operate material_creative_count modify column public 1382 23785 0 2021-11-05 17:46:14 2021-11-05 17:46:15 synced alter table Material_creative_count MODIFY id BIGINT auto_random COMMENT “鎶ヨ〃ID”
23786 yixintui_operate material_creative_count create table public 1382 23785 0 2021-10-19 16:48:37 2021-10-19 16:48:37 synced CREATE TABLE Material_creative_count (
Id bigint(20) NOT NULL AUTO_INCREMENT COMMENT ‘鎶ヨ〃ID’,
Platform_Type int(11) DEFAULT NULL COMMENT ‘濯掍綋骞冲彴’,
Agent_Material_Id varchar(100) DEFAULT NULL COMMENT ‘濯掍綋骞冲彴绱犳潗ID’,
creative_count int(11) DEFAULT NULL COMMENT '鍒涙剰鏁?,
PRIMARY KEY (Id) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=2068868805 COMMENT='绱犳潗-鍒涙剰鏁?
23784 yixintui_operate material_creative_count drop table queueing 1382 23782 0 2021-10-19 16:48:04 2021-10-19 16:48:05 synced DROP TABLE yixintui_operate.Material_creative_count;
23783 yixintui_operate material_creative_count create table public 1382 23782 0 2021-10-19 16:28:11 2021-10-19 16:28:11 synced CREATE TABLE Material_creative_count (
Platform_Type int(11) DEFAULT NULL COMMENT ‘濯掍綋骞冲彴’,
Agent_Material_Id varchar(100) DEFAULT NULL COMMENT ‘濯掍綋骞冲彴绱犳潗ID’,
creative_count int(11) DEFAULT NULL COMMENT '鍒涙剰鏁?
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=2068868805 COMMENT='绱犳潗-鍒涙剰鏁?

你试一下,开一个tidb,然后在新的tidb上能不能操作?

tidb 操作不影响 。
tispark 不能 操作 yixintui_operate 这个库 。

看看能不能找到包含 29909,29944 和 37467 这几个 table ID 的 DDL history 呢?

这个怎么找 每个table_id 的history ?

https://github.com/pingcap/tidb/blob/master/docs/tidb_http_api.md

curl http://{TiDBIP}:10080/ddl/history

tispark某个库里任何操作都报java.lang.IllegalArgumentException: Multiple entries with same key: 这里有ddl.txt 下载
每一次ddl 之后 table_id 都会变大 。老的数据应该从系统视图information_schema.tables 删除的, 没删干净 。

select * from information_schema.DDL_JOBS where table_id=29909 limit 1000;
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME | END_TIME | STATE | QUERY |
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| 29934 | yixintui_operate | material_creative_count | truncate table | public | 1382 | 29909 | 0 | 2021-11-30 12:11:12 | 2021-11-30 12:11:12 | synced | truncate table Material_creative_count |

(user:tidbdba time: 14:53)[db: yixintui_operate] select * from information_schema.DDL_JOBS where table_id=29944 limit 10000;
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME | END_TIME | STATE | QUERY |
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| 29949 | yixintui_operate | material_creative_count | truncate table | public | 1382 | 29944 | 0 | 2021-11-30 17:11:10 | 2021-11-30 17:11:11 | synced | truncate table Material_creative_count |

(user:tidbdba time: 14:53)[db: yixintui_operate] select * from information_schema.DDL_JOBS where table_id=37467 limit 10000;
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME | END_TIME | STATE | QUERY |
±-------±-----------------±------------------------±---------------±-------------±----------±---------±----------±--------------------±--------------------±-------±---------------------------------------+
| 37494 | yixintui_operate | material_creative_count | truncate table | public | 1382 | 37467 | 0 | 2021-12-15 11:11:18 | 2021-12-15 11:11:18 | synced | truncate table Material_creative_count |

TiDB DDL owner 的日志里面有这几个 ID 相关的报错信息吗?

查看 DDL owner 的节点:

admin show ddl;

怎么看DDL owner 的节点 还是随便看其中一个tidb
{
“id”: 29910,
“type”: 11,
“schema_id”: 1382,
“table_id”: 29902,
“schema_name”: “yixintui_operate”,
“state”: 6,
“err”: null,
“err_count”: 0,
“row_count”: 0,
“raw_args”: [
“dIAAAAAAAHTO”,
null
],
“schema_state”: 5,
“snapshot_ver”: 0,
“real_start_ts”: 0,
“start_ts”: 429455273759604927,
“dependency_id”: 0,
“query”: “truncate table Material_creative_count”,
“binlog”: {
“SchemaVersion”: 19060,
“DBInfo”: null,
“TableInfo”: {
"id": 29909,
“name”: {
“O”: “Material_creative_count”,
“L”: “material_creative_count”
},
“charset”: “utf8mb4”,
“collate”: “utf8mb4_bin”,
“cols”: [
{
“id”: 1,
“name”: {
“O”: “Id”,
“L”: “id”
},
“offset”: 0,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 8,
“Flag”: 4099,
“Flen”: 20,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “报表ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 2,
“name”: {
“O”: “Platform_Type”,
“L”: “platform_type”
},
“offset”: 1,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 3,
“name”: {
“O”: “Agent_Material_Id”,
“L”: “agent_material_id”
},
“offset”: 2,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 15,
“Flag”: 0,
“Flen”: 100,
“Decimal”: 0,
“Charset”: “utf8mb4”,
“Collate”: “utf8mb4_bin”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台素材ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 4,
“name”: {
“O”: “creative_count”,
“L”: “creative_count”
},
“offset”: 3,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
“Elems”: null
},
“state”: 5,
“comment”: “创意数”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
}
],
“index_info”: null,
“constraint_info”: null,
“fk_info”: null,
“state”: 5,
“pk_is_handle”: true,
“is_common_handle”: false,
“common_handle_version”: 0,
“comment”: “素材-创意数”,
“auto_inc_id”: 2068868805,
“auto_id_cache”: 0,
“auto_rand_id”: 0,
“max_col_id”: 4,
“max_idx_id”: 0,
“max_cst_id”: 0,
“update_timestamp”: 428895283012698449,
“ShardRowIDBits”: 0,
“max_shard_row_id_bits”: 0,
“auto_random_bits”: 5,
“pre_split_regions”: 0,
“partition”: null,
“compression”: “”,
“view”: null,
“sequence”: null,
“Lock”: null,
“version”: 4,
“tiflash_replica”: null,
“is_columnar”: false,
“temp_table_type”: 0
},
“FinishedTS”: 429455273798402189
},
“version”: 1,
“reorg_meta”: null,
“priority”: 0
},

ADMIN SHOW DDL; 下面的 OWNER_ADDRESS 就是 DDL owner 的地址。看那台 TiDB 上的日志,可以先按表名搜索一下错误或警告;如果没有,再搜索 table id 为 29944 的日志。

17.87 日志只有 最近一周的 29909 29944 都是 11-30号产生的 。
image

先再确定一下是不是数据出了问题:

curl http://{TiDBIP}:10080/schema?table_id=29909
curl http://{TiDBIP}:10080/schema?table_id=29944
curl http://{TiDBIP}:10080/schema?table_id=37467

看看这几个结果。

有什么接口 可以删除这两个 table_id 29909 29944 吗?
[root@yzdmp006068 ~]# curl http://xx.xx.17.68:10080/schema?table_id=29909
{
“id”: 29909,
“name”: {
“O”: “Material_creative_count”,
“L”: “material_creative_count”
},
“charset”: “utf8mb4”,
“collate”: “utf8mb4_bin”,
“cols”: [
{
“id”: 1,
“name”: {
“O”: “Id”,
“L”: “id”
},
“offset”: 0,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 8,
“Flag”: 4099,
“Flen”: 20,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “报表ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 2,
“name”: {
“O”: “Platform_Type”,
“L”: “platform_type”
},
“offset”: 1,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 3,
“name”: {
“O”: “Agent_Material_Id”,
“L”: “agent_material_id”
},
“offset”: 2,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 15,
“Flag”: 0,
“Flen”: 100,
“Decimal”: 0,
“Charset”: “utf8mb4”,
“Collate”: “utf8mb4_bin”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台素材ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 4,
“name”: {
“O”: “creative_count”,
“L”: “creative_count”
},
“offset”: 3,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “创意数”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
}
],
“index_info”: null,
“constraint_info”: null,
“fk_info”: null,
“state”: 5,
“pk_is_handle”: true,
“is_common_handle”: false,
“common_handle_version”: 0,
“comment”: “素材-创意数”,
“auto_inc_id”: 2068868805,
“auto_id_cache”: 0,
“auto_rand_id”: 0,
“max_col_id”: 4,
“max_idx_id”: 0,
“max_cst_id”: 0,
“update_timestamp”: 428895283012698449,
“ShardRowIDBits”: 0,
“max_shard_row_id_bits”: 0,
“auto_random_bits”: 5,
“pre_split_regions”: 0,
“partition”: null,
“compression”: “”,
“view”: null,
“sequence”: null,
“Lock”: null,
“version”: 4,
“tiflash_replica”: null,
“is_columnar”: false,
“temp_table_type”: 0
}

[root@yzdmp006068 ~]# curl http://xx.xx.17.68:10080/schema?table_id=29944
{
“id”: 29944,
“name”: {
“O”: “Material_creative_count”,
“L”: “material_creative_count”
},
“charset”: “utf8mb4”,
“collate”: “utf8mb4_bin”,
“cols”: [
{
“id”: 1,
“name”: {
“O”: “Id”,
“L”: “id”
},
“offset”: 0,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 8,
“Flag”: 4099,
“Flen”: 20,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “报表ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 2,
“name”: {
“O”: “Platform_Type”,
“L”: “platform_type”
},
“offset”: 1,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 3,
“name”: {
“O”: “Agent_Material_Id”,
“L”: “agent_material_id”
},
“offset”: 2,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 15,
“Flag”: 0,
“Flen”: 100,
“Decimal”: 0,
“Charset”: “utf8mb4”,
“Collate”: “utf8mb4_bin”,
“Elems”: null
},
“state”: 5,
“comment”: “媒体平台素材ID”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
},
{
“id”: 4,
“name”: {
“O”: “creative_count”,
“L”: “creative_count”
},
“offset”: 3,
“origin_default”: null,
“origin_default_bit”: null,
“default”: null,
“default_bit”: null,
“default_is_expr”: false,
“generated_expr_string”: “”,
“generated_stored”: false,
“dependences”: null,
“type”: {
“Tp”: 3,
“Flag”: 0,
“Flen”: 11,
“Decimal”: 0,
“Charset”: “binary”,
“Collate”: “binary”,
“Elems”: null
},
“state”: 5,
“comment”: “创意数”,
“hidden”: false,
“change_state_info”: null,
“version”: 2
}
],
“index_info”: null,
“constraint_info”: null,
“fk_info”: null,
“state”: 5,
“pk_is_handle”: true,
“is_common_handle”: false,
“common_handle_version”: 0,
“comment”: “素材-创意数”,
“auto_inc_id”: 2068868805,
“auto_id_cache”: 0,
“auto_rand_id”: 0,
“max_col_id”: 4,
“max_idx_id”: 0,
“max_cst_id”: 0,
“update_timestamp”: 428895283012698449,
“ShardRowIDBits”: 0,
“max_shard_row_id_bits”: 0,
“auto_random_bits”: 5,
“pre_split_regions”: 0,
“partition”: null,
“compression”: “”,
“view”: null,
“sequence”: null,
“Lock”: null,
“version”: 4,
“tiflash_replica”: null,
“is_columnar”: false,
“temp_table_type”: 0
}
#这个rename 之后 已经清理掉了
[root@yzdmp006068 ~]# curl http://xx.xx.17.68:10080/schema?table_id=37467
[schema:1146]Table which ID = 37467 does not exist.

清理 垃圾元数据信息的步骤:
rename table — drop table – 重启当前session tidb

这个问题是这样的:

  • 由于某些原因/bug(待查),TiDB 在持久化的元信息(保存在 TiKV)里面出现了重复定义的表结构,也就是表名相同,表的 ID 不同。
  • TiDB 加载表结构的时候,由于假定所有的表都是唯一的,如果出现重复的表结构,只会 load 其中的一张,比如 37xxx 那一张。但是 information_schema 中查看的时候,会把所有的重复表都显示出来,TiSpark 也是一样会检查
  • TiDB drop table 用的是自己缓存的元信息,所以 drop table 只能删除掉 37xxx 的表,继续 drop 就会报表结构不存在
  • TiDB 每次重启都会重新 load TiKV 中的元信息,所以不断的重启 / drop 可以依次把重复的表定义全部删除干净