大小表关联时,大表有索引,结果走了小表的全表扫描

liusf1993 · 2021 年3 月 29 日 04:34

为提高效率，请提供以下信息，问题描述清晰能够更快得到解决：

【TiDB 版本】
v4.0.11
【问题描述】
有两张表 t_wrong_item(数据量2亿),t_question_knowledge(100万),
两表关联查询,关联字段和查询字段都有索引

SELECT tqk.knowledge_id
FROM t_wrong_item twi, t_question_knowledge tqk
WHERE
      twi.student_id = 110655225744
  and TWI.question_id = tqk.question_id;

执行计划如下,小表走了全表扫描,耗时一秒多

Projection_7,1386490.00,16,root,"","time:1.09s, loops:2, Concurrency:4",exam.t_question_knowledge.knowledge_id,9.95 KB,N/A
└─MergeJoin_8,1386490.00,16,root,"","time:1.09s, loops:2","inner join, left key:exam.t_question_knowledge.question_id, right key:exam.t_wrong_item.question_id",83.6 KB,0 Bytes
  ├─IndexReader_24(Build),1974026.70,8,root,"","time:1.12ms, loops:2, cop_task: {num: 1, max: 1.09ms, proc_keys: 8, rpc_num: 1, rpc_time: 1.08ms, copr_cache: disabled}",index:IndexRangeScan_23,567 Bytes,N/A
  │ └─IndexRangeScan_23,1974026.70,8,cop[tikv],"table:twi, index:logic(student_id, question_id)","tikv_task:{time:0s, loops:1}","range:[110655225744,110655225744], keep order:true",N/A,N/A
  └─IndexReader_22(Probe),1109192.00,1109198,root,"","time:853.1ms, loops:1086, cop_task: {num: 2, max: 852.1ms, min: 439.4ms, avg: 645.8ms, p95: 852.1ms, max_proc_keys: 737355, p95_proc_keys: 737355, tot_proc: 984ms, tot_wait: 2ms, rpc_num: 2, rpc_time: 1.29s, copr_cache: disabled}",index:IndexFullScan_21,67.7 MB,N/A
    └─IndexFullScan_21,1109192.00,1109198,cop[tikv],"table:tqk, index:logic(question_id, knowledge_id)","tikv_task:{proc max:473ms, min:251ms, p80:473ms, p95:473ms, iters:1093, tasks:2}",keep order:true,N/A,N/A

手工指定执行计划后,执行计划如下,耗时10ms左右

SELECT  /*+INL_JOIN(tqk)*/tqk.knowledge_id
FROM t_wrong_item twi, t_question_knowledge tqk
WHERE
      twi.student_id = 110655225744
  and TWI.question_id = tqk.question_id;

执行计划如下:

IndexJoin_10,1386490.00,16,root,"","time:1.5ms, loops:2, inner:{total:1.03ms, concurrency:4, task:1, construct:26.4µs, fetch:998µs, build:8.84µs}, probe:8.82µs","inner join, inner:IndexReader_9, outer key:exam.t_wrong_item.question_id, inner key:exam.t_question_knowledge.question_id, equal cond:eq(exam.t_wrong_item.question_id, exam.t_question_knowledge.question_id)",45.6 KB,N/A
├─IndexReader_14(Build),1974026.70,8,root,"","time:465.1µs, loops:3, cop_task: {num: 1, max: 483.6µs, proc_keys: 8, rpc_num: 1, rpc_time: 472.3µs, copr_cache: disabled}",index:IndexRangeScan_13,563 Bytes,N/A
│ └─IndexRangeScan_13,1974026.70,8,cop[tikv],"table:twi, index:logic(student_id, question_id)","tikv_task:{time:0s, loops:1}","range:[110655225744,110655225744], keep order:false",N/A,N/A
└─IndexReader_9(Probe),0.70,16,root,"","time:927.8µs, loops:2, cop_task: {num: 1, max: 850.7µs, proc_keys: 16, tot_proc: 1ms, rpc_num: 1, rpc_time: 839.3µs, copr_cache: disabled}",index:IndexRangeScan_8,1.31 KB,N/A
  └─IndexRangeScan_8,0.70,16,cop[tikv],"table:tqk, index:logic(question_id, knowledge_id)","tikv_task:{time:1ms, loops:1}","range: decided by [eq(exam.t_question_knowledge.question_id, exam.t_wrong_item.question_id)], keep order:false",N/A,N/A

建表语句如下(删除了不必要的字段)

create table t_question_knowledge
(
    question_knowledge_id bigint auto_increment primary key,
    question_id           varchar(64) not null,
    knowledge_id          varchar(64) not null,
    constraint logic unique (question_id, knowledge_id)
);

create index t_question_knowledge_knowledge_id_index on abc.t_question_knowledge (knowledge_id);

create table t_wrong_item
(
    wrong_item_id bigint auto_increment primary key,
    student_id    bigint      not null,
    question_id   varchar(64) not null,
    constraint logic unique (student_id, question_id)
);

stats_healthy是正常的,一个98,一个96;

若提问为性能优化、故障排查类问题，请下载脚本运行。终端输出的打印结果，请务必全选并复制粘贴上传。

yilong · 2021 年3 月 29 日 06:44

可以看到时间较长的使用了 merge join. 具体可以参考文档
https://docs.pingcap.com/zh/tidb/stable/explain-joins#merge-join
执行计划 twi.student_id=110655225744 的行数估算不太准确，实际是8行，plan估计成了1974026行。相同的sql，如果是其他student_id value是否也会耗时很长？
请反馈 explain select * from t_wrong_item twi where twi.student_id = 110655225744; 看看估计的行数是多少
能否反馈下真实的表结构和导出统计信息，多谢。
https://docs.pingcap.com/zh/tidb/stable/statistics#统计信息的导入导出

liusf1993 · 2021 年3 月 29 日 08:33

student_id的区分度是比较大的,同一个student_id下的数据一般应该在1000以内,
explain select * from t_wrong_item twi where twi.student_id = 2;的估算结果为1974523

id	estRows	task	access object	operator info
IndexLookUp_10	1974523.39	root	""	""
├─IndexRangeScan_8(Build)	1974523.39	cop[tikv]	table:twi, index:logic(student_id, question_id)	range:[2,2], keep order:false
└─TableRowIDScan_9(Probe)	1974523.39	cop[tikv]	table:twi	keep order:false

上面sql和查询的字段都是真实字段,只是将一些业务字段省略掉了.
导出的统计信息为
t_wrong_item (641.9 KB)

yilong · 2021 年3 月 29 日 10:08

可以让重新收集一下统计信息。统计信息的 modify 行数很多，收集全表统计信息应该可以解决这个问题。（注意业务低峰期操作）
分析原因：查询条件 ID = 2 和 ID = 110655225744 都在直方图边界范围外，且统计信息的 modify 行数很多，因此用的默认的相关性0.01.

dbsid · 2021 年4 月 2 日 15:35

提交了以下feature request. 对于现实中的大表，一直保持stats_healthy 100%不现实, 总是会存在过滤值越界的情况。如果可以像Oracle一样用linear decay的方式缓慢降低最高值附近越界时的selectivity，这种数量级差异的错误估算和错误的执行计划就可以避免。

https://github.com/pingcap/tidb/issues/23841

dbsid · 2021 年4 月 5 日 10:01

现有统计信息t_wrong_item.student_id上面唯一值有多少？

liusf1993 · 2021 年4 月 6 日 12:55

student_id唯一值目前是100多万

dbsid · 2021 年4 月 6 日 13:00

我看了源代码，越界情况只有唯一值少于100个的情况下才会用到0.01这个默认selectivity. t_wrong_item的统计信息麻烦重新上传一下，之前的我看不了。

outOfRangeBetweenRate=100

func outOfRangeEQSelectivity(ndv, modifyRows, totalRows int64) float64 {
	if modifyRows == 0 {
		return 0 // it must be 0 since the histogram contains the whole data
	}
	if ndv < outOfRangeBetweenRate {
		ndv = outOfRangeBetweenRate // avoid inaccurate selectivity caused by small NDV
	}
	selectivity := 1 / float64(ndv) // TODO: After extracting TopN from histograms, we can minus the TopN fraction here.
	if selectivity*float64(totalRows) > float64(modifyRows) {
		selectivity = float64(modifyRows) / float64(totalRows)
	}
	return selectivity
}

liusf1993 · 2021 年4 月 7 日 04:34

t_wrong_item (641.9 KB)

dbsid · 2021 年4 月 7 日 05:32

表上的列统计信息都是空的。

 "database_name": "exam",
 "table_name": "t_wrong_item",
 "columns": {
  "course_id": {
   "histogram": {
    "ndv": 0
   },
   "cm_sketch": null,
   "null_count": 0,
   "tot_col_size": 4679323,
   " ": 423198647158833153,
   "correlation": 0
  },

索引有统计信息，NDV、CM-Sketch和Histogram都有，索引上面的last_update_version比列上的高 423210776889393153 - 423198647158833153 = 12129730560000

  "indices": {
  "examitemididx": {
   "histogram": {
    "ndv": 192723898,
    "buckets": [
     {
      "count": 4202414,
      "lower_bound": "A4AAAAAAAAAA",
      "upper_bound": "A4AAAAADqq/N",
      "repeats": 1
     },

    ...
           94264
      ]
     }
    ],
    "default_value": 1
   },
   "null_count": 0,
   "tot_col_size": 0,
   "last_update_version": 423210776889393153,
   "correlation": 0
  },

因为student_id列上没有统计信息，ndv=0, 把selectivity使用了1/100

问题的根源在于，是什么情况下，表上的列没有更新统计信息，而索引上有。一个猜想是搜集统计信息的时候，因为是大表数据量多，导致中途退出了，所以列上的统计信息一直没有更新。
因为这个表上的count值是更新的，所以从这种情况下通过stats_healthy并不能发现问题。
这个表的统计信息是自动更新的吗？还是手动收集的？有没有设置tidb_auto_analyze_start_time、tidb_auto_analyze_end_time。

dbsid · 2021 年4 月 7 日 08:10

麻烦跑下：SHOW ANALYZE STATUS;

看看有没有关于表t_wrong_item的记录，没有的话下一次收集该表的统计信息之后跑一下，想看看针对analyze columns的任务状态。这是一个正常的例子。

MySQL [tpcc]> SHOW ANALYZE STATUS;
+--------------+------------+----------------+---------------------------------+----------------+---------------------+----------+
| Table_schema | Table_name | Partition_name | Job_info                        | Processed_rows | Start_time          | State    |
+--------------+------------+----------------+---------------------------------+----------------+---------------------+----------+
| tpcc         | stock      |                | auto analyze columns            |        1000000 | 2021-03-21 02:40:37 | finished |
| tpcc         | stock      |                | auto analyze index PRIMARY      |        1000000 | 2021-03-21 02:40:53 | finished |

liusf1993 · 2021 年4 月 7 日 08:56

tidb_auto_analyze_start_time、tidb_auto_analyze_end_time这两个参数没有手动设置过,应该是用的默认值.

mysql> show variables like '%tidb_auto_analyze_start_time%';
+------------------------------+-------------+
| Variable_name                | Value       |
+------------------------------+-------------+
| tidb_auto_analyze_start_time | 00:00 +0000 |
+------------------------------+-------------+
1 row in set (0.37 sec)

mysql> show variables like '%tidb_auto_analyze_end_time%';
+----------------------------+-------------+
| Variable_name              | Value       |
+----------------------------+-------------+
| tidb_auto_analyze_end_time | 23:59 +0000 |
+----------------------------+-------------+
1 row in set (0.00 sec)

表的统计信息大多应该都是自动更新的,但中间出现过有些sql没有使用合适的执行计划,手动执行过analyze table xxx,但不太确定是否执行过t_wrong_item这张表;请问是否能通过某些关键字搜索到对应的日志信息?
SHOW ANALYZE STATUS 没有 t_wrong_item的信息,请问这个信息默认保存时间是多久?
重新执行analyze table t_wrong_item后可以看到成功信息

| exam         | t_wrong_item               |                | analyze index exercise_result_item_id |      199434926 | 2021-04-07 16:25:22 | finished |
| exam         | t_wrong_item               |                | analyze index examItemIdIdx           |      199434926 | 2021-04-07 16:25:22 | finished |
| exam         | t_wrong_item               |                | analyze index logic                   |      199436398 | 2021-04-07 16:25:22 | finished |
| exam         | t_wrong_item               |                | analyze index wrong_item_paper_id     |      199436574 | 2021-04-07 16:29:30 | finished |
| exam         | t_wrong_item               |                | analyze columns                       |      199436574 | 2021-04-07 16:25:22 | finished |

执行完成后,再次执行原先的sql,执行计划正常了

id	estRows	actRows	task	access object	execution info	operator info	memory	disk
IndexHashJoin_13	3.54	16	root	""	time:2.66ms, loops:2, inner:{total:1.51ms, concurrency:4, task:1, construct:30.4µs, fetch:1.45ms, build:19.3µs, join:18.9µs}	inner join, inner:IndexReader_10, outer key:exam.t_wrong_item.question_id, inner key:exam.t_question_knowledge.question_id, equal cond:eq(exam.t_wrong_item.question_id, exam.t_question_knowledge.question_id)	45.6 KB	N/A
├─IndexReader_26(Build)	2.00	8	root	""	time:1.15ms, loops:3, cop_task: {num: 1, max: 1.15ms, proc_keys: 8, rpc_num: 1, rpc_time: 1.14ms, copr_cache: disabled}	index:IndexRangeScan_25	567 Bytes	N/A
│ └─IndexRangeScan_25	2.00	8	cop[tikv]	table:twi, index:logic(student_id, question_id)	tikv_task:{time:0s, loops:1}	range:[110655225744,110655225744], keep order:false	N/A	N/A
└─IndexReader_10(Probe)	1.77	16	root	""	time:1.39ms, loops:2, cop_task: {num: 1, max: 1.31ms, proc_keys: 16, rpc_num: 1, rpc_time: 1.31ms, copr_cache: disabled}	index:IndexRangeScan_9	1.31 KB	N/A
  └─IndexRangeScan_9	1.77	16	cop[tikv]	table:tqk, index:logic(question_id, knowledge_id)	tikv_task:{time:0s, loops:1}	range: decided by [eq(exam.t_question_knowledge.question_id, exam.t_wrong_item.question_id)], keep order:false	N/A	N/A

另外补充下: 前天将tidb的版本升级到了v4.0.12

dbsid · 2021 年4 月 7 日 09:38

你可以在tidb.log搜索一下analyze table，查看一下这个表的统计信息更新历史，看看有什么线索。

yilong · 2021 年4 月 8 日 04:12

辛苦按照上面的方法试试看能不能在 tidb.log 里找到信息，如果没有的话，看起来应该已经重新手工收集过了且升级了，可以观察一段时间，多谢。

liusf1993 · 2021 年4 月 8 日 04:55

只能找到最近手动执行的记录了,之前可能因为时间太久,被清掉了.

system · 2022 年10 月 31 日 19:15

此话题已在最后回复的 1 分钟后被自动关闭。不再允许新回复。