查询INFORMATION_SCHEMA.COLUMNS 单表元数据慢

【 TiDB 使用环境】生产环境
【 TiDB 版本】v7.5.1
【复现路径】做过哪些操作出现的问题
【遇到的问题:问题现象及影响】
【资源配置】进入到 TiDB Dashboard -集群信息 (Cluster Info) -主机(Hosts) 截图此页面
【附件:截图/日志/监控】

SELECT
TABLE_SCHEMA,
NULL,
TABLE_NAME,
COLUMN_NAME,
CASE

	WHEN UPPER( DATA_TYPE )= 'DECIMAL' THEN
	3 
	WHEN UPPER( DATA_TYPE )= 'DECIMAL UNSIGNED' THEN
	3 
	WHEN UPPER( DATA_TYPE )= 'TINYINT' THEN
CASE
		
		WHEN LOCATE( '(1)', COLUMN_TYPE ) != 0 THEN
		- 7 ELSE - 6 
	END 
		WHEN UPPER( DATA_TYPE )= 'TINYINT UNSIGNED' THEN
	CASE
			
			WHEN LOCATE( '(1)', COLUMN_TYPE ) != 0 THEN
			- 7 ELSE - 6 
		END 
			WHEN UPPER( DATA_TYPE )= 'BOOLEAN' THEN
			16 
			WHEN UPPER( DATA_TYPE )= 'SMALLINT' THEN
			5 
			WHEN UPPER( DATA_TYPE )= 'SMALLINT UNSIGNED' THEN
			5 
			WHEN UPPER( DATA_TYPE )= 'INT' THEN
			4 
			WHEN UPPER( DATA_TYPE )= 'INT UNSIGNED' THEN
			4 
			WHEN UPPER( DATA_TYPE )= 'FLOAT' THEN
			7 
			WHEN UPPER( DATA_TYPE )= 'FLOAT UNSIGNED' THEN
			7 
			WHEN UPPER( DATA_TYPE )= 'DOUBLE' THEN
			8 
			WHEN UPPER( DATA_TYPE )= 'DOUBLE UNSIGNED' THEN
			8 
			WHEN UPPER( DATA_TYPE )= 'NULL' THEN
			0 
			WHEN UPPER( DATA_TYPE )= 'TIMESTAMP' THEN
			93 
			WHEN UPPER( DATA_TYPE )= 'BIGINT' THEN
			- 5 
			WHEN UPPER( DATA_TYPE )= 'BIGINT UNSIGNED' THEN
			- 5 
			WHEN UPPER( DATA_TYPE )= 'MEDIUMINT' THEN
			4 
			WHEN UPPER( DATA_TYPE )= 'MEDIUMINT UNSIGNED' THEN
			4 
			WHEN UPPER( DATA_TYPE )= 'DATE' THEN
			91 
			WHEN UPPER( DATA_TYPE )= 'TIME' THEN
			92 
			WHEN UPPER( DATA_TYPE )= 'DATETIME' THEN
			93 
			WHEN UPPER( DATA_TYPE )= 'YEAR' THEN
			91 
			WHEN UPPER( DATA_TYPE )= 'VARCHAR' THEN
			12 
			WHEN UPPER( DATA_TYPE )= 'VARBINARY' THEN
			- 3 
			WHEN UPPER( DATA_TYPE )= 'BIT' THEN
			- 7 
			WHEN UPPER( DATA_TYPE )= 'JSON' THEN
			- 1 
			WHEN UPPER( DATA_TYPE )= 'ENUM' THEN
			1 
			WHEN UPPER( DATA_TYPE )= 'SET' THEN
			1 
			WHEN UPPER( DATA_TYPE )= 'TINYBLOB' THEN
			- 3 
			WHEN UPPER( DATA_TYPE )= 'TINYTEXT' THEN
			12 
			WHEN UPPER( DATA_TYPE )= 'MEDIUMBLOB' THEN
			- 4 
			WHEN UPPER( DATA_TYPE )= 'MEDIUMTEXT' THEN
			- 1 
			WHEN UPPER( DATA_TYPE )= 'LONGBLOB' THEN
			- 4 
			WHEN UPPER( DATA_TYPE )= 'LONGTEXT' THEN
			- 1 
			WHEN UPPER( DATA_TYPE )= 'BLOB' THEN
			- 4 
			WHEN UPPER( DATA_TYPE )= 'TEXT' THEN
			- 1 
			WHEN UPPER( DATA_TYPE )= 'CHAR' THEN
			1 
			WHEN UPPER( DATA_TYPE )= 'BINARY' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'GEOMETRY' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'UNKNOWN' THEN
			1111 
			WHEN UPPER( DATA_TYPE )= 'POINT' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'LINESTRING' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'POLYGON' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'MULTIPOINT' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'MULTILINESTRING' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'MULTIPOLYGON' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'GEOMETRYCOLLECTION' THEN
			- 2 
			WHEN UPPER( DATA_TYPE )= 'GEOMCOLLECTION' THEN
			- 2 ELSE 1111 
		END AS DATA_TYPE,
		UPPER(
		CASE
				
				WHEN LOCATE(
					'UNSIGNED',
				UPPER( COLUMN_TYPE )) != 0 
				AND LOCATE(
					'UNSIGNED',
				UPPER( DATA_TYPE )) = 0 
				AND LOCATE(
					'SET',
				UPPER( DATA_TYPE )) <> 1 
				AND LOCATE(
					'ENUM',
					UPPER( DATA_TYPE )) <> 1 THEN
					CONCAT( DATA_TYPE, ' UNSIGNED' ) 
					WHEN UPPER( DATA_TYPE )= 'TINYINT' THEN
				CASE
						
						WHEN LOCATE( '(1)', COLUMN_TYPE ) != 0 THEN
						'BIT' 
						WHEN LOCATE(
							'UNSIGNED',
						UPPER( COLUMN_TYPE )) != 0 
						AND LOCATE(
							'UNSIGNED',
							UPPER( DATA_TYPE )) = 0 THEN
							'TINYINT UNSIGNED' ELSE DATA_TYPE 
						END 
							WHEN UPPER( DATA_TYPE )= 'POINT' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'LINESTRING' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'POLYGON' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'MULTIPOINT' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'MULTILINESTRING' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'MULTIPOLYGON' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'GEOMETRYCOLLECTION' THEN
							'GEOMETRY' 
							WHEN UPPER( DATA_TYPE )= 'GEOMCOLLECTION' THEN
							'GEOMETRY' ELSE UPPER( DATA_TYPE ) 
						END 
						) AS TYPE_NAME,
						UPPER(
						CASE
								
								WHEN UPPER( DATA_TYPE )= 'DATE' THEN
								10 
								WHEN UPPER( DATA_TYPE )= 'TIME' THEN
								8+ ( CASE WHEN DATETIME_PRECISION > 0 THEN DATETIME_PRECISION + 1 ELSE DATETIME_PRECISION END ) 
								WHEN UPPER( DATA_TYPE )= 'DATETIME' 
								OR UPPER( DATA_TYPE )= 'TIMESTAMP' THEN
									19+ ( CASE WHEN DATETIME_PRECISION > 0 THEN DATETIME_PRECISION + 1 ELSE DATETIME_PRECISION END ) 
									WHEN UPPER( DATA_TYPE )= 'YEAR' THEN
									4 
									WHEN UPPER( DATA_TYPE )= 'TINYINT' 
									AND LOCATE( '(1)', COLUMN_TYPE ) != 0 THEN
										1 
										WHEN UPPER( DATA_TYPE )= 'MEDIUMINT' 
										AND LOCATE(
											'UNSIGNED',
											UPPER( COLUMN_TYPE )) != 0 THEN
											8 
											WHEN UPPER( DATA_TYPE )= 'JSON' THEN
											1073741824 
											WHEN UPPER( DATA_TYPE )= 'GEOMETRY' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'POINT' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'LINESTRING' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'POLYGON' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'MULTIPOINT' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'MULTILINESTRING' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'MULTIPOLYGON' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'GEOMETRYCOLLECTION' THEN
											65535 
											WHEN UPPER( DATA_TYPE )= 'GEOMCOLLECTION' THEN
											65535 
											WHEN CHARACTER_MAXIMUM_LENGTH IS NULL THEN
											NUMERIC_PRECISION 
											WHEN CHARACTER_MAXIMUM_LENGTH > 2147483647 THEN
											2147483647 ELSE CHARACTER_MAXIMUM_LENGTH 
										END 
										) AS COLUMN_SIZE,
										65535 AS BUFFER_LENGTH,
										UPPER(
										CASE
												
												WHEN UPPER( DATA_TYPE )= 'DECIMAL' THEN
												NUMERIC_SCALE 
												WHEN UPPER( DATA_TYPE )= 'FLOAT' 
												OR UPPER( DATA_TYPE )= 'DOUBLE' THEN
												CASE
														
														WHEN NUMERIC_SCALE IS NULL THEN
														0 ELSE NUMERIC_SCALE 
												END ELSE NULL 
												END 
												) AS DECIMAL_DIGITS,
												10 AS NUM_PREC_RADIX,
											CASE
													
													WHEN IS_NULLABLE = 'NO' THEN
													0 ELSE
												CASE
														
														WHEN IS_NULLABLE = 'YES' THEN
														1 ELSE 2 
													END 
													END AS NULLABLE,
													COLUMN_COMMENT AS REMARKS,
													COLUMN_DEFAULT AS COLUMN_DEF,
													0 AS SQL_DATA_TYPE,
													0 AS SQL_DATETIME_SUB,
												CASE
														
														WHEN CHARACTER_OCTET_LENGTH > 2147483647 THEN
														2147483647 ELSE CHARACTER_OCTET_LENGTH 
													END AS CHAR_OCTET_LENGTH,
													ORDINAL_POSITION,
													IS_NULLABLE,
													NULL AS SCOPE_CATALOG,
													NULL AS SCOPE_SCHEMA,
													NULL AS SCOPE_TABLE,
													NULL AS SOURCE_DATA_TYPE,
												IF
													( EXTRA LIKE '%auto_increment%', 'YES', 'NO' ) AS IS_AUTOINCREMENT,
												IF
													( EXTRA LIKE '%GENERATED%', 'YES', 'NO' ) AS IS_GENERATEDCOLUMN 
												FROM
													INFORMATION_SCHEMA.COLUMNS 
												WHERE
													TABLE_SCHEMA = 'JHDL' 
													AND TABLE_NAME LIKE 'JHDL_29_SQLUSER.DHC_TARSUBCATE' 
													AND COLUMN_NAME LIKE '%' 
												ORDER BY
													TABLE_SCHEMA,
												TABLE_NAME,
ORDINAL_POSITION

这个sql需要执行3-5秒,而且执行越多越慢

EXPLAIN ANALYZE 看下实际的执行计划


太慢了

DatabaseMetaData metaData = conn.getMetaData();
tableFieldResultSet = metaData.getColumns(catalog, schema, tableName, “%”);

你集群中的表数量很多吗?columns表count(*)看看

当在TiDB中查询INFORMATION_SCHEMA.COLUMNS单表元数据变慢时,有几个可能的原因需要考虑:

  1. 统计信息不准确:如果统计信息不准确或过期,TiDB可能会做出不够优化的执行计划。你可以尝试使用ANALYZE TABLE语句来更新表的统计信息,以帮助优化执行计划。
  2. 查询优化:检查查询语句是否能够被优化。有时候,复杂的查询或者查询条件没有合适的索引可能会导致查询变慢。
  3. 资源限制:检查系统资源是否受到限制,例如CPU、内存和磁盘I/O。如果资源受限,TiDB可能无法高效地执行查询。
  4. 系统负载:如果系统负载较高,查询性能可能会受到影响。确保系统负载处于可接受范围内,如果可能的话,可以考虑增加硬件资源或者优化查询分布。
  5. TiDB版本:确保你正在使用的TiDB版本是最新的,并且已经应用了任何相关的性能优化补丁或更新。

Tidb在表元数据管理方面确实应该多向oracle学习。可能说得不对,因为没有深入研究为什么查元数据回这样低效。但是从现象来看,确实比较让人confused。我们生产数据库top sql,居然50%是查information_schema的慢查询,这就很有意思了。

TiDB查元数据有时候是很慢,不知道是不是TiDB自身不足的原因


数量不多十几万吧

这个表好像是系统表分析不了

是从6.5.3升级到7.5.1之后这个问题才凸显的,6版本没有发现这个问题,升级到7.5.1后影响到功能使用

可能是版本问题

execution_info里面看下哪一步的耗时比较久

执行计划呢

你可以再搭一个测试环境,光把表结构弄过去,看下系统表还慢不慢。。。



这是慢查询里面的图片

我们测试环境columns表在8w左右,查询需要1-2s

当查询大量获取表结构信息时,这个sql会慢到9s

schema应该变化不大呀,会这么慢呀