这个sql还有优化空间吗

explain SELECT
company.NAME AS NAME,
company.reg_province AS province,
count( ‘’ ) AS count
FROM
standard_drafting_unit_relation
INNER JOIN company ON standard_drafting_unit_relation.drafting_unit_id = company.id
INNER JOIN standard ON standard_drafting_unit_relation.standard_id = standard.id
INNER JOIN ( SELECT company_industry_relation.company_id AS company_id FROM company_industry_relation WHERE company_industry_relation.industry_id IN ( ‘INB0101’ ) GROUP BY company_industry_relation.company_id ) AS a ON a.company_id = company.id
INNER JOIN (
SELECT
patent_proposer_relation.proposer_id AS company_id
FROM
patent_proposer_relation
INNER JOIN patent ON patent_proposer_relation.patent_id = patent.id
WHERE
patent.approval_date IS NOT NULL
AND patent.type = ‘发明’
GROUP BY
patent_proposer_relation.proposer_id
) AS b ON b.company_id = company.id
WHERE
standard.category = ‘国家标准’
AND company.display = 1
AND company.status_std IN ( ‘存续(在营、开业、在业)’ )
AND company.establish_date <= ‘2023-12-24’
AND company.reg_province IN ( ‘河北省’ )
GROUP BY
company.NAME,
company.reg_province
ORDER BY
count( '
’ ) DESC
LIMIT 30;

把执行计划贴文本里,operator info弄全

建议执行explain analyze,获取最真实的执行计划。

楼主这是什么业务,这么长个SQL不好好优化一下。

+1 :joy_cat:

explain analyze和表结构建议贴出来看看。一般检查下统计信息,索引,join的方式。

INNER JOIN ( 
	SELECT 
		company_industry_relation.company_id AS company_id 
	FROM company_industry_relation 
	WHERE company_industry_relation.industry_id IN ('INB0101') 
	GROUP BY company_industry_relation.company_id 
) AS a ON a.company_id = company.id

这一段放到where条件改成这样试一下
AND company.id IN (SELECT company_id FROM company_industry_relation WHERE industry_id IN ('INB0101'))

这一段也一样处理

INNER JOIN (
	SELECT patent_proposer_relation.proposer_id AS company_id FROM patent_proposer_relation
	INNER JOIN patent ON patent_proposer_relation.patent_id = patent.id
	WHERE patent.approval_date IS NOT NULL
	AND patent.type = '发明'
	GROUP BY patent_proposer_relation.proposer_id
) AS b ON b.company_id = company.id

放一下格式化之后的sql,不格式化,看起来太难受了~强迫症表示受不了~

SELECT
	company.NAME AS NAME,
	company.reg_province AS province,
	count('') AS count
FROM standard_drafting_unit_relation
INNER JOIN company ON standard_drafting_unit_relation.drafting_unit_id = company.id
INNER JOIN standard ON standard_drafting_unit_relation.standard_id = standard.id
INNER JOIN ( 
	SELECT company_id FROM company_industry_relation 
	WHERE industry_id IN ('INB0101') 
	GROUP BY company_id 
) AS a ON a.company_id = company.id
INNER JOIN (
	SELECT patent_proposer_relation.proposer_id AS company_id FROM patent_proposer_relation
	INNER JOIN patent ON patent_proposer_relation.patent_id = patent.id
	WHERE patent.approval_date IS NOT NULL
	AND patent.type = '发明'
	GROUP BY patent_proposer_relation.proposer_id
) AS b ON b.company_id = company.id
WHERE standard.category = '国家标准'
AND company.display = 1
AND company.status_std IN ('存续(在营、开业、在业)')
AND company.establish_date <= '2023-12-24'
AND company.reg_province IN ('河北省')
GROUP BY company.NAME,company.reg_province
ORDER BY count('') DESC
LIMIT 30;

原来就是写的子查询,也很慢,然后听取别人的意见写的这样关联

感觉不是所有的表都有tiflash副本。都添加了tiflash副本,然后尝试mpp执行看看。
这类group by的提升会比较大。

贴一下建表语句看一下。如果子表数据量比较小,in的写法应该是快的。

这sql是国企的吗? :joy:需要看看 standard_drafting_unit_relation的表结构,这个地方竟然使用到了tiflash

感觉执行计划还可以

写出这个语句的也是神人,为什么搞这么多子查询

要想优化首先得改写sql,join为什么要用那么多子查询我也是看不明白

SELECT company.NAME AS NAME,
         company.reg_province AS province,
         count('') AS count
FROM standard_drafting_unit_relation
INNER JOIN company
    ON standard_drafting_unit_relation.drafting_unit_id = company.id
INNER JOIN standard
    ON standard_drafting_unit_relation.standard_id = standard.id
INNER JOIN company_industry_relation a
    ON a.company_id = company.id
        AND industry_id IN ('INB0101')
INNER JOIN patent_proposer_relation
    ON company.id=patent_proposer_relation.proposer_id
INNER JOIN patent
    ON patent_proposer_relation.patent_id = patent.id
        AND patent.approval_date IS NOT NULL
        AND patent.type = '发明'
WHERE standard.category = '国家标准'
        AND company.display = 1
        AND company.status_std IN ('存续(在营、开业、在业)')
        AND company.establish_date <= '2023-12-24'
        AND company.reg_province IN ('河北省')
GROUP BY  company.NAME,company.reg_province
ORDER BY  count('') DESC LIMIT 30;

子查询去掉是不是会好一些