升级Tidb v6.4.0后,使用with语句做leftjoin,关联后字段部分为空

【 TiDB 使用环境】测试
【 TiDB 版本】V6.4.0
【遇到的问题:问题现象及影响】
在Tidbv6.3.0时,执行相同的sql,数据都是正确的,但是升级到v6.4.0以后,再次执行相同的sql,部分字段全为空

执行sql如下;

with tmp_base_data_pre as
(
    select 
        t1.tenantsid                  
        ,t1.information_enterprise_id            
        ,t1.template_idstr            
        ,t1.information_name  
        ,t1.information_id             
        ,t1.information_version        
        ,t1.information_items         
        ,t1.template_version  
        ,t1.template_name     
        ,t1.template_type     
        ,t1.id                         
        ,t1.complete                   
        ,t1.version                    
        ,t1.datas                      
        ,t1.create_by                  
        ,t1.create_date                
        ,t1.create_name                
        ,t1.create_program             
        ,t1.last_update_by             
        ,t1.last_update_date           
        ,t1.last_update_name           
        ,t1.last_update_program
    from
    (
        select 
            tenantsid                  
            ,information_name           
            ,information_enterprise_id  
            ,information_id             
            ,information_version        
            ,information_items         
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))      as template_idstr       
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.version'))    as template_version  
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.name'))       as template_name     
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.type'))       as template_type     
            ,id                         
            ,complete                   
            ,version                    
            ,datas                      
            ,create_by                  
            ,create_date                
            ,create_name                
            ,create_program             
            ,last_update_by             
            ,last_update_date           
            ,last_update_name           
            ,last_update_program
        from EVALUATION_KPI_QUESTIONNAIRE_RESULT_JSON_DATA t
        where t.complete='true' or t.complete is null
		
    )t1
    join 
    (
	    select 
            tenantsid                  
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))                    as template_idstr           
            ,information_enterprise_id
            ,substring_index(max(concat(last_update_date,'_',id)),'_',1)          as last_update_date			
            ,substring_index(max(concat(last_update_date,'_',id)),'_',-1)         as id			
	    from EVALUATION_KPI_QUESTIONNAIRE_RESULT_JSON_DATA t
        where t.complete='true' or t.complete is null
	    group by tenantsid,information_enterprise_id,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))
	)t2
    on(t1.tenantsid=t2.tenantsid and t1.template_idstr=t2.template_idstr and t1.information_enterprise_id=t2.information_enterprise_id and t1.last_update_date=t2.last_update_date and t1.id=t2.id)	
), 
tmp_value_list as 
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].isPrimary'))) as information_items_isprimary
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].way'))      ) as information_items_way
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].name'))     ) as information_items_name
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].value'))    ) as information_items_value
    from tmp_base_data_pre t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items))
),
tmp_value_area AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,information_items_value_level
        ,information_items_value_adcode
        ,information_items_value_id
        ,information_items_value_name
		,ROW_NUMBER()over(partition by tenantsid order by template_idstr desc) as test_rank
    from (
	select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].level'))    ) as information_items_value_level
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].adcode'))    ) as information_items_value_adcode
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='地区选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='地区选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
	)t
),
tmp_value_industry AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,information_items_value_level
        ,information_items_value_code
        ,information_items_value_id
        ,information_items_value_name
		,ROW_NUMBER()over(partition by tenantsid order by template_idstr desc) as test_rank
	from (
	select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].level'))    ) as information_items_value_level
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].code'))    ) as information_items_value_code
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='行业选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='行业选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
	)t
),
tmp_value_scale AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='企业规模' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='企业规模' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
),
tmp_indicator_list as 
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].id')))     as indicator_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].name')))   as indicator_name
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].unit')))   as indicator_unit
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].result'))) as indicator_value
    from tmp_base_data_pre t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.datas))
)
select 
    t1.tenantsid
    ,t1.information_name
    ,t1.information_enterprise_id
    ,t1.information_id
    ,t1.information_version
    ,t21.information_items_value_id                          as province_id 
    ,t21.information_items_value_adcode                      as province_adcode 
    ,t21.information_items_value_name                        as province_name 
    ,t22.information_items_value_id                          as city_id 
    ,t22.information_items_value_adcode                      as city_adcode 
    ,t22.information_items_value_name                        as city_name 
    ,t23.information_items_value_id                          as district_id 
    ,t23.information_items_value_adcode                      as district_adcode 
    ,t23.information_items_value_name                        as district_name 
    ,t31.information_items_value_code                        as category_code
    ,t31.information_items_value_name                        as category_name
    ,t32.information_items_value_code                        as division_code
    ,t32.information_items_value_name                        as division_name
	,t33.information_items_value_code                        as group_code
    ,t33.information_items_value_name                        as group_name
	,t34.information_items_value_code                        as class_code
    ,t34.information_items_value_name                        as class_name
	,t4.information_items_value_id                           as scale_id
    ,t4.information_items_value_name                         as scale_name
	,t1.template_idstr               
    ,t1.template_version          
    ,t1.template_name             
    ,t1.template_type             
    ,t1.id                                                   as questionnaire_id                      
    ,t1.complete                  
    ,t1.version                   
    ,t1.create_by                 
    ,t1.create_date               
    ,t1.create_name               
    ,t1.create_program            
    ,t1.last_update_by            
    ,t1.last_update_date          
    ,t1.last_update_name          
    ,t1.last_update_program
	,t5.indicator_id
    ,t5.indicator_name
    ,t5.indicator_unit
    ,t5.indicator_value
from tmp_base_data_pre t1
left join tmp_value_area t21
on(t1.tenantsid=t21.tenantsid and t1.template_idstr=t21.template_idstr and t1.information_enterprise_id=t21.information_enterprise_id and t21.information_items_value_level='province')
left join tmp_value_area t22
on(t1.tenantsid=t22.tenantsid and t1.template_idstr=t22.template_idstr and t1.information_enterprise_id=t22.information_enterprise_id and t22.information_items_value_level='city' and substr(t21.information_items_value_adcode,1,2)=substr(t22.information_items_value_adcode,1,2))
left join tmp_value_area t23
on(t1.tenantsid=t23.tenantsid and t1.template_idstr=t23.template_idstr and t1.information_enterprise_id=t23.information_enterprise_id and t23.information_items_value_level='district' and substr(t22.information_items_value_adcode,1,4)=substr(t23.information_items_value_adcode,1,4))
left join tmp_value_industry t31
on(t1.tenantsid=t31.tenantsid and t1.template_idstr=t31.template_idstr and t1.information_enterprise_id=t31.information_enterprise_id and t31.information_items_value_level='1')
left join tmp_value_industry t32
on(t1.tenantsid=t32.tenantsid and t1.template_idstr=t32.template_idstr and t1.information_enterprise_id=t32.information_enterprise_id and t32.information_items_value_level='2')
left join tmp_value_industry t33
on(t1.tenantsid=t33.tenantsid and t1.template_idstr=t33.template_idstr and t1.information_enterprise_id=t33.information_enterprise_id and t33.information_items_value_level='3')
left join tmp_value_industry t34
on(t1.tenantsid=t34.tenantsid and t1.template_idstr=t34.template_idstr and t1.information_enterprise_id=t34.information_enterprise_id and t34.information_items_value_level='4')
left join tmp_value_scale t4
on(t1.tenantsid=t4.tenantsid and t1.template_idstr=t4.template_idstr and t1.information_enterprise_id=t4.information_enterprise_id)
left join tmp_indicator_list t5
on(t1.tenantsid=t5.tenantsid and t1.template_idstr=t5.template_idstr and t1.information_enterprise_id=t5.information_enterprise_id)

执行结果:

这可能是由于升级到v6.4.0后,某些配置文件参数被转换为系统变量,导致系统行为发生了变化。建议您检查一下升级后的配置文件和系统变量是否正确设置。您可以参考 TiDB 官方文档中的配置文件参数和系统变量的说明,以确保它们的设置是正确的。

此外,您还可以尝试使用 TiDB 的日志功能来查看执行 SQL 时的详细信息,以便更好地了解问题所在。您可以使用 TiDB 的慢查询日志和错误日志来查看执行 SQL 的详细信息,以及任何可能的错误或异常情况。

具体是哪个配置,有范围吗?因为没有手动改过tidb的配置,都是用的官方默认的,而且,我在另外一个区直接部署了v6.4.0,效果和升级v6.4.0是一样的,都是很多空

你是从哪个版本升级起来的?由于 6.4.0 不提供迭代版本,要不要考虑先升级到 6.5.2 看看?

1 个赞

我有个猜测,不一定对。

6.4对join升级了一个新特性。
如果6.3正常,6.4不正常,这个新特性引入导致的可能性比较大。

https://github.com/pingcap/tidb/issues/37825

由于动态规划算法的枚举过程可能消耗更多的时间,目前 Join Reorder 算法由变量 [tidb_opt_join_reorder_threshold](https://docs-archive.pingcap.com/zh/tidb/v6.4/system-variables#tidb_opt_join_reorder_threshold) 控制,当参与 Join Reorder 的节点个数大于该阈值时选择贪心算法,反之选择动态规划算法。

注意文档里面提到这个tidb_opt_join_reorder_threshold的默认值是0.
也就是说默认还是走原来的贪心算法。
我猜想是不是有人调整过这个参数,导致连接算法变了?

你可以排查一下这个变量的值现在是多少。如果不是0,把它调整到到0试试看。
反正从升级的内容来看,这个地方是比较可疑的。

1 个赞

楼上猜的有点像

这种参数不大可能有人改吧

这种参数不大可能有人改吧,这不算是缺陷吗

觉得是缺陷没问题的,有复现方法可以直接上github/论坛提的。
我上周提的bug,现在都close了。效率还是可以的。

https://github.com/pingcap/tidb/issues/44848

神速,厉害