SQL查询结果不一致,会少数据

【 TiDB 使用环境】生产环境
【 TiDB 版本】v6.5.3
情况说明

  • 通过参数设置,强制走tiflash mpp
  • 多次查询结果不一致,少一条数据,并且少的是固定的一条
  • 如果让集群自动选择查询引擎,结果正常
  • 如果将sql中3个left join改写成两个最后在union,结果正常

【explain analyze执行计划对比】

sql如下

SELECT
        full_data.field1 AS field1,
        full_data.field2 AS field2,
    (CASE WHEN t1.total IS NULL THEN 0 ELSE t1.total END)   +(CASE WHEN t2.total IS NULL THEN 0 ELSE t2.total END) AS total
    FROM
    (SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2
    FROM dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
    GROUP BY field1,field2
    HAVING COUNT(t.pdoc_id) > 0
    ) full_data
    LEFT JOIN
        (
    SELECT field1,field2,COUNT(`pn`) AS total
    FROM (
    SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2,dimension1.`pn`
    FROM dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
        AND dimension1.`pn` IS NOT NULL
                AND dimension1.`country` IN ('US')
    GROUP BY t.pdoc_id, dimension1.`pn`,field1,field2
    ) T
    GROUP BYfield1,field2
    ORDER BY total DESC,field1 ASC,field2 ASC
        ) t1 ON
            t1.field1 = full_data.field1 AND 
            t1.field2 = full_data.field2
        LEFT JOIN
        (
    SELECT field1,field2,COUNT(`pn`) AS total
    FROM (
    SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2,dimension1.`pn`
    FROM
    dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
        AND dimension1.`pn` IS NOT NULL
                AND dimension1.`country` IN ('CN')
    GROUP BY t.pdoc_id, dimension1.`pn`,field1,field2
    ) T
    GROUP BY field1,field2
    ORDER BY total DESC,field1 ASC,field2 ASC
        ) t2 ON
            t2.field1 = full_data.field1 AND 
            t2.field2 = full_data.field2   
    GROUP BY full_data.field1,full_data.field2
    HAVING total > 0
    ORDER BY total DESC,field1 ASC,field2 ASC
    LIMIT   30000;

换成 INNER JOIN 呢?

少的那条有什么特殊地方吗,比如null值一类

最后的这一段是不是没必要?SELECT里也没有聚合函数。

GROUP BY full_data.field1,full_data.field2
HAVING total > 0

另外,这种语句中 IS NOT NULL 应该是可以省略的。加上估计效率不高。

AND dimension0.`patent_type` in ('A','B','U','D') 
AND dimension0.`patent_type` IS NOT NULL

AND dimension1.`country` in ('US','CN') 
AND dimension1.`country` IS NOT NULL
1 个赞

没有,这条记录有的时候能查出来,有的时候查不出来

怀疑是不是表数据和索引数据不一致了

走TiFlash,没有索引。:sweat:

贴下执行正确和错误完整的执行计划吧,截图内容太少。

查询的对不对呢

我看到语句里有limit,会不会和这个有关系?去掉limit试试

1 个赞

没关系,目前看应该是子查询嵌套比较多导致的。这个也是tiflash的问题,如果使用tikv混用也没有这个问题。如果使用with语法改造也没有这个问题

:thinking:那倒是可以提个bug了,提bug有积分哒 :yum:

1 个赞

少的那条数据有什么特殊地方吗

explain analyze 有完整的对比吗,截图那个不够完整。。

https://github.com/pingcap/tidb/issues/47477 已经开 issue 记录

如果方便,请到 issue 上提供一下问题的复现路径。

还需要补充什么吗

贴下执行正确和错误完整的执行计划吧,截图内容太少。

这个帖子上有执行正确和错误的执行的计划,你看一下,如果窃取全,太长了,有没有其他办法

:yum:表妹的意思是提供一下复现的方式。

1 个赞

你是做了什么样的操作才会出现这个问题的,把完整的复现路径列一下,方便定位,

只有能复现、能定位到问题,才能解决问题。