DELETE Statement,懂你不容易

【是否原创】是
【首发渠道】TiDB 社区

【目录】
闲话少叙
语法定义
Yacc实现
验证Yacc
修改Yacc
Conflicts问题
【正文】

闲话少叙

书接上文,自从团队对外公布了TiDB for PostgreSQL项目之后,团队成员正在紧锣密鼓的为下一个里程碑做努力,目前我们主要的工作集中在以下几个方面:

  1. 完善PgSQL的通信协议
  2. 实现PgSQL关键字和语法
  3. 修复单元测试
  4. 数据库Chaos测试平台

我们先回顾一下往期内容,再进入本文主题:

TiDB for PostgreSQL—牛刀小试 - 知乎 (zhihu.com)

TiDB for PostgreSQL 学习指南 - 知乎 (zhihu.com)

TiDB Parser模块的简单解读与改造方法 - 知乎 (zhihu.com)

SQL只是CRUD? - 知乎 (zhihu.com)

除此之外,我们的项目也有幸入选PingCap DevCon 2021大会论坛话题,同时还进入了第二期TiDB Hacking Camp,欢迎大家多多关注我们,更欢迎大家加入我们。

Hacking Camp 第二期开启,六大开源项目重磅来袭! - 知乎 (zhihu.com)

语法定义

书归正文,这期文章主要介绍Data Manipulation Language中的Delete Statement,也就是CRUD中的最后一个字母。让我们先看看Mysql,PostgreSQL,SQL-2003 Standard都是怎么定义Delete的语法。

MySQL

Single-Table Syntax

DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name [[AS] tbl_alias]
[PARTITION (partition_name [, partition_name] ...)]
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]

From https://dev.mysql.com/doc/refman/8.0/en/delete.html

Multiple-Table Syntax

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
tbl_name[.*] [, tbl_name[.*]] ...
FROM table_references
[WHERE where_condition]

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
FROM tbl_name[.*] [, tbl_name[.*]] ...
USING table_references
[WHERE where_condition]

From https://dev.mysql.com/doc/refman/8.0/en/delete.html

PostgreSQL

[ WITH [ RECURSIVE ] with_query [, ...] ]
DELETE FROM [ ONLY ] table_name [ * ] [ [ AS ] alias ]
[ USING from_item [, ...] ]
[ WHERE condition | WHERE CURRENT OF cursor_name ]
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]

This command conforms to the SQL standard, except that the USING and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with DELETE.

From https://www.postgresql.org/docs/current/sql-delete.html

SQL-2003 BNF

<delete statement: positioned>    ::=   DELETE FROM <target table> WHERE CURRENT OF <cursor name>
<delete statement: searched>    ::=   DELETE FROM <target table> [ WHERE <search condition> ]

<target table>    ::=
         <table name>
     |     ONLY <left paren> <table name> <right paren>

<search condition>    ::=   <boolean value expression>

从这三个数据库的定义是不是觉得只有SQL-2003跟常用的Delete From Table删表跑路的语法差不多,其他两个数据库加入了一些自定义的语法功能,尤其是MySQL,长得像个远房亲戚。这里不对各个自定义语法功能做解释,想了解的读者可以点击对应的官方链接查阅。

Yacc实现

了解语法定义是远远不够的,我们还需要看看parser.y对Delete Statement的MySQL实现,从Delete Statement开始,分成了DeleteWithoutUsingStmt和DeleteWithUsingStmt两个分支,每个分支中又定义了对应的语法,可以看出每一个语法定义都基本符合MySQL的定义,整个过程就是把SQL解析成AST,下一步就是拿现有的parser做个demo,看看是否能正常解析我们的Delete SQL。
DCParser/parser.y at main · DigitalChinaOpenSource/DCParser (github.com)

DELETE | PingCAP Docs
/*******************************************************************
*
* Delete Statement
*
*******************************************************************/
DeleteWithoutUsingStmt:
“DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableName PartitionNameListOpt TableAsNameOpt IndexHintListOpt WhereClauseOptional OrderByOptional LimitClause
{
// Single Table
tn := $7.(*ast.TableName)
tn.IndexHints = $10.([]*ast.IndexHint)
tn.PartitionNames = $8.([]model.CIStr)
join := &ast.Join{Left: &ast.TableSource{Source: tn, AsName: $9.(model.CIStr)}, Right: nil}
x := &ast.DeleteStmt{
TableRefs: &ast.TableRefsClause{TableRefs: join},
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
IgnoreErr: $5.(bool),
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $11 != nil {
x.Where = $11.(ast.ExprNode)
}
if $12 != nil {
x.Order = $12.(*ast.OrderByClause)
}
if $13 != nil {
x.Limit = $13.(*ast.Limit)
}
$$ = x
}
| “DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional
{
// Multiple Table
x := &ast.DeleteStmt{
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
IgnoreErr: $5.(bool),
IsMultiTable: true,
BeforeFrom: true,
Tables: &ast.DeleteTableList{Tables: $6.([]*ast.TableName)},
TableRefs: &ast.TableRefsClause{TableRefs: $8.(*ast.Join)},
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $9 != nil {
x.Where = $9.(ast.ExprNode)
}
$$ = x
}
DeleteWithUsingStmt:
“DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional
{
// Multiple Table
x := &ast.DeleteStmt{
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
IgnoreErr: $5.(bool),
IsMultiTable: true,
Tables: &ast.DeleteTableList{Tables: $7.([]*ast.TableName)},
TableRefs: &ast.TableRefsClause{TableRefs: $9.(*ast.Join)},
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $10 != nil {
x.Where = $10.(ast.ExprNode)
}
$$ = x
}
DeleteFromStmt:
DeleteWithoutUsingStmt
| DeleteWithUsingStmt

验证Yacc
我们新建一个golang项目,如下。 这个demo解析了两个Delete SQL,第一个比较简单,第二个几乎把所有的参数都填上了,我们看看执行后的效果。

package main
import (
    "fmt"
    parser "github.com/DigitalChinaOpenSource/DCParser"
    "github.com/DigitalChinaOpenSource/DCParser/ast"
    _ "github.com/DigitalChinaOpenSource/DCParser/test_driver"
)

func parse(sql string) (*ast.StmtNode, error) {
    p := parser.New()
    stmtNodes, _, err := p.Parse(sql, "", "")
    if err != nil {
        return nil, err
    }
    return &stmtNodes[0], nil
}

func main() {
    astNode, err := parse("DELETE FROM somelog;")
    if err != nil {
        fmt.Printf("parse error: %v\n", err.Error())
        return
    }
    fmt.Printf("%v\n", *astNode)
    astNode, err = parse("DELETE /*+ BKA(somelog) */ LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = 'jcole' ORDER BY timestamp_column LIMIT 1;")
    if err != nil {
        fmt.Printf("parse error: %v\n", err.Error())
        return
    }
    fmt.Printf("%v\n", *astNode)
}

项目设置断点后运行,对"DELETE FROM somelog;"的解析呈现AST结构,可以看到TableName正确获取到了,其他的参数where, order, limit,priority, ignore, quick等等都是空值。

对二个SQL解析时"DELETE /*+ BKA(somelog) */ LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = ‘jcole’ ORDER BY timestamp_column LIMIT 1;",真个AST就变得庞大很多,包括where里的相等操作也能正确解析,这个parser工作正常,接下来我们开始删除不需要的语法参数。

修改Yacc
我们开始删除PostgreSQL中没有的参数,如以下代码所示,对应的代码解析部分也要做相应的调整,不然会编译报错,当然中间也确实遇到了一个很麻烦的问题,如果你也遇到了类似conflict的问题,可以看看文末的trouble shooting。
/*******************************************************************
*
* Delete Statement
*
*******************************************************************/
DeleteWithoutUsingStmt:
“DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableName TableAsNameOpt WhereClauseOptional
{
// Single Table
tn := $6.(*ast.TableName)
join := &ast.Join{Left: &ast.TableSource{Source: tn, AsName: $7.(model.CIStr)}, Right: nil}
x := &ast.DeleteStmt{
TableRefs: &ast.TableRefsClause{TableRefs: join},
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $8 != nil {
x.Where = $8.(ast.ExprNode)
}
$$ = x
}
| “DELETE” TableOptimizerHints PriorityOpt QuickOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional
{
// Multiple Table
x := &ast.DeleteStmt{
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
IsMultiTable: true,
BeforeFrom: true,
Tables: &ast.DeleteTableList{Tables: $5.([]*ast.TableName)},
TableRefs: &ast.TableRefsClause{TableRefs: $7.(*ast.Join)},
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $8 != nil {
x.Where = $8.(ast.ExprNode)
}
$$ = x
}
DeleteWithUsingStmt:
“DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional
{
// Multiple Table
x := &ast.DeleteStmt{
Priority: $3.(mysql.PriorityEnum),
Quick: $4.(bool),
IsMultiTable: true,
Tables: &ast.DeleteTableList{Tables: $6.([]*ast.TableName)},
TableRefs: &ast.TableRefsClause{TableRefs: $8.(*ast.Join)},
}
if $2 != nil {
x.TableHints = $2.([]*ast.TableOptimizerHint)
}
if $9 != nil {
x.Where = $9.(ast.ExprNode)
}
$$ = x
}
DeleteFromStmt:
DeleteWithoutUsingStmt
| DeleteWithUsingStmt

利用项目自带的goyacc进行编译,记住,一定是用项目再带的goyacc编译,因为tidb对yyLexer接口一些改动,如果用go自带的goyacc编译会报没有实现方法的错误。好了,执行完之后就会生成一个新的parser.go文件。
image

最后还需要把yacc修改导致的UT错误给修复了:)

本文删除MySQL自定义语法的目标已经接近完成,剩下的几个语法功能就留给读者自己尝试删除,关于如何加入PostgreSQL语法功能,我们会在下次分解。

Conflicts问题

当你在编译y文件时出现conflicts问题的时候,需要添加-v来查阅解析步骤。
image

可以通过正确和错误文件对比提高排查效率,当然这次问题属于第一次修改没经验造成的,我还特地翻阅了Bison官方文档Shift/Reduce (Bison 3.7.6) (gnu.org)。最后根据Yacc的工作原理和output文件,把问题修复的,原因其实也非常简单,就是只改了一个delete分支里的语法,导致编译之后Delete AST解析路劲不一致导致的冲突。

3赞

前排学习