[Analyze] Why are analyze options persisted to an internal table rather than stored in the table's schema?

TiDBer_D7483dYr · April 8, 2024 02:28

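As far as I can tell, this was introduced by *: persist analyze options for manual and auto analyze by chrysan · Pull Request #30939 · pingcap/tidb · GitHub.
1) Wouldn't it be more convenient to store the options in the schema? That would also remove the GC concern: once a table has been dropped or renamed, its configuration has to be deleted from the internal table mysql.analyze_options.
Was there some other consideration behind this design?
2) The comment on the getAdjustedSampleRate function says that the paper "Random sampling for histogram construction: how much is enough?" has shown that, no matter how large the database is, accuracy is sufficient once the sample size reaches a certain value (on the order of 100,000 rows). Does that mean the number of buckets, the number of TopN entries, and so on also have upper limits? If so, why not simply default to those limits, so that nobody has to tune them?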
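
To make the arithmetic behind question 2 concrete, here is a minimal Python sketch of a sample rate that caps the expected sample size. It is not TiDB's implementation of getAdjustedSampleRate: the function name `adjusted_sample_rate`, the `target_rows` parameter, and its 100,000 default are hypothetical, taken only from the "order of 100,000 rows" figure quoted above.

```python
def adjusted_sample_rate(row_count: int, target_rows: int = 100_000) -> float:
    """Return the fraction of rows to sample so that the expected sample
    size is about target_rows (hypothetical default, not TiDB's constant)."""
    if row_count <= 0:
        # Unknown or empty table: fall back to reading every row.
        return 1.0
    # Small tables are read in full; large tables are sampled proportionally.
    return min(1.0, target_rows / row_count)


def expected_sample_rows(row_count: int, target_rows: int = 100_000) -> float:
    """Expected number of sampled rows under the rate above."""
    return row_count * adjusted_sample_rate(row_count, target_rows)


if __name__ == "__main__":
    for rows in (50_000, 10_000_000, 1_000_000_000):
        rate = adjusted_sample_rate(rows)
        print(rows, rate, expected_sample_rows(rows))
```

Under these assumptions, the expected sample stays near 100,000 rows however large the table grows, which is the behavior the function comment describes. The sketch only covers the sample size; it says nothing about whether the bucket count or TopN count has a similar upper limit.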

xfworld · April 8, 2024 02:40

At this rate, DBAs will be out of a job.

心急吃不了热豆腐 · April 8, 2024 03:04

There is always some gap between theory and what can actually be implemented in practice.

像风一样的男子 · April 8, 2024 03:22

I'd suggest opening an issue.

友利奈绪 · April 8, 2024 03:47

I suspect it will still be a while before this gets implemented.

dba远航 · April 8, 2024 06:07

Very thoroughly thought through.

小于同学 · April 9, 2024 06:45

At this rate, DBAs will be out of a job.