opened 10:47AM - 09 Jun 22 UTC
closed 01:13AM - 01 Nov 22 UTC
type/enhancement
component/compute
## Enhancement
Currently, error handling and cancellation in TiFlash are error-prone and have caused many issues, such as #4441, #4219, and #4202.
We want to refine the error handling and cancellation logic in the TiFlash MPP system to make it less error-prone.
Some basic ideas:
- Refine `MPPTunnel`
- `MPPTunnel` has three modes: `local`, `sync`, and `async`. The current implementation distinguishes them with the `is_local` and `is_async` flags, which makes the code complex and error-prone.
- `MPPTunnel`/`BlockIO`/`ExchangeReceiver` should be treated as the top-level components of `MPPTask`
- Each top-level component of `MPPTask` should implement its own `cancel` and `handleError` methods
- Like `MPPTask::cancel`, there should be an `MPPTask::handleError` method that handles errors based on the task status
- Like `cancel`, there should be a query-level error handling method, so that once an `MPPTask` hits an error, all related tasks on the same TiFlash node can see the error and stop running
- A local tunnel should not introduce a direct dependency from the send task to the receive task
- As in #4208, avoid printing too many meaningless logs when an error or cancel happens
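
To make the ideas above more concrete, here is a minimal sketch of how they could fit together. It is not TiFlash code: `TunnelMode`, `TaskComponent`, `Tunnel`, `QueryErrorState`, and `Task` are hypothetical names chosen for illustration, and the real `MPPTunnel`/`MPPTask` classes have different members and signatures. The sketch assumes a single explicit mode enum in place of the `is_local`/`is_async` flags, a common `cancel`/`handleError` interface for top-level components, and a query-level state object shared by all tasks of one query on a node.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; they only illustrate the proposal.

// One explicit mode replaces the is_local / is_async flag pair.
enum class TunnelMode { Local, Sync, Async };

inline const char * toString(TunnelMode mode)
{
    switch (mode)
    {
    case TunnelMode::Local: return "local";
    case TunnelMode::Sync: return "sync";
    case TunnelMode::Async: return "async";
    }
    return "unknown";
}

// Interface every top-level component of a task would implement.
class TaskComponent
{
public:
    virtual ~TaskComponent() = default;
    virtual void cancel(const std::string & reason) = 0;
    virtual void handleError(const std::string & error) = 0;
};

class Tunnel : public TaskComponent
{
public:
    explicit Tunnel(TunnelMode mode_) : mode(mode_) {}
    void cancel(const std::string & reason) override { close("cancelled: " + reason); }
    void handleError(const std::string & error) override { close("error: " + error); }
    TunnelMode getMode() const { return mode; }
    bool isClosed() const { std::lock_guard<std::mutex> lock(mu); return closed; }
    std::string closeReason() const { std::lock_guard<std::mutex> lock(mu); return close_reason; }

private:
    // Only the first close takes effect, so repeated errors do not overwrite the reason.
    void close(const std::string & reason)
    {
        std::lock_guard<std::mutex> lock(mu);
        if (!closed) { closed = true; close_reason = reason; }
    }
    const TunnelMode mode;
    mutable std::mutex mu;
    bool closed = false;
    std::string close_reason;
};

// Shared by all tasks of one query on a node; records only the first error.
class QueryErrorState
{
public:
    // Returns true only for the first reporter.
    bool reportError(const std::string & error)
    {
        std::lock_guard<std::mutex> lock(mu);
        if (has_error) return false;
        has_error = true;
        first_error = error;
        return true;
    }
    bool hasError() const { std::lock_guard<std::mutex> lock(mu); return has_error; }
    std::string firstError() const { std::lock_guard<std::mutex> lock(mu); return first_error; }

private:
    mutable std::mutex mu;
    bool has_error = false;
    std::string first_error;
};

class Task
{
public:
    explicit Task(std::shared_ptr<QueryErrorState> query_state_) : query_state(std::move(query_state_)) {}
    void addComponent(std::shared_ptr<TaskComponent> component) { components.push_back(std::move(component)); }
    // Publish the error at query level, then let each component clean up.
    void handleError(const std::string & error)
    {
        query_state->reportError(error);
        for (auto & component : components) component->handleError(error);
    }
    // Sibling tasks check this to notice a failure elsewhere in the query.
    bool shouldStop() const { return query_state->hasError(); }

private:
    std::shared_ptr<QueryErrorState> query_state;
    std::vector<std::shared_ptr<TaskComponent>> components;
};
```

In this sketch, a task that fails calls `Task::handleError`, which records the error once in the shared `QueryErrorState` and forwards it to each component. Other tasks of the same query see `shouldStop()` become true without the failing task holding any reference to them. Recording only the first error is also one possible way to keep logging down, since later reporters can tell they were not first and skip the message.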