TiCDC version: 5.3.0
Last few TiCDC log entries:
[2022/03/10 14:09:18.435 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=0] [qps=0] [ddl=0]
[2022/03/10 14:11:09.347 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=34] [qps=0] [ddl=0]
[2022/03/10 14:19:18.934 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=0] [qps=0] [ddl=0]
[2022/03/10 14:21:09.634 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=39] [qps=0] [ddl=0]
[2022/03/10 14:29:20.124 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=3] [qps=0] [ddl=0]
[2022/03/10 14:31:09.753 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=22] [qps=0] [ddl=0]
[2022/03/10 14:39:20.434 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=1] [qps=0] [ddl=0]
[2022/03/10 14:41:09.975 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=25] [qps=0] [ddl=0]
[2022/03/10 14:49:20.436 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=0] [qps=0] [ddl=0]
[2022/03/10 14:51:10.034 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=15] [qps=0] [ddl=0]
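To compare the two changefeeds in entries like these, the bracketed fields can be pulled out with a small script. This is only a sketch, assuming every entry follows the `[timestamp] [LEVEL] [file:line] ["message"] [key=value] ...` layout seen above. The helper names `parse_line` and `by_changefeed` are made up for illustration, and the sample lines are copied from the log excerpt.

```python
import re

# Leading part of an entry: [timestamp] [LEVEL] [file:line] ["message"]
HEADER = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] \[(?P<src>[^\]]+)\] \["(?P<msg>[^"]*)"\]'
)
# Simple [key=value] fields that follow the message
FIELD = re.compile(r'\[(\w[\w-]*)=([^\]]*)\]')


def parse_line(line):
    """Return a dict of the header parts and key=value fields, or None if no header."""
    m = HEADER.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec.update(dict(FIELD.findall(line[m.end():])))
    return rec


def by_changefeed(lines):
    """Group (timestamp, count) pairs of "sink replication status" entries by changefeed."""
    grouped = {}
    for line in lines:
        rec = parse_line(line)
        if rec and rec["msg"] == "sink replication status":
            grouped.setdefault(rec["changefeed"], []).append((rec["ts"], int(rec["count"])))
    return grouped


SAMPLE = [
    '[2022/03/10 14:11:09.347 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=verify-task] [capture=10.59.110.32:8300] [count=34] [qps=0] [ddl=0]',
    '[2022/03/10 14:19:18.934 +08:00] [INFO] [statistics.go:154] ["sink replication status"] [name=MQ] [changefeed=payment-task] [capture=10.59.110.32:8300] [count=0] [qps=0] [ddl=0]',
]

records = by_changefeed(SAMPLE)
for changefeed, points in records.items():
    print(changefeed, points)
```

The script only regroups what the log already prints; it does not interpret what `count` or `qps` mean.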
TiCDC monitoring: the owner seems to be gone.
ticdc.json (2.7 MB)
But when I check the captures with tiup, they all look normal:
[
{
"id": "15df6a51-0fb6-4d15-b4cf-16c9badcb377",
"is-owner": true,
"address": "10.59.110.133:8300"
},
{
"id": "2a177a94-af2c-4915-a6cb-1d51985971b4",
"is-owner": false,
"address": "10.59.110.207:8300"
},
{
"id": "7129d057-2cd3-4b91-98bb-4dc9a745c796",
"is-owner": false,
"address": "10.59.110.32:8300"
}
]
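For reference, here is a minimal sketch of how the owner can be picked out of a capture list like the one above. It assumes the output is valid JSON with the `id`, `is-owner`, and `address` keys shown; the function name `find_owners` is hypothetical, and the data is pasted from this post.

```python
import json

# Capture list as pasted above
CAPTURES_JSON = """
[
  {"id": "15df6a51-0fb6-4d15-b4cf-16c9badcb377", "is-owner": true, "address": "10.59.110.133:8300"},
  {"id": "2a177a94-af2c-4915-a6cb-1d51985971b4", "is-owner": false, "address": "10.59.110.207:8300"},
  {"id": "7129d057-2cd3-4b91-98bb-4dc9a745c796", "is-owner": false, "address": "10.59.110.32:8300"}
]
"""


def find_owners(captures):
    """Return the addresses of all captures flagged as owner."""
    return [c["address"] for c in captures if c.get("is-owner")]


captures = json.loads(CAPTURES_JSON)
owners = find_owners(captures)
print(owners)
```

For this list, the result is a single address, 10.59.110.133:8300. This only reflects what the capture list reports, which is exactly what disagrees with the monitoring here.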
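Processor memory also suddenly started climbing.
Please post the logs from the previous owner. The monitoring shows that there is currently no owner, so everything is waiting.
Here are the logs from the other two nodes: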
cdc3.log (102.1 KB) cdc2.log (425.8 KB)
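Try restarting the owner node, then watch the logs to see whether an owner gets elected.
After the restart, everything went back to normal.
What was the problem?
Why did tiup show the capture owner as normal?
There were no error messages during that period either.
Is this a bug?
Could you check whether node 133 (10.59.110.133) has a cdc_err.log?
There were some error messages during the restart: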
[2022/03/10 15:52:26.201 +08:00] [INFO] [helper.go:63] ["got signal to exit"] [signal=terminated]
[2022/03/10 15:52:26.201 +08:00] [ERROR] [client.go:750] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
[2022/03/10 15:52:26.201 +08:00] [INFO] [client.go:669] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/03/10 15:52:26.201 +08:00] [INFO] [capture.go:254] ["run owner exited"] [error="[CDC:ErrPDEtcdAPIError]context canceled: context canceled"] [errorVerbose="[CDC:ErrPDEtcdAPIError]context canceled: context canceled\
github.com/pingcap/errors.AddStack\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/errors.go:174\
github.com/pingcap/errors.(*Error).GenWithStackByCause\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/normalize.go:302\
github.com/pingcap/ticdc/pkg/errors.WrapError\
\tgithub.com/pingcap/ticdc/pkg/errors/helper.go:30\
github.com/pingcap/ticdc/cdc/capture.(*Capture).runEtcdWorker\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:287\
github.com/pingcap/ticdc/cdc/capture.(*Capture).campaignOwner\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:252\
github.com/pingcap/ticdc/cdc/capture.(*Capture).run.func2\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:177\
runtime.goexit\
\truntime/asm_amd64.s:1371"]
[2022/03/10 15:52:26.202 +08:00] [INFO] [capture.go:178] ["the owner routine has exited"] [error="resign owner failed, capture: 15df6a51-0fb6-4d15-b4cf-16c9badcb377: [CDC:ErrCaptureResignOwner]context canceled: context canceled"] [errorVerbose="[CDC:ErrCaptureResignOwner]context canceled: context canceled\
github.com/pingcap/errors.AddStack\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/errors.go:174\
github.com/pingcap/errors.(*Error).GenWithStackByCause\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/normalize.go:302\
github.com/pingcap/ticdc/pkg/errors.WrapError\
\tgithub.com/pingcap/ticdc/pkg/errors/helper.go:30\
github.com/pingcap/ticdc/cdc/capture.(*Capture).resign\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:327\
github.com/pingcap/ticdc/cdc/capture.(*Capture).campaignOwner\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:256\
github.com/pingcap/ticdc/cdc/capture.(*Capture).run.func2\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:177\
runtime.goexit\
\truntime/asm_amd64.s:1371\
resign owner failed, capture: 15df6a51-0fb6-4d15-b4cf-16c9badcb377"]
[2022/03/10 15:52:26.202 +08:00] [INFO] [capture.go:189] ["the processor routine has exited"] [error="[CDC:ErrPDEtcdAPIError]context canceled: context canceled"] [errorVerbose="[CDC:ErrPDEtcdAPIError]context canceled: context canceled\
github.com/pingcap/errors.AddStack\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/errors.go:174\
github.com/pingcap/errors.(*Error).GenWithStackByCause\
\tgithub.com/pingcap/errors@v0.11.5-0.20210513014640-40f9a1999b3b/normalize.go:302\
github.com/pingcap/ticdc/pkg/errors.WrapError\
\tgithub.com/pingcap/ticdc/pkg/errors/helper.go:30\
github.com/pingcap/ticdc/cdc/capture.(*Capture).runEtcdWorker\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:287\
github.com/pingcap/ticdc/cdc/capture.(*Capture).run.func3\
\tgithub.com/pingcap/ticdc/cdc/capture/capture.go:188\
runtime.goexit\
\truntime/asm_amd64.s:1371"]
[2022/03/10 15:52:26.202 +08:00] [WARN] [client.go:1162] ["failed to receive from stream"] [addr=10.59.105.50:20161] [storeID=2] [error="rpc error: code = Unavailable desc = transport is closing"]
[2022/03/10 15:52:26.216 +08:00] [INFO] [capture.go:142] ["capture recovered"] [capture-id=15df6a51-0fb6-4d15-b4cf-16c9badcb377]
[2022/03/10 15:52:26.216 +08:00] [INFO] [capture.go:119] ["the capture routine has exited"]
[2022/03/10 15:52:26.216 +08:00] [ERROR] [client.go:750] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
[2022/03/10 15:52:26.216 +08:00] [INFO] [client.go:669] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/03/10 15:52:26.217 +08:00] [INFO] [server.go:135] ["cdc server exits successfully"]
[2022/03/10 15:52:27.143 +08:00] [INFO] [helper.go:51] ["init log"] [file=/data/cdc/8300/log/cdc.log] [level=info]
[2022/03/10 15:52:27.144 +08:00] [INFO] [version.go:47] ["Welcome to Change Data Capture (CDC)"] [release-version=v5.3.0] [git-hash=20626babf21fc381d4364646c40dd84598533d66] [git-branch=heads/refs/tags/v5.3.0] [utc-build-time="2021-11-22 10:37:02"] [go-version="go version go1.16.4 linux/amd64"] [failpoint-build=false]
[2022/03/10 15:52:27.144 +08:00] [INFO] [server.go:67] ["creating CDC server"] [pd-addrs="[http://10.59.105.60:2379,http://10.59.105.61:2379,http://10.59.105.62:2379]"] [config="{\"addr\":\"0.0.0.0:8300\",\"advertise-addr\":\"10.59.110.133:8300\",\"log-file\":\"/data/cdc/8300/log/cdc.log\",\"log-level\":\"info\",\"log\":{\"file\":{\"max-size\":300,\"max-days\":0,\"max-backups\":0}},\"data-dir\":\"/data/cdc/8300/store\",\"gc-ttl\":86400,\"tz\":\"System\",\"capture-session-ttl\":10,\"owner-flush-interval\":200000000,\"processor-flush-interval\":100000000,\"sorter\":{\"num-concurrent-worker\":4,\"chunk-size-limit\":134217728,\"max-memory-percentage\":30,\"max-memory-consumption\":17179869184,\"num-workerpool-goroutine\":16,\"sort-dir\":\"/tmp/sorter\"},\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null},\"per-table-memory-quota\":10485760,\"kv-client\":{\"worker-concurrent\":8,\"worker-pool-size\":0,\"region-scan-limit\":40}}"]
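To separate the WARN and ERROR entries from the rest of an excerpt like this, a short filter along the following lines may help. It is a sketch that assumes each entry starts with `[timestamp] [LEVEL]`; lines without that header, such as the stack-trace continuation lines, are skipped. The name `filter_levels` is made up for illustration, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Entry header: [timestamp] [LEVEL]
LEVEL = re.compile(r'^\[[^\]]+\] \[(\w+)\]')


def filter_levels(lines, wanted=("WARN", "ERROR")):
    """Return (level, line) pairs for entries whose level is in `wanted`."""
    hits = []
    for line in lines:
        m = LEVEL.match(line)
        if m and m.group(1) in wanted:
            hits.append((m.group(1), line))
    return hits


SAMPLE = r'''
[2022/03/10 15:52:26.201 +08:00] [INFO] [helper.go:63] ["got signal to exit"] [signal=terminated]
[2022/03/10 15:52:26.201 +08:00] [ERROR] [client.go:750] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"]
[2022/03/10 15:52:26.201 +08:00] [INFO] [client.go:669] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/03/10 15:52:26.202 +08:00] [WARN] [client.go:1162] ["failed to receive from stream"] [addr=10.59.105.50:20161] [storeID=2] [error="rpc error: code = Unavailable desc = transport is closing"]
'''.splitlines()

hits = filter_levels(SAMPLE)
level_counts = Counter(level for level, _ in hits)
for level, line in hits:
    print(level, line[:60])
```

In the excerpt above, every ERROR and WARN entry is timestamped at or after the "got signal to exit" line, so they belong to the restart itself rather than to the period when the owner was missing.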
cdc_stderr.log: does this log file exist?