PostgreSQL Bugs

Collected from the PG bugs email list.

Bug ID15567
PG Version10.5
OScentos7
Opened2018-12-28 02:01:23+00
Reported bymovead
StatusNew

Body of first available message related to this bug follows.

The following bug has been logged on the website:

Bug reference:      15567
Logged by:          movead
Email address:      (redacted)
PostgreSQL version: 10.5
Operating system:   centos7
Description:        

[Problem description]
We use pg9.5.7 with replica and i find that if some damaged wal record
arrived at standby then the startup process on standby try to restart
walreceiver process but never sucess and log below Infinite cycle on
standby.If i try to restart database now,it sucess.
(I try it on pg10.5,the same result.)
-----------------------------------------------------------------------------------------------------------------
2018-12-15 10:38:00.302 CST,,,30928,,5c1422d5.78d0,2,,2018-12-15 05:38:29
CST,,0,FATAL,57P01,"terminating walreceiver process due to administrator
command",,,,,,,,,""
2018-12-15 10:38:00.302 CST,,,5546,,5c1276c1.15aa,8,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:00.302 CST,,,5546,,5c1276c1.15aa,9,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:00.302 CST,,,5546,,5c1276c1.15aa,10,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:05.308 CST,,,5546,,5c1276c1.15aa,11,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:10.314 CST,,,5546,,5c1276c1.15aa,12,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:15.319 CST,,,5546,,5c1276c1.15aa,13,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:20.325 CST,,,5546,,5c1276c1.15aa,14,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:25.331 CST,,,5546,,5c1276c1.15aa,15,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:30.336 CST,,,5546,,5c1276c1.15aa,16,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:35.342 CST,,,5546,,5c1276c1.15aa,17,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:40.348 CST,,,5546,,5c1276c1.15aa,18,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:45.353 CST,,,5546,,5c1276c1.15aa,19,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:50.359 CST,,,5546,,5c1276c1.15aa,20,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:38:55.364 CST,,,5546,,5c1276c1.15aa,21,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
2018-12-15 10:39:00.370 CST,,,5546,,5c1276c1.15aa,22,,2018-12-13 23:12:01
CST,1/0,0,LOG,00000,"incorrect resource manager data checksum in record at
5A9/11E0A90",,,,,,,,,""
-----------------------------------------------------------------------------------------------------------------


[Code review]
I read the code about startup process and walreciver process.

When a damaged wal record arrived:
1.The startup process shutdown the walreciver process use function
ShutdownWalRcv().
2.The startup process start the walreciver process use funtion
RequestXLogStreaming() by signal PMSIGNAL_START_WALRECEIVER.
3.The startup process try to read wal record. 
	And now reciver process does not startup completely,so the startup process
read the damaged wal record another times.
	And startup process set walrcv->walRcvState = WALRCV_STOPPED use function
ShutdownWalRcv().
4.go to 2.
That's the infinite cycle.

[Question]
I wonder that if it is a bug or you just don't want restart walreciver
process.

Messages

DateAuthorSubject
2018-12-28 02:01:23+00=?utf-8?q?PG_Bug_reporting_form?=BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2018-12-29 03:59:34+00Maxim BogukRe: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2018-12-29 05:08:18+00"lichuancheng(at)highgo(dot)com"Re: Re: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2019-01-03 09:50:48+00"lichuancheng(at)highgo(dot)com"BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2019-01-09 07:14:11+00Michael PaquierRe: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2019-01-09 08:28:51+00"lichuancheng(at)highgo(dot)com"Re: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2019-01-11 02:59:12+00"lichuancheng(at)highgo(dot)com"Re: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.
2019-03-11 06:52:50+00"lichuancheng(at)highgo(dot)com"Re: Re: BUG #15567: Wal receiver process restart failed when a damaged wal record arrived at standby.