Conditions under which reading the checkpoint record fails during PostgreSQL startup recovery


1. First, read the checkpoint record that ControlFile->checkPoint points to.
2. If that read fails, a standby aborts immediately, while a primary goes on to read the checkpoint record that ControlFile->prevCheckPoint points to.

StartupXLOG->
 |--checkPointLoc = ControlFile->checkPoint;
 |--record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1, true):
 |-- if (record != NULL){
         ...
     }else if (StandbyMode){
         ereport(PANIC,(errmsg("could not locate a valid checkpoint record")));
     }else{
         checkPointLoc = ControlFile->prevCheckPoint;
         record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, true);
         if (record != NULL){
             InRecovery = true; // mark that recovery will be entered below
         }else{
             ereport(PANIC,(errmsg("could not locate a valid checkpoint record")));
         }
     }
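The fallback order above can be sketched as a small decision function. This is a simplified model, not PostgreSQL source: `choose_checkpoint`, `ckpt_choice`, and the boolean inputs are hypothetical names standing in for the results of the two ReadCheckpointRecord calls and the StandbyMode flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes; they are not part of PostgreSQL. */
typedef enum {
    CKPT_USE_LATEST,    /* record at ControlFile->checkPoint is usable */
    CKPT_USE_PREVIOUS,  /* fell back to ControlFile->prevCheckPoint */
    CKPT_PANIC          /* "could not locate a valid checkpoint record" */
} ckpt_choice;

/*
 * latest_ok / prev_ok model whether ReadCheckpointRecord returned a
 * non-NULL record for checkPoint / prevCheckPoint respectively.
 * *in_recovery is set to true only when the fallback record is used.
 */
ckpt_choice choose_checkpoint(bool latest_ok, bool prev_ok,
                              bool standby_mode, bool *in_recovery)
{
    assert(in_recovery != NULL);

    if (latest_ok)
        return CKPT_USE_LATEST;
    if (standby_mode)
        return CKPT_PANIC;          /* a standby never tries prevCheckPoint */
    if (prev_ok) {
        *in_recovery = true;        /* recovery is forced after a fallback */
        return CKPT_USE_PREVIOUS;
    }
    return CKPT_PANIC;
}
```

In this model a standby with an unreadable latest checkpoint always ends in CKPT_PANIC, even when the previous checkpoint would have been readable.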

I. Under what conditions is the checkpoint record that was read NULL (record == NULL)?

1. ControlFile->checkPoint % XLOG_BLCKSZ < SizeOfXLogShortPHD
2. ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true) returns NULL
3. ReadRecord returns record != NULL && record->xl_rmid != RM_XLOG_ID
4. ReadRecord returns record != NULL && info != XLOG_CHECKPOINT_SHUTDOWN && info != XLOG_CHECKPOINT_ONLINE (where info = record->xl_info & ~XLR_INFO_MASK)
5. ReadRecord returns record != NULL && record->xl_tot_len != SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint)
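The five conditions above can be modelled as a single classification function. This is only a sketch: `mini_record`, `ckpt_check`, and `check_checkpoint_record` are hypothetical names, and the numeric constants are illustrative stand-ins for the values that the PostgreSQL headers define for a particular build (in particular, sizeof(CheckPoint) varies between versions).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants (assumed values, not taken from a specific build). */
#define XLOG_BLCKSZ               8192u
#define SIZE_OF_XLOG_SHORT_PHD    24u
#define SIZE_OF_XLOG_RECORD       24u
#define SIZE_OF_DATA_HEADER_SHORT 2u
#define SIZE_OF_CHECKPOINT        88u   /* placeholder for sizeof(CheckPoint) */
#define RM_XLOG_ID                0u
#define XLR_INFO_MASK             0x0Fu
#define XLOG_CHECKPOINT_SHUTDOWN  0x00u
#define XLOG_CHECKPOINT_ONLINE    0x10u

/* Only the record header fields that the checks look at. */
typedef struct {
    uint32_t xl_tot_len;
    uint8_t  xl_info;
    uint8_t  xl_rmid;
} mini_record;

typedef enum {
    CHECK_OK,
    CHECK_BAD_LOCATION,   /* condition 1 */
    CHECK_READ_FAILED,    /* condition 2: rec == NULL models ReadRecord returning NULL */
    CHECK_BAD_RMID,       /* condition 3 */
    CHECK_BAD_INFO,       /* condition 4 */
    CHECK_BAD_LENGTH      /* condition 5 */
} ckpt_check;

ckpt_check check_checkpoint_record(uint64_t rec_ptr, const mini_record *rec)
{
    if (rec_ptr % XLOG_BLCKSZ < SIZE_OF_XLOG_SHORT_PHD)
        return CHECK_BAD_LOCATION;
    if (rec == NULL)
        return CHECK_READ_FAILED;
    if (rec->xl_rmid != RM_XLOG_ID)
        return CHECK_BAD_RMID;

    uint8_t info = (uint8_t) (rec->xl_info & ~XLR_INFO_MASK);
    if (info != XLOG_CHECKPOINT_SHUTDOWN && info != XLOG_CHECKPOINT_ONLINE)
        return CHECK_BAD_INFO;

    if (rec->xl_tot_len != SIZE_OF_XLOG_RECORD + SIZE_OF_DATA_HEADER_SHORT
                           + SIZE_OF_CHECKPOINT)
        return CHECK_BAD_LENGTH;
    return CHECK_OK;
}
```

For example, `assert(check_checkpoint_record(8192 + 8, NULL) == CHECK_BAD_LOCATION);` holds, because an offset of 8 falls inside the short page header and is rejected before any record is read.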

II. Conditions under which the ReadRecord function returns NULL

ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true)
    |--record = XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg);
    |-- 2.1 record==NULL && !StandbyMode
    |-- 2.2 record!=NULL && !tliInHistory(xlogreader->latestPageTLI, expectedTLEs)
    /*-----
    note: whenever a page of xlog is read, latestPageTLI is set to the timeline of the first record on that page
    XLogReaderValidatePageHeader
        -->xlogreader->latestPageTLI=hdr->xlp_tli;
    ------*/
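Condition 2.2 is essentially a membership test: the timeline of the page just read must appear in the expected timeline history. A minimal sketch follows, assuming the history is held in a plain array; the real tliInHistory works on PostgreSQL's list of timeline history entries, and `tli_in_history` and `record_rejected_by_timeline` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if tli appears among the n expected timeline IDs. */
bool tli_in_history(uint32_t tli, const uint32_t *expected, size_t n)
{
    assert(expected != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (expected[i] == tli)
            return true;
    }
    return false;
}

/*
 * Models condition 2.2: a record was read successfully, but the page it
 * came from belongs to a timeline outside the expected history.
 */
bool record_rejected_by_timeline(bool record_read, uint32_t latest_page_tli,
                                 const uint32_t *expected, size_t n)
{
    return record_read && !tli_in_history(latest_page_tli, expected, n);
}
```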

III. Under what conditions does XLogReadRecord return NULL when reading the checkpoint?

XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg)
    targetPagePtr = ControlFile->checkPoint - (ControlFile->checkPoint % XLOG_BLCKSZ);
    targetRecOff = ControlFile->checkPoint % XLOG_BLCKSZ;
    readOff = ReadPageInternal(state,targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ));
    pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) state->readBuf);
    record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
    total_len = record->xl_tot_len;
    -------------
    1. readOff < 0
    2. 0 < targetRecOff < pageHeaderSize
    3. (((XLogPageHeader) state->readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD) && targetRecOff == pageHeaderSize
       The page header indicates a record continued from the previous page, and the checkpoint offset points exactly at the end of the page header.
    4. targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord &&
        !ValidXLogRecordHeader(state, ControlFile->checkPoint, state->ReadRecPtr, record,randAccess)
       ---(record->xl_tot_len < SizeOfXLogRecord || record->xl_rmid > RM_MAX_ID || the prev-link check fails: record->xl_prev != state->ReadRecPtr, or, with randAccess set as it is when an explicit record position is passed, record->xl_prev >= RecPtr)
    5. targetRecOff > XLOG_BLCKSZ - SizeOfXLogRecord && total_len < SizeOfXLogRecord
    6. total_len > state->readRecordBufSize && !allocate_recordbuf(state, total_len)
       If the record is corrupt and total_len is very large, allocate_recordbuf has to enlarge the record buffer (state->readRecordBuf), and that allocation may fail and abort.
       The record's checksum is verified only after the complete record has been read.
    -------------
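The first three assignments above are plain modular arithmetic on the record position. The sketch below reproduces them so the boundary between conditions 4 and 5 can be checked with concrete numbers; `page_request` and `locate_record` are hypothetical names, and the block size and record header size are passed in as parameters rather than taken from PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t page_ptr;      /* start of the page holding the record (targetPagePtr) */
    uint32_t rec_off;       /* offset of the record inside that page (targetRecOff) */
    uint32_t req_len;       /* bytes requested from the page reader */
    bool     header_split;  /* true when the fixed header crosses the page end */
} page_request;

page_request locate_record(uint64_t rec_ptr, uint32_t blcksz, uint32_t rec_hdr_size)
{
    assert(blcksz > 0 && rec_hdr_size <= blcksz);

    page_request r;
    r.rec_off  = (uint32_t) (rec_ptr % blcksz);
    r.page_ptr = rec_ptr - r.rec_off;

    /* Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ) */
    uint32_t want = r.rec_off + rec_hdr_size;
    r.req_len = want < blcksz ? want : blcksz;

    /* targetRecOff > XLOG_BLCKSZ - SizeOfXLogRecord selects condition 5 */
    r.header_split = r.rec_off > blcksz - rec_hdr_size;
    return r;
}
```

With an assumed 8192-byte block and a 24-byte record header, a record at position 3 * 8192 + 100 yields rec_off 100 and req_len 124, while one at 3 * 8192 + 8180 yields req_len 8192 and header_split set, because only 12 bytes of the header fit on that page.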

IV. Conditions under which ReadPageInternal returns readOff < 0

ReadPageInternal(state,targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ))
    1. On the first read of a WAL file, the first page of the file is read with readLen = state->read_page, and readLen < 0
    2. readLen > 0 && !XLogReaderValidatePageHeader(state, targetSegmentPtr, state->readBuf)
    --
    3. Reading the page that contains the checkpoint with readLen = state->read_page gives readLen < 0
    4. readLen > 0 && readLen <= SizeOfXLogShortPHD
    5. !XLogReaderValidatePageHeader(state, pageptr, (char *) hdr)

V. When does XLogPageRead return a value < 0?

/*
    1. WaitForWALToBecomeAvailable fails to open the file
    2. lseek fails && !StandbyMode
    3. read fails && !StandbyMode
    4. page header validation fails && !StandbyMode
    In StandbyMode, it instead retries via WaitForWALToBecomeAvailable, switching the log source and opening again.
*/
!WaitForWALToBecomeAvailable(targetPagePtr + reqLen,private->randAccess,1,targetRecPtr)//open
    |-- return -1
readOff = targetPageOff;
if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0){
    !StandbyMode:: return -1
}
if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ){
    !StandbyMode:: return -1
}
XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)
    !StandbyMode:: return -1
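The lseek and read failure paths above follow a common POSIX pattern: seek to the page offset, read exactly one block, and treat a short read as a failure. A minimal sketch of that pattern outside standby mode follows; `read_block` is a hypothetical helper, not a PostgreSQL function, and it leaves out the standby retry and the page header validation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reads exactly nbytes from fd at the given offset into buf.
 * Returns 0 on success and -1 if lseek fails, read fails, or fewer than
 * nbytes are available (a short read counts as a failure here).
 */
int read_block(int fd, off_t offset, char *buf, size_t nbytes)
{
    assert(buf != NULL);

    if (lseek(fd, offset, SEEK_SET) < 0)
        return -1;

    ssize_t got = read(fd, buf, nbytes);
    if (got < 0 || (size_t) got != nbytes)
        return -1;

    return 0;
}
```

A caller modelling standby mode would loop on a -1 result and try another WAL source instead of giving up.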

VI. When does WaitForWALToBecomeAvailable return false?

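--XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL
    1. First, XLogFileReadAnyTLI opens the log:
        1. It iterates over every timeline in the timeline list, starting from the newest.
        2. When the checkpoint is being read, the source is XLOG_FROM_ANY.
        3. It first tries to open the archived log; if that open fails, it tries the log in pg_wal.
        4. If neither open succeeds, it moves back to the previous timeline and tries to open that timeline's file with the same segno (the same file number).
        5. After a successful open, expectedTLEs is assigned all entries of the current timeline list.
    2. If the open fails, the log source is switched: XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL -> XLOG_FROM_STREAM
    3. At the source switch, for XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL:
       standby && promote: return false
       !StandbyMode: return false
--XLOG_FROM_STREAM
    1. !WalRcvStreaming(), i.e. the receiver process has died: switch the log source
    2. CheckForStandbyTrigger(): switch the log source
    3. XLOG_FROM_STREAM->XLOG_FROM_ARCHIVE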
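The source-switching rules above form a two-state machine. The sketch below models only the rules as listed, not the actual control flow of WaitForWALToBecomeAvailable; `wal_source`, `switch_result`, and `on_source_failure` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SRC_ARCHIVE_OR_PG_WAL,  /* XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL */
    SRC_STREAM              /* XLOG_FROM_STREAM */
} wal_source;

typedef struct {
    bool       give_up;     /* models "return false" */
    wal_source next;        /* source to try next when not giving up */
} switch_result;

/* Decides what happens after the current source failed to provide WAL. */
switch_result on_source_failure(wal_source current, bool standby_mode,
                                bool promote_triggered)
{
    if (current == SRC_ARCHIVE_OR_PG_WAL) {
        /* Outside standby mode, or once promotion is requested, stop. */
        if (!standby_mode || promote_triggered)
            return (switch_result){ true, current };
        return (switch_result){ false, SRC_STREAM };
    }

    /* Streaming stopped or a trigger was seen: go back to archive/pg_wal. */
    return (switch_result){ false, SRC_ARCHIVE_OR_PG_WAL };
}
```

In this model a standby without a promote request alternates between the two sources indefinitely, which matches the retry behaviour described for StandbyMode.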
