Time limit: 3.000 seconds
限時:3.000秒
Problem
問題
You work for a company which uses lots of personal computers. Your boss, Dr Penny Pincher, has wanted to link the computers together for some time but has been unwilling to spend any money on the Ethernet boards you have recommended. You, unwittingly, have pointed out that each of the PCs has come from the vendor with an asynchronous serial port at no extra cost. Dr Pincher, of course, recognizes her opportunity and assigns you the task of writing the software necessary to allow communication between PCs.
你在一家有很多PC機的公司工作。你的老闆是Penny Pincher博士,她想把所有電腦都串連起來。你提出的建議是購買網卡,但她覺得太貴了。此時,你無意間想到這些電腦都來自同一個供貨商,並且都有一個非同步通訊的串口可以直接使用而無需花錢。Pincher博士採納了這個主意,並讓你編寫用於連通電腦(通過串口的非同步通訊方式)的軟體。
You've read a bit about communications and know that every transmission is subject to error and that the typical solution to this problem is to append some error checking information to the end of each message. This information allows the receiving program to detect when a transmission error has occurred (in most cases). So, off you go to the library, borrow the biggest book on communications you can find and spend your weekend (unpaid overtime) reading about error checking.
後來,你在某次通訊過程中讀取了一些資料位元,結果發現傳輸中出現了錯誤。解決這個問題的傳統辦法就是在每條報文的後面都附加一段錯誤校正資訊。這樣軟體就可以利用這些校正資訊來檢測傳輸過程中是否出現了錯誤(在絕大多數情況下都有效)。在此後的一周裡你天天去泡圖書館(在沒有加班費的空閑時間),借了最厚的通訊書籍查閱有關錯誤校正的知識。
Finally you decide that CRC (cyclic redundancy check) is the best error checking for your situation and write a note to Dr Pincher detailing the proposed error checking mechanism noted below.
最後你確定使用CRC(迴圈冗餘校正)是解決當前問題的最佳方案,並給Pincher博士遞交了一分關於錯誤校正機制的說明,如下:
CRC Generation
CRC的產生
The message to be transmitted is viewed as a long positive binary number. The first byte of the message is treated as the most significant byte of the binary number. The second byte is the next most significant, etc. This binary number will be called "m" (for message). Instead of transmitting "m" you will transmit a message, "m2", consisting of "m" followed by a two-byte CRC value.
傳輸的報文可以視為一個很長的位元。報文的1個位元組是整個位元的最高位,第2個位元組其次,等等。將此位元稱為“m”(message, 報文),在傳輸m時要附加兩個位元組的CRC編碼,稱為“m2”。
The CRC value is chosen so that "m2" when divided by a certain 16-bit value "g" leaves a remainder of 0. This makes it easy for the receiving program to determine whether the message has been corrupted by transmission errors. It simply divides any message received by "g". If the remainder of the division is zero, it is assumed that no error has occurred.
選擇的CRC碼應該能使m2能被一個16位的數“g”整除,這樣的話接收方軟體就可以非常容易的確定訊息在傳輸過程中是否出錯。只需要將收到的每一條訊息除以“g”,如果餘數為0就表示沒有錯誤發生。
You notice that most of the suggested values of "g" in the book are odd, but don't see any other similarities, so you select the value 34943 for "g" (the generator value).
你除了注意到書中建議的大多數“g”值都是奇數外沒發現其它的規律,因此你選擇“34943”作為“g”的值。
Input and Output
輸入與輸出
You are to devise an algorithm for calculating the CRC value corresponding to any message that might be sent. To test this algorithm you will write a program which reads lines (each line being all characters up to, but not including the end of line character) as input, and for each line calculates the CRC value for the message contained in the line, and writes the numeric value of the CRC bytes (in hexadecimal notation) on an output line. Each input line will contain no more than 1024 ASCII characters. The input is terminated by a line that contains a # in column 1. Note that each CRC printed should be in the range 0 to 34942 (decimal).
你要設計一種演算法,為要發送的所有 報文計算出CRC碼。為了測試你的演算法,你還要寫一個程式來讀取每行輸入的字串(每行都從開始一直到(但不包括)分行符號結束),並為每一行字串表示的訊息計算出CRC值,然後對應的輸出一行,列印出CRC位元組的數值。每行輸入都不會超過1024個ASCII字元。首字元為#號的一行表示輸入結束。注意每個CRC都應該在0到34942(十進位)的範圍內。
Sample Input
輸入樣本
this is a test
A
#
Sample Output
輸出樣本
77 FD
00 00
0C 86
Analysis
分析
實際在通迅中使用的CRC校正碼要比題目中描述的複雜得多,涉及的知識點也很廣,這裡僅介紹與題目相關的演算法,瞭解其它內容請參見“迴圈冗餘校正”(維基百科)。
本題最主要的演算法就是將字串看作一個很長的整數,然後執行除法取餘。簡單來講,就是要實現大數的除法。我們先來看10進位的例子,請用筆計算:2357 ÷ 6。我們只關心餘數,第一次在3上面試商3,下面得餘數5;用5 × 10 + 5 = 55作為第二次被除數,商9得餘數1;用1 × 10 + 7 = 17作第三次被除數,商2餘5。注意到每次被除數的每一位去除以除數6得到的餘數都要累計到下一位(乘以10再加下一位)。只要能保證被除數中每一位都大於除數,就可以避免出現借位的情況。將筆算除法推廣到n進位,即得到大數除法。設p為一個n進位整數作為被除數,q為小於n的除數(即選擇的校正碼產生數):
p = a0n0 + a1n1 + ... + aknk
現在要求的是p ÷ q的餘數rk,即r0 = p mod q,先從rk開始向前計算:
rk = aknk mod q
rk-1 = (rkn + ak-1nk-1) mod q
...
r0 = (r1n + a0n0) mod q
根據此遞推公式就很容易處理大數的除法了。為了方便實現,在程式中可以使用216=65535進位或232=4294967295進位。這樣在計算餘數時,無需將上一個r乘以進位,只要左移16位或32位即可,最大限度的利用了寄存器和CPU的特性。
求得餘數後,下一步就是要將餘數轉為校正碼。設校正碼為c,那麼應滿足:
(a0n0 + a1n1 + ... + aknk + c) mod q = 0
前面已求得:
(a0n0 + a1n1 + ... + aknk) mod q = r
顯然,c的值不止一個,但最小的正值c可用如下公式來計算:
c = q - (nr mod q)
上式很容易理解,就不贅述了。至此,CRC碼計算完成。這裡還有一個要注意的地方,我們一般寫程式和編譯的機器環境(包括OJ系統啟動並執行環境)都是x86架構的,也就意味著位元組序是little-endian, 即在儲存空間中的所有整型數值都是高位在前低位在後。比如32位16進位數:AF045Bh,在記憶體中的順序應該是:
5B 04 AF 00
如果我們直接將字串的指標轉為int型指標進行計算就會算錯。必須先對每一組位元組(整型變數大小的一組)先進行反轉再作運算。
Solution
解答
#include <algorithm>#include <iomanip>#include <iostream>#include <string>using namespace std;int main(void) {typedef unsigned short word;char Bits[1032]; //儲存輸入的字串cin.sync_with_stdio(false); //輸入的資料量很大,關閉同步以加速cout << setbase(16) << setiosflags(ios::uppercase) << setfill('0');//迴圈處理輸入的每一行字串for (string Line; getline(cin, Line) && Line[0] != '#'; cout << endl) {word nGen = 34943, nLen = Line.length(), *pBit = (word*)Bits;//將字串轉存到靜態數組中。x86的CPU位元組序為little-endian,reverse_copy(Line.begin(), Line.end(), Bits); //反轉為正位元組序*(word*)(&Bits[nLen]) = 0; //將結尾後面的一些位都清零nLen = nLen / 2 + (nLen % 2 != 0); //計算作為int數組的長度long nRem = 0;//nRem表示餘數//迴圈除所有的位,累加進位的餘數,產生CRC碼for (int i = nLen - 1; i >= 0; --i) {nRem = ((nRem << 16) + pBit[i]) % nGen;}if (nRem != 0) { //如果餘數不為0,則需構造CRC碼,演算法見文檔nRem = nGen - (nRem << 16) % nGen;} //下面按要求的格式輸出CRC碼unsigned char* pByte = (unsigned char*)&nRem;cout << setw(2) << (int)pByte[1] << ' ' << setw(2) << (int)pByte[0];}return 0;}