In the first two sections we discussed polar code encoding and the Gaussian-approximation construction method for the Gaussian channel; now we turn to a very important polar code construction algorithm. This algorithm has no established name, so we will call it the "Tal-Vardy algorithm", after its two inventors.
In "Polar Code Summary (2)", we briefly described how to construct polar codes over the BEC channel by directly computing the Bhattacharyya parameter Z(W), with computational complexity O(N).
In "Gaussian Approximation of Polar Codes", we discussed the Gaussian-approximation method for constructing polar codes over the common Gaussian channel, also with computational complexity O(N).
Now, once again, we extend the reach of polar codes to another common channel class: the binary memoryless symmetric (BMS) channel.
Because this topic is fairly long, I will introduce the algorithm over two sections. I will list the references at the beginning of the relevant content, and I recommend reading the original papers.
[1] Ido Tal, Alexander Vardy, "How to Construct Polar Codes".
Part 1. Brief Introduction
In this algorithm there are two core channel operations: one is called channel degrading, the other channel upgrading. The paper likens the relationship between the degraded channel, the upgraded channel, and the original channel to a "sandwich" structure.
Figure 1. The relationship between the three channels
The general idea of the paper is to obtain a degraded channel and an upgraded channel from the original channel via the degrading and upgrading operations. Analysis shows that the two channels are very close to each other, so we can use them to approximate the original channel, much like the mathematical "squeeze theorem".
The idea of the Tal-Vardy algorithm for constructing polar codes is to directly compute the error probability Pe(W) of each bit channel and then use this parameter to select the information bits. This way of selecting channels is clearly more universal. The Tal-Vardy algorithm is proposed for B-DMCs (binary discrete memoryless channels) and cannot be used directly on channels with continuous output, such as the Gaussian channel. Therefore, the authors also propose a method to extend the algorithm to continuous-output channels.
As we mentioned before, the difficulty in computing the bit-channel error probability Pe(W) is that the output alphabet of the bit channels grows exponentially with N, which is the key obstacle to overcome. To make this computation tractable, the authors use a "merging function" within the degrading (or upgrading) operation to reduce the output alphabet to a specified size.
The time complexity of constructing a polar code with this algorithm is linear in N.
To better understand the algorithm presented in the paper, we will follow its structure and discuss it in three parts: merging of the output alphabet, the channel operations, and how to handle continuous symmetric channels.
Part 2. Objects of Study
[2] Ingmar Land, "A Note on Symmetric Discrete Memoryless Channels".
In studying polar code construction problems, we encounter a variety of channels; here is a brief summary.
DMC (discrete memoryless channel)
A DMC has a discrete input alphabet X, a discrete output alphabet Y, and a transition probability function P(y|x). Its output depends only on the current input, hence it is a memoryless channel.
Assume the input alphabet has size M_X and the output alphabet has size M_Y. Without loss of generality, we assume: X = {0, 1, …, M_X − 1} and Y = {0, 1, …, M_Y − 1}.
Then the transition probabilities of this channel can be represented by an M_X × M_Y matrix whose (i, j) entry is P(y = j | x = i).
Notice that each row of the matrix sums to 1.
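As a quick sanity check, a DMC can be represented in code as a row-stochastic matrix. The sketch below is my own illustration (the matrix values and the helper name `is_valid_dmc` are not from the references):

```python
# A DMC with input alphabet size 2 and output alphabet size 3,
# represented as a transition matrix P[x][y] = P(y|x).
# The probability values are purely illustrative.
P = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

def is_valid_dmc(P, tol=1e-12):
    """Each row of a DMC transition matrix must sum to 1."""
    return all(abs(sum(row) - 1.0) < tol for row in P)

print(is_valid_dmc(P))  # True
```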
Strongly symmetric DMC
Before introducing this channel, let us first introduce a concept: the permutation.
If a vector v contains exactly the same elements as a vector μ, only arranged in a different order, we call v a permutation of μ.
e.g. μ = [1 2 3 4], v = [2 4 1 3]; then v is a permutation of μ.
Definition: if every row of a channel's transition probability matrix is a permutation of every other row, and every column is a permutation of every other column, we call the DMC described by this matrix a strongly symmetric DMC.
A very special example is the binary symmetric channel (BSC):
Figure 2. The binary symmetric channel
The input alphabet of the binary symmetric channel is {0,1} and the output alphabet is {0,1}. With crossover probability p, its transition probability matrix is:

| 1−p   p  |
|  p   1−p |
Symmetric DMC
Definition: if a transition probability matrix can be partitioned by columns into several sub-matrices such that each sub-matrix satisfies the definition of strong symmetry, we call the DMC described by this matrix a symmetric DMC.
A special example is the binary erasure channel (BEC):
Figure 3. The binary erasure channel
Its input alphabet is {0,1} and its output alphabet is {0, e, 1}, where e denotes the erasure symbol. With erasure probability ε, the transition probability matrix of the BEC (columns ordered 0, e, 1) is:

| 1−ε   ε    0  |
|  0    ε   1−ε |
Obviously, it can be split by columns into two sub-matrices: the columns for outputs {0, 1},

| 1−ε   0  |
|  0   1−ε |

and the column for output e,

| ε |
| ε |
Both sub-matrices conform to the definition of a strongly symmetric channel, so the BEC is a symmetric DMC.
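The column-splitting argument above can be checked numerically. This is a minimal sketch with an illustrative erasure probability; the helper `is_strongly_symmetric` is my own name, not from the references:

```python
# BEC with erasure probability eps; transition matrix P[x][y]
# over outputs ordered as (0, e, 1).  eps = 0.25 is illustrative.
eps = 0.25
P = [
    [1 - eps, eps, 0.0],
    [0.0,     eps, 1 - eps],
]

def is_strongly_symmetric(M):
    """Every row is a permutation of every other row,
    and every column is a permutation of every other column."""
    rows = [sorted(r) for r in M]
    cols = [sorted(c) for c in zip(*M)]
    return all(r == rows[0] for r in rows) and all(c == cols[0] for c in cols)

# Split by columns: the non-erasure outputs {0, 1}, and the erasure column.
sub1 = [[row[0], row[2]] for row in P]  # [[1-eps, 0], [0, 1-eps]]
sub2 = [[row[1]] for row in P]          # [[eps], [eps]]

print(is_strongly_symmetric(sub1), is_strongly_symmetric(sub2))  # True True
```

Note that the full matrix P itself is not strongly symmetric (its columns are not permutations of each other); only after the column split does each piece satisfy the strong-symmetry definition, which is exactly what makes the BEC a symmetric DMC.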
Another special case is the AWGN channel with BPSK modulation, whose input alphabet is {−1, 1}. First, the continuous output can be quantized using intervals placed symmetrically about y = 0 (i.e., the continuous output is approximated by a discrete one); each resulting sub-channel is a BSC, so by the definition above the quantized channel is symmetric. Second, the quantization intervals can be made arbitrarily small; each sub-channel is still a BSC, but the number of sub-channels tends to infinity.
Weakly symmetric DMC
Definition: if every row of a transition probability matrix is a permutation of every other row, and all of its column sums are equal, we call the DMC described by this matrix a weakly symmetric DMC.
e.g. consider a weakly symmetric DMC whose input alphabet is {0,1} (note that this is misprinted as {0,1,2} in [2]) and whose output alphabet is {0,1,2}. Its transition probability matrix can be taken, for example, as

| 1/3   1/6   1/2 |
| 1/3   1/2   1/6 |

where the rows are permutations of each other and every column sums to 2/3.
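Both conditions of the weak-symmetry definition can be checked mechanically. The matrix below is a standard textbook example of a weakly symmetric channel, not necessarily the one used in [2], and the helper name is my own:

```python
# Candidate weakly symmetric DMC: rows must be permutations of each other,
# and all column sums must be equal.  (Textbook-style example values.)
P = [
    [1/3, 1/6, 1/2],
    [1/3, 1/2, 1/6],
]

def is_weakly_symmetric(P, tol=1e-12):
    rows = [sorted(r) for r in P]
    rows_ok = all(r == rows[0] for r in rows)   # rows are permutations
    col_sums = [sum(c) for c in zip(*P)]
    cols_ok = all(abs(s - col_sums[0]) < tol for s in col_sums)  # equal sums
    return rows_ok and cols_ok

print(is_weakly_symmetric(P))  # True
```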
If the input alphabet of a symmetric memoryless channel is {0,1}, we say the channel has binary input. Such a "binary-input memoryless symmetric channel" (BMS) is the object studied by the algorithm in this paper. Let us take a brief look at its properties.
The properties of symmetric binary-input discrete memoryless channels are described in detail in Section VI-A of the Arikan paper ("Channel Polarization: A Method for Constructing Capacity-Achieving Codes for Symmetric Binary-Input Memoryless Channels"), and also in Section II of [1].
For a memoryless channel W with binary input that is symmetric, we have W: X → Y, where X = {0,1} is the input alphabet and Y is an arbitrary output alphabet. By definition, there exists a permutation π on Y satisfying:
i) π⁻¹ = π (π is its own inverse);
ii) W(y|1) = W(π(y)|0), for all y ∈ Y.
For convenience, we write π(y) = ȳ and call y and ȳ a conjugate pair. We assume the output alphabet Y is finite (this assumption is removed when the algorithm is extended to channels with continuous output alphabets).
Section VI-A of the Arikan paper gives the following proposition:
Proposition 13:
If a B-DMC W is symmetric, then W⁻ and W⁺ are also symmetric, and:
W(y|x) = W(x·y | 0),
where the operation "·" is a shorthand: when x = 0, x·y = y; when x = 1, x·y = ȳ. As defined above, y and ȳ form a conjugate pair.
This is a very important conclusion; we will use this identity several times in the channel operations below to simplify calculations, so please keep it in mind.
The proof of Proposition 13 is given in the Arikan paper.
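The identity W(y|x) = W(x·y|0) and the "·" shorthand can be verified on a small example. Here is a minimal sketch using a BSC, where the conjugate of an output is simply its bit-flip (the value of p is illustrative):

```python
# BSC as a dict: W[(y, x)] = W(y | x).  For the BSC, the conjugate
# permutation is the bit flip: 0̄ = 1 and 1̄ = 0.
p = 0.1  # crossover probability (illustrative)
W = {(0, 0): 1 - p, (1, 0): p, (0, 1): p, (1, 1): 1 - p}

def conj(y):
    return 1 - y  # conjugate of an output symbol

def dot(x, y):
    """Arikan's shorthand: x·y = y when x = 0, and x·y = ȳ when x = 1."""
    return y if x == 0 else conj(y)

# Check the symmetry identity W(y|x) = W(x·y | 0) for all x, y.
ok = all(abs(W[(y, x)] - W[(dot(x, y), 0)]) < 1e-12
         for x in (0, 1) for y in (0, 1))
print(ok)  # True
```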
Part 3. Merging Functions
In logical order, let us look at the merging function first. Before doing so, however, we must familiarize ourselves with the degraded channel and the upgraded channel, which are needed to introduce the merging function.
Degraded channel
For an original channel W: X → Y and a channel Q: X → Z, if there exists an intermediate channel P: Y → Z such that for all x and z:
Q(z|x) = Σ_{y∈Y} W(y|x) P(z|y),
then we write Q ≼ W and say that Q is degraded with respect to W.
Upgraded channel
The description of the upgraded channel is similar to that of the degraded channel; in fact it is obtained simply by swapping the roles of W and Q in the formula above: for a channel Q′: X → Z, if there exists an intermediate channel P: Z → Y such that for all x and y:
W(y|x) = Σ_{z∈Z} Q′(z|x) P(y|z),
then we write Q′ ≽ W and say that Q′ is upgraded with respect to W.
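The degradation relation is just a matrix product of the original channel with the intermediate channel. As a sketch (my own example, not from [1]): a BEC can be degraded to a BSC by an intermediate channel that maps the erasure symbol to 0 or 1 with probability 1/2 each:

```python
eps = 0.25  # BEC erasure probability (illustrative)

# Original channel W: X -> Y, outputs ordered (0, e, 1); W[x][y] = W(y|x).
W = [
    [1 - eps, eps, 0.0],
    [0.0,     eps, 1 - eps],
]

# Intermediate channel P: Y -> Z with Z = {0, 1}; the erasure symbol
# is resolved to 0 or 1 with probability 1/2 each.  P[y][z] = P(z|y).
P = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
]

# Q(z|x) = sum_y W(y|x) P(z|y), i.e. the matrix product W @ P.
Q = [[sum(W[x][y] * P[y][z] for y in range(3)) for z in range(2)]
     for x in range(2)]

print(Q)  # [[0.875, 0.125], [0.125, 0.875]] -- a BSC(eps/2), degraded from the BEC
```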
Our understanding of the merging function begins with a lemma:
Lemma 7:
Let W: X → Y be a BMS channel, and let y1, y2 be symbols in the output alphabet Y, with conjugates ȳ1, ȳ2. For a channel Q: X → Z, define its output alphabet Z as:
Z = (Y \ {y1, ȳ1, y2, ȳ2}) ∪ {z12, z̄12}.
Then, for all x, define:
Q(z12|x) = W(y1|x) + W(y2|x),
Q(z̄12|x) = W(ȳ1|x) + W(ȳ2|x),
Q(z|x) = W(z|x) for every other z ∈ Z.
Then Q ≼ W.
In Lemma 7, the "\" in the definition of Z denotes set difference ("not included").
As we can see, this lemma takes an original channel W and produces a degraded channel Q. In going from W to Q, the output alphabet of the channel changes: Q's alphabet is smaller than W's by 2. From this point of view, Lemma 7 gives us a degraded BMS channel and a smaller alphabet at the same time.
The proof of Lemma 7 is not difficult. We only need a clever definition of the intermediate channel P:
For the intermediate channel P: Y → Z, the mapping from Y to Z is as follows: y1 and y2 map to z12 with probability 1, ȳ1 and ȳ2 map to z̄12 with probability 1, and every remaining symbol maps to itself.
Obviously, such an intermediate channel exists, and by the preceding definition Q is a degraded version of W.
This completes the proof.
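The merge in Lemma 7 can be sketched directly in code. Here a BMS channel is represented as a dict mapping each output symbol to the pair (W(y|0), W(y|1)); the symbol names, probability values, and the function name `degrading_merge` are my own illustration of the lemma, not the paper's implementation:

```python
# A BMS channel as {symbol: (W(y|0), W(y|1))}.  Conjugate symbols carry
# the two probabilities swapped.  All values are illustrative.
W = {
    'y1': (0.4, 0.1), 'y1~': (0.1, 0.4),
    'y2': (0.3, 0.2), 'y2~': (0.2, 0.3),
}
conj = {'y1': 'y1~', 'y1~': 'y1', 'y2': 'y2~', 'y2~': 'y2'}

def degrading_merge(W, y1, y2, conj):
    """Merge y1, y2 into z12 (and their conjugates into z12~), per Lemma 7."""
    merged = {y1, y2, conj[y1], conj[y2]}
    Q = {y: p for y, p in W.items() if y not in merged}
    Q['z12'] = (W[y1][0] + W[y2][0], W[y1][1] + W[y2][1])
    Q['z12~'] = (W[conj[y1]][0] + W[conj[y2]][0],
                 W[conj[y1]][1] + W[conj[y2]][1])
    return Q

Q = degrading_merge(W, 'y1', 'y2', conj)
# Alphabet shrank from 4 symbols to 2 (a net reduction of 2).
print({k: (round(a, 10), round(b, 10)) for k, (a, b) in Q.items()})
```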
The merging function is a powerful tool against the explosive growth of the bit-channel output alphabet caused by Arikan's channel-combining recursion. By Lemma 7, for an original channel W, each merge of a pair of symbols (together with their conjugates) reduces the output alphabet size of the channel by 2. By invoking this operation repeatedly, we can reduce the output alphabet of W to any target size μ. In [1], μ also plays the role of a "fidelity" parameter: in general, the larger μ is, the fewer merging-function calls are made, the larger the resulting output alphabet, the better the approximation, and the higher the computational complexity of the polar code construction.
Now we have the powerful tool of merging functions, but to apply it one more question must be answered: in each merge, which two symbols should we merge? Can they be chosen arbitrarily from the output alphabet, or must certain principles be followed?
Theorem 8 in [1] settles this.
Theorem 8:
For a BMS channel W: X → Y whose output alphabet Y has M elements, assume:
1 ≤ LR(y1) ≤ LR(y2) ≤ … ≤ LR(yM).
For any two symbols a, b in Y, let I(a, b) denote the capacity of the channel obtained by merging a and b. Then, for 1 ≤ i < j < k ≤ M:
I(yi, yk) ≤ max{ I(yi, yj), I(yj, yk) }.
In Theorem 8, LR(y) denotes the likelihood ratio of the symbol y. By Theorem 8, the capacity of the channel obtained by merging two non-adjacent symbols never exceeds the better of the two merges involving the symbol between them; consequently, a capacity-optimal merge can always be found among adjacent symbols. This guides us to select adjacent symbols in every merge. The proof of Theorem 8 is given in the appendix of [1].
To do this, we sort the output alphabet Y of the channel W by likelihood ratio. Assume the output alphabet Y has size 2L and contains L conjugate pairs.
Note that the likelihood-ratio ordering in Theorem 8 carries the implicit condition LR ≥ 1.
The likelihood ratio is defined as: LR(y) = W(y|0) / W(y|1).
By the definition of the symmetric channel, W(y|1) = W(ȳ|0), and so for the conjugate ȳ we obtain:
LR(ȳ) = W(ȳ|0) / W(ȳ|1) = W(y|1) / W(y|0) = 1 / LR(y).
Therefore, in every conjugate pair (yi, ȳi), 1 ≤ i ≤ L, one of the two symbols must have likelihood ratio greater than or equal to 1. We pick that symbol as the representative of the pair, sort the representatives by likelihood ratio, and finally obtain:
1 ≤ LR(y1) ≤ LR(y2) ≤ … ≤ LR(yL).
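The representative-selection and sorting step can be sketched as follows. The symbol names and probabilities are illustrative, chosen so that each input distribution sums to 1; in the pair (c, c~) it is the conjugate c~ that has LR ≥ 1 and therefore becomes the representative:

```python
# Channel as {symbol: (W(y|0), W(y|1))}; conjugates swap the two entries.
W = {
    'a': (0.30, 0.05), 'a~': (0.05, 0.30),
    'b': (0.15, 0.10), 'b~': (0.10, 0.15),
    'c': (0.05, 0.35), 'c~': (0.35, 0.05),
}
conj = {'a': 'a~', 'a~': 'a', 'b': 'b~', 'b~': 'b', 'c': 'c~', 'c~': 'c'}

def LR(y):
    w0, w1 = W[y]
    return w0 / w1  # assumes w1 > 0, for simplicity of the sketch

# From each conjugate pair, keep the symbol with LR >= 1 as representative,
# then sort the representatives by likelihood ratio.
reps = [y if LR(y) >= 1 else conj[y] for y in ('a', 'b', 'c')]
reps.sort(key=LR)
print(reps)  # ['b', 'a', 'c~'], with LRs of roughly 1.5, 6.0, 7.0
```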
When we merge two adjacent symbols yi and yi+1 into z (and their conjugates into z̄), we have:
Q(z|x) = W(yi|x) + W(yi+1|x),
Q(z̄|x) = W(ȳi|x) + W(ȳi+1|x).
In addition, to minimize the capacity lost during merging, we prefer the pair of adjacent elements whose merge changes the channel capacity the least. Therefore, after sorting the output alphabet by likelihood ratio, the step before merging is to search for the adjacent pair with the lowest capacity loss. We define the loss as the difference in channel capacity before and after merging, and use it as the criterion for selecting which adjacent pair to merge.
Following the symbol-naming convention in [1], let a = W(yi|0), b = W(yi|1), a′ = W(yi+1|0), b′ = W(yi+1|1), and define:
ΔI = C(a, b) + C(a′, b′) − C(a + a′, b + b′),
where
C(a, b) = −(a + b) log2((a + b)/2) + a log2(a) + b log2(b).
Thus, before each merge, we can find the adjacent pair with the lowest loss by computing the capacity loss for every pair of adjacent elements.
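The loss computation above can be sketched in a few lines. This is my own minimal illustration of the ΔI criterion (the representative values are illustrative, and the C(a, b) term here is used only for comparing losses, so a constant factor does not matter); intuitively, the adjacent pair with the closest likelihood ratios should lose the least capacity:

```python
from math import log2

def xlog2(t):
    # convention: 0 * log2(0) = 0
    return 0.0 if t == 0.0 else t * log2(t)

def C(a, b):
    """Per-symbol term -(a+b)log2((a+b)/2) + a log2(a) + b log2(b),
    proportional to the symbol's capacity contribution under uniform inputs."""
    return -xlog2(a + b) + (a + b) + xlog2(a) + xlog2(b)

def delta_I(a, b, a2, b2):
    """Capacity lost by merging symbol (a, b) with symbol (a2, b2)."""
    return C(a, b) + C(a2, b2) - C(a + a2, b + b2)

# Representatives (W(y|0), W(y|1)) already sorted by LR >= 1 (illustrative):
# LRs are 1.5, 6.0, 7.0.
reps = [(0.15, 0.10), (0.30, 0.05), (0.35, 0.05)]

# Find the adjacent pair whose merge loses the least capacity.
losses = [delta_I(*reps[i], *reps[i + 1]) for i in range(len(reps) - 1)]
best = min(range(len(losses)), key=losses.__getitem__)
print(best)  # 1 -- merging the two symbols with the closest LRs loses the least
```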
Section V of [1] covers this material in great detail, including data-structure concepts relevant to the merging function such as heaps, linked lists, and pointers, and briefly describes the algorithmic implementation of the merging function; the exposition is very clear and can serve as a programming reference.
The introduction to the merging function above applies only to the channel degrading operation. Merging can also be performed in the channel upgrading operation, which is somewhat more complicated; I will not attempt to explain it here, and leave it for the reader to explore.
In the next section, we will focus on channel degrading and channel upgrading, and if space permits, explore the application of the Tal-Vardy algorithm to the binary-input Gaussian channel.
Tal-Vardy Algorithm for Polar Codes (1)