Quantization, data types, overflow, and underflow

Source: Internet
Author: User

While writing an iterative algorithm some time ago, I found that it produced wrong results in some cases. During debugging it turned out that some values that are theoretically greater than 0 became 0 during the iteration, and the computation eventually divided by 0 and failed. The original purpose of this article is to clarify why a number that is theoretically greater than 0 can become 0 in an actual computation (underflow); along the way I also cover data type conversion and computational precision, which many people have discussed. Some earlier blog posts on these topics are somewhat limited (restricted to decimals, for example), and the topic is admittedly hard to explain well, so I start from quantization in digital signal processing and try to give a more intuitive understanding. The article may contain mistakes; corrections are welcome.

1. Quantization

Quantization in digital signal processing refers to the process of mapping an input signal from a large set to a smaller set. In a simple, narrow sense it can be understood as mapping a continuous quantity to a discrete set. As shown, the red curve is the input signal, and the blue curve is the result of 3-bit quantization.

(by Hyacinth, own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=30716342)

This naturally leads to three questions.

    1. Why quantize the input signal
    2. How quantization affects the signal itself
    3. How to quantize the input signal

Why quantize the input signal

A received signal, such as the electromagnetic wave in a communication system, is generally regarded as analog (continuous in time and continuous in value). To store and process such a signal, it has to be sampled and quantized into a digital quantity. Quantization is therefore needed for two reasons: storage and computation. A signal that has not been quantized cannot be stored in memory or computed on.

How quantization affects the signal itself

In the quantization process, the set of input values is often uncountable (or at least has infinitely many elements), whereas the set of quantized output values is finite. This means that quantization is an irreversible process, which naturally affects the signal. As shown, quantization produces quantization noise (the error, i.e. the difference between the signal before and after quantization), which distorts the quantized signal. Without additional prior knowledge, this distortion cannot be undone.

(by Gregory Maxwell, Http://wiki.xiph.org/File:Dsat_011.png, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=26171868)

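How to quantize the input signal (note: only scalar quantization is discussed here)

In the two images above, quantization divides the input range into several regions and assigns the same (binary) number to every signal value that falls into the same region. The regions all have the same width. With a step size $\Delta$, and taking the midpoint of each region as its representative value (one common choice), this relationship can be expressed as $Q(x) = \Delta \left( \lfloor x / \Delta \rfloor + \tfrac{1}{2} \right)$.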
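The region-based mapping described above can be sketched in a few lines of Java. This is a minimal illustration under my own assumptions, not the article's code: the class name `UniformQuantizer`, the choice of region midpoints as output values, and the clamping of out-of-range inputs are all hypothetical.

```java
// Sketch of a uniform scalar quantizer; all names here are illustrative.
final class UniformQuantizer {
    private final double min;   // lower edge of the quantization range
    private final double step;  // width of each region
    private final int levels;   // number of output codes, 2^bits

    UniformQuantizer(double min, double max, int bits) {
        this.levels = 1 << bits;
        this.min = min;
        this.step = (max - min) / levels;
    }

    /** Index of the region that x falls into, clamped to [0, levels - 1]. */
    int encode(double x) {
        int index = (int) Math.floor((x - min) / step);
        return Math.max(0, Math.min(levels - 1, index));
    }

    /** Representative value (region midpoint) for a code. */
    double decode(int code) {
        return min + (code + 0.5) * step;
    }

    public static void main(String[] args) {
        UniformQuantizer q = new UniformQuantizer(-1.0, 1.0, 3); // 3-bit, 8 regions
        int code = q.encode(0.3);
        System.out.println("code = " + code + ", value = " + q.decode(code));
    }
}
```

With 3 bits over [-1, 1] each region is 0.25 wide, so any in-range input is reproduced to within half a region (0.125).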

With equal-width regions, the bound on the quantization error is the same for every input signal; this is called uniform quantization. In many cases, however, we care more about the relative quantization error (quantization error / signal value): a small signal should have a small quantization error, while a large signal can tolerate a relatively large one. In that case a different quantization method can be used, as shown (note: this example is one I made up; there are standards that define corresponding non-uniform quantizers).

Non-uniform quantization can obtain a higher SNR in such cases, and the two quantization methods have different applications. There are other quantization methods as well, which are not described here.
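To make the difference concrete, here is a hedged sketch that compares a uniform quantizer with a non-uniform one. The non-uniform scheme used is mu-law companding (compress, quantize uniformly, expand), which is my choice of illustration and not necessarily the scheme the article's figure showed. The class name and the parameter value MU = 255 are assumptions.

```java
// Sketch only: mu-law companding stands in for "a non-uniform quantizer".
final class MuLawQuantizer {
    private static final double MU = 255.0; // companding parameter (assumed value)

    /** Compresses x in [-1, 1] so that small amplitudes get more of the code range. */
    static double compress(double x) {
        return Math.signum(x) * Math.log1p(MU * Math.abs(x)) / Math.log1p(MU);
    }

    /** Inverse of compress. */
    static double expand(double y) {
        return Math.signum(y) * Math.expm1(Math.abs(y) * Math.log1p(MU)) / MU;
    }

    /** Uniform quantization of v in [-1, 1] to 2^bits levels, returning the region midpoint. */
    static double uniform(double v, int bits) {
        int levels = 1 << bits;
        double step = 2.0 / levels;
        int index = (int) Math.floor((v + 1.0) / step);
        index = Math.max(0, Math.min(levels - 1, index));
        return -1.0 + (index + 0.5) * step;
    }

    /** Non-uniform quantization: compress, quantize uniformly, expand. */
    static double nonUniform(double x, int bits) {
        return expand(uniform(compress(x), bits));
    }

    public static void main(String[] args) {
        double x = 0.01; // a small signal value
        System.out.println("uniform error     = " + Math.abs(uniform(x, 8) - x));
        System.out.println("non-uniform error = " + Math.abs(nonUniform(x, 8) - x));
    }
}
```

For a small input such as 0.01 the companded quantizer has the smaller error at the same bit width; the price is a somewhat larger absolute error for inputs near full scale.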

2. Data types

Although the previous section discusses quantization in digital signal processing, quantization, or something very similar, actually occurs in many places. For example, the storage space of a computer is limited, and a 32-bit storage location can only represent $2^{32}$ different values. In many cases, however, we expect operations to be carried out over the real numbers, and, just as in digital signal processing, the computer can only store and compute with quantized values.

2.1 Data types and quantization

Storing data involves data types; two are discussed here: integer (int) and floating point (float). Following the idea of quantization, we regard the computer as quantizing an input from the real numbers into an integer or float representation.

Integer (consider a 32-bit signed integer)

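Let the input be a real value $x$. The quantization range is $[-2^{31}, 2^{31} - 1]$ and the quantization is uniform with step 1. If $x_q$ denotes the quantization result, then $|x - x_q| < 1$ for any $x$ inside the range.

For example, a Java (int) cast truncates toward zero, so $x = 10.375$ gives $x_q = 10$. For different $x$, the bound on the quantization error is a constant.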

Floating point (32-bit floating point)

Floating point is slightly more complex than the integer case. Following Wikipedia, a 32-bit float is stored as 1 sign bit $s$ (bit 31), 8 exponent bits $e$ (bits 30 to 23), and 23 fraction bits $b_{22} \dots b_0$ (bits 22 to 0).

For a normal number, the corresponding value (in decimal) can be represented as

$v = (-1)^{s} \times 2^{e - 127} \times \left( 1 + \sum_{i=1}^{23} b_{23-i} \, 2^{-i} \right)$

The smallest normal floats greater than 0 are, in order, $2^{-126}, 2^{-126}(1 + 2^{-23}), 2^{-126}(1 + 2^{-22}), \dots$, whereas the floats greater than 1 are, in order, $1 + 2^{-23}, 1 + 2^{-22}, \dots$. That is, the quantization interval is different ($2^{-149}$ in the first case, $2^{-23}$ in the second). In fact, for a normal $x$ the relationship between quantization precision and magnitude can be expressed as

$\Delta(x) = 2^{\lfloor \log_2 |x| \rfloor - 23}$

Storing a real number in floating-point representation can therefore be regarded as a non-uniform quantization process.

Note 1: Strictly speaking, what this section calls quantization consists of two processes, quantization and encoding: the value is quantized, and the result is then stored using the corresponding encoding.

Note 2: The actual definition of the data types is more complex than described here, since it also has to handle 0, infinity, and NaN (not a number).

Note 3: The definition of float can vary between operating environments.
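The changing quantization interval can be observed directly. The following is a small sketch, assuming a standard Java runtime with IEEE 754 binary32 floats; the class and method names are mine. It uses `Math.nextUp` to measure the distance from a float to its next larger neighbor, which should agree with `Math.ulp`.

```java
// Sketch: measure the local spacing of float values at several magnitudes.
final class FloatSpacing {
    /** Distance from x to the next larger float, i.e. the local quantization interval. */
    static float spacing(float x) {
        return Math.nextUp(x) - x;
    }

    public static void main(String[] args) {
        float[] samples = { Float.MIN_NORMAL, 1.0f, 1024.0f, 2.0e8f };
        for (float x : samples) {
            System.out.println(x + " -> spacing " + spacing(x) + ", ulp " + Math.ulp(x));
        }
    }
}
```

The spacing doubles every time the magnitude crosses a power of two, which is the non-uniform quantization the text describes.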

2.2 Error-prone formulas

Code like the following has been discussed many times on Blog Park:

    float a = (float) 10.375;
    float b = (float) 2.263;
    System.out.println(a + b);

On my machine this code prints 12.6380005. There is a similar discussion about (int) 10.375 + (int) 2.263 = 12. Whether the type is integer or floating point, the cause is the same: we pick two arbitrary numbers from the real numbers (or the rationals), but the computer stores their quantized values. We habitually think of float as powerful enough for anything, but its ability to represent numbers is still limited.

The first section discussed how a signal changes under quantization, which produces quantization noise. In the same way, storing the numbers we write down, such as 10.375 and 2.263, as floats can introduce noise (10.375 happens to be exactly representable in binary, but 2.263 is not). The difference between the computed result and the theoretical value is the direct manifestation of this noise.

So the calculation only appears to be wrong; the computer has computed correctly by its own rules on the quantized values.
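One way to see the stored values behind such a calculation is to print them exactly. The sketch below is an illustration of mine, not part of the original discussion: it relies on the fact that widening a float to double is exact and that the `BigDecimal(double)` constructor gives the exact decimal expansion of the binary value. The class name `StoredValue` is hypothetical.

```java
import java.math.BigDecimal;

// Sketch: show the exact values that the float variables actually hold.
final class StoredValue {
    /** Exact decimal expansion of the value a float holds. */
    static BigDecimal exact(float x) {
        // float -> double widening is exact, and BigDecimal(double) is exact too.
        return new BigDecimal((double) x);
    }

    public static void main(String[] args) {
        float a = 10.375f;
        float b = 2.263f;
        System.out.println("a     = " + exact(a));
        System.out.println("b     = " + exact(b));
        System.out.println("a + b = " + exact(a + b));
    }
}
```

The first operand is stored without error, the second is not, and the sum is rounded once more to the nearest float, so the printed result differs from 12.638.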

2.3 Conversion of data types

In conversions such as float→double or int→long, either the quantization range grows or the quantization precision increases, so the conversion itself causes no problems. A round trip such as int→long→int or float→double→float therefore introduces no further noise.

However, this is not true of every conversion, for example int→float→int:

    int a = 200000002;
    float b = (float) a;
    int c = (int) b;
    System.out.println(c);

The output is 200000000. Floats in the range $[2^{27}, 2^{28})$ are spaced 16 apart, so the conversion process can be expressed as 200000002 (int) → 200000000 (float) → 200000000 (int).
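A quick way to check whether a particular int survives such a round trip is to perform it and compare. This is a minimal sketch; the class and method names are illustrative.

```java
// Sketch: test whether an int value is preserved by a round trip through float or double.
final class RoundTrip {
    /** True if converting n to float and back returns n unchanged. */
    static boolean survivesFloat(int n) {
        return (int) (float) n == n;
    }

    /** True if converting n to double and back returns n unchanged. */
    static boolean survivesDouble(int n) {
        return (int) (double) n == n;
    }

    public static void main(String[] args) {
        System.out.println(survivesFloat(200000002));  // lost: not a multiple of 16
        System.out.println(survivesFloat(200000000));  // kept: a multiple of 16
        System.out.println(survivesDouble(200000002)); // kept: double has a 53-bit significand
    }
}
```

A float has a 24-bit significand, so every int up to 2^24 in magnitude survives, while larger values survive only when they land on a representable float. A double can hold every 32-bit int exactly.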

3. Overflow and underflow

Overflow (arithmetic overflow) occurs when the result of an operation exceeds the range that a register or storage location can store or represent. From the quantization point of view, the value has gone beyond the quantization range. Overflow is generally easy to detect, but it is sometimes overlooked. For example, for LeetCode's first problem, code with exactly this problem passes the tests:

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

    public int[] twoSum(int[] nums, int target) {
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                if (nums[j] == target - nums[i]) {
                    return new int[] { i, j };
                }
            }
        }
        throw new IllegalArgumentException("No two sum solution");
    }

The code above is the provided solution. Because nums and target are both of type int, target - nums[i] may overflow and produce an incorrect result.
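One possible way to avoid the wraparound, sketched here under my own assumptions rather than taken from the provided solution, is to compute the sum in a wider type. The class name `TwoSumSafe` and the helper `wrappedMatch` are hypothetical; the helper only reproduces the original comparison to show where it goes wrong.

```java
// Sketch: two-sum with the comparison done in 64-bit arithmetic.
final class TwoSumSafe {
    static int[] twoSum(int[] nums, int target) {
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                // Widen to long so the sum is computed without wraparound.
                if ((long) nums[i] + nums[j] == target) {
                    return new int[] { i, j };
                }
            }
        }
        throw new IllegalArgumentException("No two sum solution");
    }

    /** The original 32-bit comparison, kept only to show the wraparound. */
    static boolean wrappedMatch(int a, int b, int target) {
        return b == target - a;
    }

    public static void main(String[] args) {
        // 1 + Integer.MAX_VALUE is 2^31 mathematically, not Integer.MIN_VALUE.
        System.out.println(wrappedMatch(1, Integer.MAX_VALUE, Integer.MIN_VALUE));
    }
}
```

With nums = {1, Integer.MAX_VALUE} and target = Integer.MIN_VALUE, the 32-bit comparison reports a match because target - 1 wraps around, while the widened version finds no pair. `Math.subtractExact` is another option if an exception on overflow is preferred.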

By comparison, underflow (arithmetic underflow) is much better hidden: it is hard to detect and awkward to handle. Underflow here does not mean that a value is smaller than the minimum representable value, such as -129, which is outside the int8 range; that case belongs to overflow, i.e. "the result of an operation exceeds the range that a register or storage location can store or represent."

Floating point is designed so that the larger the value, the lower the quantization precision. There is one exception, however: the neighborhood of 0. For 32-bit floats, the normal number nearest to 0 is $2^{-126} \approx 1.18 \times 10^{-38}$ (this differs between standards), yet the next nearest normal number, $2^{-126}(1 + 2^{-23})$, is only $2^{-149}$ further away. In other words, the quantization precision right around 0 is comparatively low. Low precision alone does not cause many problems, but once a non-zero value is small enough to be stored as 0, it can cause a whole series of problems. I wanted to give an example, but ran into something odd in Java that I did not understand.

    float a = (float) (1.5 * java.lang.Math.pow(10, -45));
    float b = Float.MIN_NORMAL;
    System.out.println(b > a);
    System.out.println(a);
    System.out.println(b);

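Here I define a and b. b is the smallest normal float, yet a is clearly smaller than b and at the same time greater than 0, and I did not know how to account for this; I had not found the relevant information at the time of writing. (This is the behavior of IEEE 754 subnormal numbers, which fill the gap between 0 and the smallest normal number; the smallest positive float in Java is Float.MIN_VALUE, $2^{-149} \approx 1.4 \times 10^{-45}$.)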
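To look into how such a tiny value is stored, one can inspect its bit pattern. The sketch below is my own illustration, assuming IEEE 754 binary32 as used by Java's float; the class and method names are hypothetical. An exponent field of 0 with a non-zero fraction marks a subnormal number.

```java
// Sketch: classify a float as subnormal and show its exponent field.
final class SubnormalCheck {
    /** True if x is non-zero but smaller in magnitude than the smallest normal float. */
    static boolean isSubnormal(float x) {
        return x != 0.0f && Math.abs(x) < Float.MIN_NORMAL;
    }

    /** The 8-bit biased exponent field of x; it is 0 for zero and for subnormals. */
    static int exponentField(float x) {
        return (Float.floatToIntBits(x) >>> 23) & 0xFF;
    }

    public static void main(String[] args) {
        float a = (float) (1.5 * Math.pow(10, -45));
        System.out.println("a == Float.MIN_VALUE : " + (a == Float.MIN_VALUE));
        System.out.println("a is subnormal       : " + isSubnormal(a));
        System.out.println("exponent field of a  : " + exponentField(a));
        System.out.println("exponent field of b  : " + exponentField(Float.MIN_NORMAL));
    }
}
```

In this sketch 1.5e-45 is rounded to the nearest representable float, which is the smallest subnormal, so a is positive yet below Float.MIN_NORMAL.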

In any case, once a number is small enough it underflows to 0, and in an iterative algorithm this is quite likely to happen. If that number is unfortunately then used as a divisor, the result is a division by 0, which causes an error.
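The following sketch reproduces this kind of failure in a few lines. It is an illustration under my own assumptions, not the algorithm from the introduction: a value is repeatedly scaled by 0.1 until it underflows. Tracking the logarithm of the value instead is one common workaround and is shown only as an option; all names are hypothetical.

```java
// Sketch: a positive quantity that underflows to 0 during iteration.
final class UnderflowDemo {
    /** Multiplies x by factor `steps` times in float arithmetic. */
    static float repeatedScale(float x, float factor, int steps) {
        for (int i = 0; i < steps; i++) {
            x *= factor;
        }
        return x;
    }

    /** Natural log of the same product, which stays finite where the product underflows. */
    static double repeatedScaleLog(double x, double factor, int steps) {
        return Math.log(x) + steps * Math.log(factor);
    }

    public static void main(String[] args) {
        float tiny = repeatedScale(1.0f, 0.1f, 50); // mathematically 1e-50, which is > 0
        System.out.println("product       = " + tiny);
        System.out.println("1 / product   = " + (1.0f / tiny));
        System.out.println("log(product)  = " + repeatedScaleLog(1.0, 0.1, 50));
    }
}
```

Note that in Java a floating-point division by zero does not throw; it yields Infinity or NaN, which then propagates through later steps. Integer division by zero, by contrast, throws ArithmeticException.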

4. Other

After so much discussion of quantization noise, a question worth asking is how far the results of a computation can be trusted. That is:

Can a computer guarantee that its results are absolutely accurate?

Under certain conditions it can. Looking back at the first and second sections, the discussion there rests on these premises:

    • The set of input signals is larger than the set of quantized output signals (e.g. the input signal is analog and the output is digital), so the quantization process introduces quantization error.
    • If the desired operation is defined over the real numbers, then storing data in a given data type can be regarded as a process of quantization and encoding.

In digital signal processing the received signal is analog and has to be sampled and quantized by an ADC, so quantization noise is unavoidable. Data in a computer, however, may mean something different. For example, let a variable a represent the number of pages in a book. The number of pages must be a non-negative integer and is in practice bounded. If a is stored as an integer, it represents the number of pages exactly, without introducing any error.

It follows that, for a given application, some prior knowledge may let us design suitable data types and structures, together with the corresponding calculation methods, so as to obtain exact results.
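As a small illustration of that last point, suppose we know in advance that every quantity in a problem is a multiple of 0.1. The sketch below, with hypothetical names and an example of my own choosing, compares three representations of the sum 0.1 + 0.2: binary doubles, integers counting tenths, and `BigDecimal` built from decimal strings.

```java
import java.math.BigDecimal;

// Sketch: the same sum in three representations.
final class ExactSum {
    /** Binary floating point: the operands are already quantized before the addition. */
    static boolean doubleSumMatches() {
        return 0.1 + 0.2 == 0.3;
    }

    /** Integers counting tenths: every value in the problem is representable exactly. */
    static boolean tenthsSumMatches() {
        long a = 1; // 0.1 expressed in tenths
        long b = 2; // 0.2 expressed in tenths
        return a + b == 3;
    }

    /** Decimal arithmetic on the exact decimal strings. */
    static boolean decimalSumMatches() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println("double     : " + doubleSumMatches());
        System.out.println("tenths     : " + tenthsSumMatches());
        System.out.println("BigDecimal : " + decimalSumMatches());
    }
}
```

The two representations chosen with the prior knowledge in mind give an exact result; the general-purpose double does not.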
