Recently, Ian Goodfellow and his collaborators proposed a synthetic "adversarial spheres" task built from two concentric high-dimensional spheres. They use the dimension of the data manifold to study how the input dimension affects the generalization error of neural networks, and they show that the vulnerability of neural networks to small adversarial perturbations is a natural consequence of non-zero test error.
A large body of work has shown that standard image models exhibit the following phenomenon: most images randomly drawn from the data distribution are classified correctly, yet they are visually similar to nearby inputs that are classified incorrectly (Goodfellow et al., 2014; Szegedy et al., 2014). These misclassified inputs are commonly called adversarial examples. Such adversarial errors are highly robust to changes in angle, viewpoint, and scale (Athalye & Sutskever, 2017). Although there has been some theoretical work and a number of proposed defenses (Cisse et al., 2017; Madry et al., 2017; Papernot et al., 2016), the cause of the phenomenon remains poorly understood.
There are several hypotheses about adversarial examples. The most common one is that the neural network classifier is too linear in its input space (Goodfellow et al., 2014; Luo et al., 2015). Another hypothesis is that adversarial examples lie off the data manifold (Goodfellow et al., 2016; Anonymous, 2018b,a; Lee et al., 2017). Cisse et al. (2017) argue that large singular values of the internal weight matrices make the classifier more vulnerable to small fluctuations of the input.
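For context, the linearity hypothesis mentioned above is what motivates the fast gradient sign method of Goodfellow et al. (2014), which perturbs an input with a single linear step. A minimal sketch, where `loss_grad(x, y)` is a hypothetical function returning the gradient of the model's loss with respect to the input:

```python
import numpy as np

def fgsm(x, y, loss_grad, eps=0.01):
    """Fast gradient sign method: take one linear step of size eps in the
    direction that increases the loss. `loss_grad(x, y)` is assumed to
    return the gradient of the loss with respect to the input x."""
    return x + eps * np.sign(loss_grad(x, y))
```

If the classifier really does behave almost linearly around the data, this one small step is often enough to flip its prediction, which is the core of the linearity argument.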
Alongside attempts to explain adversarial examples, a number of works propose defenses to increase model robustness. Some enhance robustness by changing the nonlinearities used in the model (Krotov & Hopfield, 2017), by distilling a large network into a smaller one (Papernot et al., 2016), or through regularization (Cisse et al., 2017). Other work explores using a separate statistical model to detect adversarial examples (Feinman et al., 2017; Abbasi & Gagné, 2017; Grosse et al., 2017; Metzen et al., 2017). However, many of these methods have been shown to fail (Carlini & Wagner, 2017a,b). Finally, many works use adversarial training to improve robustness (Madry et al., 2017; Kurakin et al., 2016; Szegedy et al., 2014; Goodfellow et al., 2014). Although adversarial training does make models more robust, local errors remain (Sharma & Chen, 2017) once the perturbations go beyond the range the adversarial training was designed for.
This is particularly interesting because these models achieve high accuracy on the test set. We hypothesize that the phenomenon is essentially caused by the high dimension of the data manifold. As a first step toward studying this hypothesis, we define a simple synthetic task: distinguishing two concentric high-dimensional spheres (a data-generation sketch follows the list of findings below). This lets us study adversarial examples on a data manifold with a clean mathematical definition, and it also lets us characterize the decision boundary the model learns. More importantly, we can naturally vary the dimension of the data manifold to study how the input dimension affects the generalization error of neural networks. Our experimental and theoretical analysis shows the following:
The same phenomenon seen in image models appears here: most points randomly drawn from the data distribution are classified correctly, yet they lie "close" to misclassified inputs. This behavior persists even when the test error is below one in 10,000.
For this dataset, there is a trade-off between the generalization error and the average distance to the nearest misclassified point. In particular, we show that any model that misclassifies even a small fraction of points on the sphere is vulnerable to adversarial perturbations of size O(1/√d).
Neural networks trained on this dataset come close to this theoretically optimal trade-off between error rate and average distance to the nearest error. This means the model's error rate must decrease exponentially in order for the average distance to the nearest error to grow linearly.
We also show that models trained on this dataset can achieve extremely high accuracy even when most of the input dimensions are ignored.
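As a concrete picture of the synthetic task referenced above, the following numpy sketch samples points uniformly from two concentric spheres with radii 1.0 and 1.3 (the radii quoted in the figure captions below). The dimension d and the sample counts here are illustrative choices, not values taken from the paper.

```python
import numpy as np

def sample_sphere(n, d, radius, rng):
    """Draw n points uniformly from the surface of a sphere of the given
    radius in R^d, by normalizing standard Gaussian vectors."""
    x = rng.standard_normal((n, d))
    return x * (radius / np.linalg.norm(x, axis=1, keepdims=True))

rng = np.random.default_rng(0)
d = 500                                        # illustrative input dimension
inner = sample_sphere(10_000, d, 1.0, rng)     # class 0: inner sphere
outer = sample_sphere(10_000, d, 1.3, rng)     # class 1: outer sphere
X = np.concatenate([inner, outer])
y = np.concatenate([np.zeros(10_000), np.ones(10_000)])
```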
Next, we explore the connection between adversarial examples on the high-dimensional spheres and adversarial examples in image models:
Figure 1: Two-dimensional slices of the input space. Left: two random directions; middle: one random direction and one "adversarial" direction; right: two orthogonal "adversarial" directions. The data manifold is drawn in black and the max-margin boundary in red. The green region marks the points the ReLU network classifies as belonging to the inner sphere. In the rightmost panel, even though the model's error rate is below one in 10,000, the data points mapped off the sphere in this slice are classified incorrectly.
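Slices like those in Figure 1 can be built with a simple grid construction. A minimal sketch, assuming a trained classifier is available as a hypothetical function `predict(points)` that returns one class label per row:

```python
import numpy as np

def slice_grid(center, u, v, extent=1.5, steps=200):
    """Grid of points in the 2-D plane through `center` spanned by the
    orthonormal directions u and v (all arrays of shape (d,))."""
    ts = np.linspace(-extent, extent, steps)
    a, b = np.meshgrid(ts, ts)
    return center + a[..., None] * u + b[..., None] * v   # (steps, steps, d)

# Hypothetical usage with a trained model's predict():
#   grid = slice_grid(x0, u, v)        # x0: a data point; u, v: random or
#                                      # adversarial directions
#   labels = predict(grid.reshape(-1, grid.shape[-1]))
#   labels = labels.reshape(grid.shape[:2])   # colour this array to get Fig. 1
```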
Figure 2: Left: results of a ReLU network trained on 50 million samples drawn from the two high-dimensional spheres of radius 1.0 and 1.3. We evaluate accuracy over the whole input space relative to the theoretical decision boundary at radius 1.15, plotting the accuracy at each norm estimated from 10,000 random samples. Accuracy rises sharply with distance from the theoretical boundary; once far enough away, we never observe an error on random samples. However, even at distances of 0.6 or 2.4 from the theoretical boundary, "errors" can still be found. Right: the same ReLU network trained on 100 samples with d = 2. Visualizing a dense subset of the entire space shows that the model makes no errors on either circle.
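The left panel's measurement can be sketched as follows, again assuming a hypothetical `predict(points)` for the trained network; points with norm above 1.15 are treated as belonging to the outer sphere, matching the analytic boundary mentioned in the caption:

```python
import numpy as np

def accuracy_by_norm(predict, d, radii, n=10_000, seed=0):
    """Estimate the model's accuracy at each norm in `radii` by labelling
    n uniform points on the sphere of that radius and comparing the
    prediction with the analytic boundary at norm 1.15."""
    rng = np.random.default_rng(seed)
    rates = []
    for r in radii:
        x = rng.standard_normal((n, d))
        x *= r / np.linalg.norm(x, axis=1, keepdims=True)
        rates.append(np.mean(predict(x) == int(r > 1.15)))
    return np.array(rates)
```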
Figure 3: Left: the distribution of α for the quadratic network after training on 100,000 samples. The red lines mark the interval the α must lie in for perfect classification. Although only about 1e-11 of samples are misclassified, the vast majority of the α fall outside this interval. Right: training curves of the quadratic network starting from a perfect initialization with no errors. As training progresses, the average loss is driven down at the expense of a small number of samples whose loss becomes very large, and the number of out-of-range α grows at a comparable rate.
Figure 4: We project inputs of dimension d onto a k-dimensional subspace before feeding them to the classification model. We then plot the ratio k/d needed to reach a fixed error rate, and find that this ratio decreases rapidly as the input dimension grows.
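One simple way to realize the projection described in Figure 4 is to project onto a random k-dimensional subspace; a sketch of that idea is below (the paper's exact projection scheme may differ):

```python
import numpy as np

def project_to_subspace(X, k, seed=0):
    """Project rows of X (shape (n, d)) onto a random k-dimensional
    subspace, using an orthonormal basis from a QR decomposition."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))   # q: (d, k), orthonormal columns
    return X @ q                                       # projected data, shape (n, k)
```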
Figure 5: We compare the average distance to the nearest error and the error rate for three networks trained on the sphere dataset; all of the errors found lie within the ball. The three networks were trained with five training sets of different sizes, and their performance was measured at different points during training (networks that eventually become so accurate that their error rate is too small to estimate statistically are not shown). Surprisingly, the measured error rate and average distance to the nearest error track the theoretically optimal trade-off throughout optimization. Note that, because of noise in estimating error rates and average distances, some networks can appear to perform slightly better than the optimal bound.
Paper: Adversarial Spheres
Paper link: https://arxiv.org/abs/1801.02774
Abstract: State-of-the-art computer vision models are vulnerable to small adversarial perturbations. That is, most images from the data distribution are classified correctly by the model, yet they are visually similar (imperceptibly so to the human eye) to images that are misclassified. Despite substantial research into this phenomenon, its cause remains poorly understood. We hypothesize that this counter-intuitive behavior is a consequence of the high-dimensional geometry of the input data manifold. As a first step toward exploring this hypothesis, we study classification on a simple synthetic dataset of two concentric high-dimensional spheres. For this dataset we show a trade-off between test error and the average distance to the nearest error. In particular, any model that misclassifies a small fraction of points on the sphere is vulnerable to adversarial perturbations of size O(1/√d). Moreover, when we train networks of several different architectures on this dataset, all of their errors approach this theoretical bound. The theoretical conclusion is that the vulnerability of neural networks to small adversarial perturbations is an inevitable consequence of a non-zero rate of observed test errors. We hope our theoretical analysis of this simple case will drive investigation into how the complex geometry of real-world datasets gives rise to adversarial examples.
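As a rough numerical illustration of the O(1/√d) statement (a hedged back-of-the-envelope calculation based on concentration of measure on the sphere, not the paper's exact theorem): if a model misclassifies a fraction mu of the sphere, the mean distance from a random point to the nearest error is at most on the order of Φ⁻¹(1 − mu)/√d, where Φ is the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist

def rough_distance_bound(mu, d):
    """Back-of-the-envelope scale of the distance to the nearest error for
    an error set covering a fraction mu of a d-dimensional unit sphere:
    roughly inverse-normal-CDF(1 - mu) / sqrt(d)."""
    return NormalDist().inv_cdf(1.0 - mu) / sqrt(d)

# Even with a one-in-a-million error rate in d = 500 dimensions, the typical
# distance to the nearest error is only about 0.21 -- smaller than the 0.3
# gap between the two spheres.
print(rough_distance_bound(1e-6, 500))
```

The point of the calculation is the scaling: because Φ⁻¹ grows only very slowly as mu shrinks, the error rate must fall exponentially for the distance to the nearest error to grow even linearly, which is exactly the trade-off described above.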