A Neural Network Prime Number Recognizer

After a week of tagging along at the Knowledge Engineering Center, I spent a week learning about neural networks. The teacher assigned me a problem and asked me to try it. I did something simple, ran several groups of tests, and wrote up a summary report, which I am posting here.

After more than a week of experimentation, I now have a basic understanding of this problem. Below are my thoughts on it; over the last two days the problem suddenly became much clearer to me.

I believe the primary problem to be solved is not the network's prediction ability, but how to train the network on a closed interval with appropriate methods, so that the network acquires good prime-recognition capability within that closed interval.

Although I do not know the dimensionality of the space the primes live in, I am sure that for primes over a larger range it must be very high. I have consulted the relevant literature: learning well in a high-dimensional space requires dense sample data, but such dense samples become impossible to obtain as the dimension grows, which is the curse of dimensionality. The growth in dimension also causes an exponential increase in complexity, so that uniformly distributed random points fill a high-dimensional space ever more sparsely. In short, functions defined on a high-dimensional space are far more complex than those defined on a low-dimensional one, and such complex functions are hard to learn to distinguish.
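To make "dense samples become impossible" concrete, here is a back-of-the-envelope calculation (my own illustration, not from the original experiments): covering the unit cube at a fixed spacing needs a number of samples that grows exponentially with the dimension.

    # Curse of dimensionality in one loop: covering [0, 1]^d with a grid
    # of spacing eps takes (1/eps)^d points.
    eps = 0.1
    for d in (1, 2, 5, 10, 20):
        needed = (1 / eps) ** d
        print(f"d = {d:2d}: about {needed:.0e} samples at spacing {eps}")

Already at d = 20 the count reaches 10^20, which no training set can supply.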

The conclusion above is a conjecture I formed in combination with the experiments. I feel that the key to this problem will be breaking through this "dimension explosion". The following is my analysis.

We do not know the dimensionality of the primes, but everything in my experience and in the material I have seen suggests it is very complicated. The problem is a bit like asking how big the universe is: no experiment or theory proves that the universe is infinite, yet all of our current experience and information points that way. I tried to imagine that, within a certain range, the growth of the prime dimension is smooth. If this conjecture holds, it should be possible to predict primes within a certain range.

So I took all numbers below 2^10 ending in 1, 3, 7 or 9 as training samples (even numbers and multiples of 5 are trivially excluded by divisibility by 2 or 5), trained, and ran the closed test: the recognition rate was above 99%. I then predicted on the following intervals, 2^10 to 2^11, 2^11 to 2^12, and so on, and found that the prime recognition rate dropped from 30% at best down to 10% at the end. In similar tests where I varied the network structure and the training sample space, every result showed the same thing: as the prediction range grows, the prime recognition rate falls steadily. The reason may be that the dimensionality of the primes increases as the range expands, though I cannot yet exclude other factors.
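For anyone who wants to reproduce something like this, here is a minimal sketch in Python. The post does not say how the numbers were encoded or which network was used, so the bit-vector encoding and scikit-learn's MLPClassifier below are my assumptions, not the original setup.

    # A minimal sketch of the experiment above. Assumptions (mine, not the
    # post's): numbers are encoded as bit vectors, and scikit-learn's
    # MLPClassifier stands in for the "BP network".
    import numpy as np
    from sympy import isprime                      # ground-truth labels
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import recall_score

    def candidates(lo, hi):
        # Numbers in [lo, hi) ending in 1, 3, 7, 9; the rest are divisible by 2 or 5.
        return [n for n in range(lo, hi) if n % 10 in (1, 3, 7, 9)]

    def encode(ns, bits=14):
        # Bit-vector encoding: each number becomes the list of its binary digits.
        return np.array([[(n >> i) & 1 for i in range(bits)] for n in ns])

    train_ns = candidates(11, 2**10)
    X_train = encode(train_ns)
    y_train = np.array([isprime(n) for n in train_ns])
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    net.fit(X_train, y_train)

    # Closed test on the training interval, then open tests on the next intervals.
    print("closed-test accuracy:", net.score(X_train, y_train))
    for k in (10, 11, 12):
        test_ns = candidates(2**k, 2**(k + 1))
        X, y = encode(test_ns), np.array([isprime(n) for n in test_ns])
        print(f"2^{k}..2^{k+1} prime recall:", recall_score(y, net.predict(X)))

Measuring recall on the prime class, rather than overall accuracy, matches the "prime recognition rate" wording above, though that reading is also an assumption.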

One such factor is "over-fitting" caused by excessive training on a small range. When the network learns the samples too well it is almost "memorizing"; in the extreme case, judging an input degenerates into a "table lookup" against the training samples. Rules that happen to hold in a local range (but do not apply to primes elsewhere) then become the basis for judgment, and the prediction results suffer. However, this usually requires over-training a network far more complex than the problem needs. Since I selected my network structure by experimenting from simple to complex and picking the best of many structures, this situation is unlikely, though I cannot provide exact evidence to rule it out completely. One thing is certain, however: the network with the best prediction capability is not the network with the best closed-test performance! Of course, the closed-test results cannot be too poor either, or the network has not learned enough from the samples and no good prediction can follow.
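The remark that the best-predicting network is not the best closed-test network suggests selecting the structure on a held-out interval instead of on the training interval. Continuing the sketch above (the candidate structures here are arbitrary, not the ones the post tried):

    # Select the structure by open-interval performance, not closed-test
    # performance (continues the previous sketch: candidates(), encode(),
    # X_train, y_train are defined there).
    val_ns = candidates(2**10, 2**11)
    X_val, y_val = encode(val_ns), np.array([isprime(n) for n in val_ns])

    best = None
    for hidden in [(8,), (32,), (64, 32)]:       # arbitrary candidate structures
        net = MLPClassifier(hidden_layer_sizes=hidden, max_iter=2000, random_state=0)
        net.fit(X_train, y_train)
        closed, open_ = net.score(X_train, y_train), net.score(X_val, y_val)
        print(hidden, f"closed={closed:.3f}", f"open={open_:.3f}")
        if best is None or open_ > best[0]:
            best = (open_, hidden)
    print("selected structure:", best[1])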

The other possibility is that the dimension really does increase, so that samples from the earlier range simply cannot yield a good recognition network; as you said last time, the information the samples provide is insufficient. I have not thought of a good solution to this, because it comes from the intrinsic complexity of the primes themselves and is genuinely hard to overcome.

Since primes are so difficult, I wondered whether one could instead multiply together primes from the known range to generate composite numbers inside the space to be predicted, and then "sieve out" the primes by recognizing those composites. To test this idea more broadly, I trained the network on a group of samples (including primes) taken from the prediction space, and compared the result with the network's predictions before that extra training. What depressed me is that this produced no better predictions; on the contrary, the results were worse than the network had managed before the extra training! Of course I ran only a few groups of experiments, so this cannot count as a conclusion. My cause analysis: there may be an over-approximation problem in one region. Because BP training is a global approximation, the network's prediction ability is best overall; but if you overemphasize the approximation of one point or interval, the global fit regresses. The outcome is that the network performs better on that range (accuracy reached 70% on the closed test of the retraining samples) while its fitting and prediction on other intervals get worse. (Can it be explained this way: the network learned the new rules and "forgot" the old ones? Changing the network structure might address this, but it makes the problem more complicated and I did not pursue it.)

Even if we could "learn the new rules without forgetting the old", we could only feed in information about composite numbers; we cannot add information about the primes themselves. I have not thought carefully about how hard it is to identify primes by this "anti-screening" through composites, but I predict it will be no less difficult than recognizing primes directly (to a human this learning route may look simpler, but for machine learning it should be the same). Alternatively, one could take the network's predictions for the next interval, exclude the non-primes through testing, and then continue training on some known composite samples. If the problems above can be solved well, this should be feasible in principle.
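For what it is worth, the "sieve out through composites" part of the idea needs no network at all: every composite in the next interval is a multiple of some prime below the square root of its upper end, so marking those multiples leaves exactly the primes. A small sketch of that segmented sieve, as I understand the paragraph:

    # "Anti-screening": mark composites in [lo, hi) as multiples of known
    # small primes; the unmarked numbers are exactly the primes. This is
    # a classic segmented sieve, written out to make the idea concrete.
    def primes_up_to(n):
        sieve = [True] * (n + 1)
        sieve[0:2] = [False, False]
        for p in range(2, int(n**0.5) + 1):
            if sieve[p]:
                sieve[p*p::p] = [False] * len(sieve[p*p::p])
        return [p for p, is_p in enumerate(sieve) if is_p]

    def primes_in_interval(lo, hi):       # assumes lo > 1
        is_prime = [True] * (hi - lo)
        for p in primes_up_to(int((hi - 1) ** 0.5)):
            start = max(p * p, ((lo + p - 1) // p) * p)   # first multiple of p in [lo, hi)
            for m in range(start, hi, p):
                is_prime[m - lo] = False
        return [lo + i for i, f in enumerate(is_prime) if f]

    print(primes_in_interval(2**10, 2**10 + 50))

The interesting (and hard) part of the proposal is getting a learned model to do this marking; the sketch deliberately sidesteps that.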

From the description above you can see that I do not have many exact results to support my conclusions. It is not entirely laziness (though that reason cannot be ruled out) that keeps me from running more tests; I simply do not yet have good control over neural networks. To pin down one question, I often need to run a large number of exhaustive tests, varying every training parameter that can affect the result, to obtain a satisfactory network (and even then I do not fully trust my results). Training a neural network is time-consuming: even a seemingly small network can take a long time to converge when the training parameters are chosen poorly. In the course of the experiments I did accumulate a good deal of so-called training experience. For relatively simple polynomials I can now produce a good fit (I ran a few experiments; it really works well). I also feel that neural networks are more like an art: experience and intuition often save me a great deal of time.

Another reason is the inherent randomness of neural networks themselves: different networks (differing in initial weights or training samples), even trained with the same parameters, will not give the same results. Because the network's behavior is so unpredictable, my sensitivity to the data is greatly reduced; I cannot identify the cause of small changes in the numbers and can only compensate with more experiments.
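A cheap way to cope with this randomness, not mentioned in the post, is to repeat every run over several seeds and report the spread, so that run-to-run noise is not mistaken for a real effect. Continuing the earlier sketch:

    # Average over several random initializations so that initialization
    # noise is not mistaken for a real effect (reuses X_train, y_train,
    # X_val, y_val from the snippets above).
    import statistics

    scores = []
    for seed in range(5):
        net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed)
        net.fit(X_train, y_train)
        scores.append(net.score(X_val, y_val))
    print(f"open-test accuracy: {statistics.mean(scores):.3f} "
          f"+/- {statistics.stdev(scores):.3f} over {len(scores)} seeds")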

However, if my conjecture above can be verified, then the first problem to solve is the one stated in the opening: if we can train on samples from a larger range and obtain a good network, then, combined with the right training method, a network with good prediction performance within a certain range can be obtained. But is it feasible to keep extending it like this? There is a premise: since primes over a large range live in a high-dimensional space, training will struggle to converge, and the dimension explosion cannot be avoided in theory; we can only try to postpone it as long as possible. Therefore, however the spatial features of the primes behave, training an excellent network in a high-dimensional space will be the first point of attack. Otherwise, the BP network is a dead end!

I have read some of the literature in search of a solution. The classic methods amount to no more than the following two.

The first is to add more existing human knowledge. I spent a whole day studying the feasibility of this and found that prime samples built from human knowledge are even sparser than the prime distribution itself (in fact they are super sparse); how to train a network with so few samples is beyond my current understanding of machine learning.
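One reading of "adding human knowledge" that avoids the sparse-sample problem is to put the knowledge into the input encoding instead of into extra samples, for example feeding the network the residues of n modulo small primes (a zero residue is a human-supplied proof of compositeness). This particular feature choice is my illustration, not something the post reports trying:

    # Encode domain knowledge as features: residues of n modulo the first
    # few odd primes. Any n divisible by one of them has a zero residue,
    # a strong, human-supplied hint of compositeness. (Reuses np, train_ns,
    # y_train, MLPClassifier from the earlier sketch.)
    SMALL_PRIMES = [3, 7, 11, 13, 17, 19, 23, 29, 31]

    def residue_features(ns):
        return np.array([[n % p for p in SMALL_PRIMES] for n in ns])

    X_train_k = residue_features(train_ns)
    net_k = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    net_k.fit(X_train_k, y_train)
    print("closed test with residue features:", net_k.score(X_train_k, y_train))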

The second is to increase the smoothness of the fitting function, which can indeed save a lot of computing time. Because this involves more advanced mathematics and algebra, I did not pursue the theory in depth (in fact my current mathematical knowledge is very limited).
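If one takes scikit-learn's MLP as the stand-in network again, the closest knob to "smoothness of the fitting function" is the L2 penalty alpha; this mapping is my reading of the paragraph, not the author's. Continuing the earlier sketch:

    # Stronger L2 regularization (alpha) yields a smoother fitted function;
    # sweep it and compare closed-test vs open-test accuracy.
    for alpha in (1e-4, 1e-2, 1.0):
        net = MLPClassifier(hidden_layer_sizes=(32,), alpha=alpha,
                            max_iter=2000, random_state=0)
        net.fit(X_train, y_train)
        print(f"alpha={alpha:g}: closed={net.score(X_train, y_train):.3f}, "
              f"open={net.score(X_val, y_val):.3f}")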

At this stage, I think building a prime recognizer that works over a large range matters far more than analyzing the spatial dimension of the prime distribution (in fact the latter analysis requires the former as its basis), so the main task is deeper research and analysis of neural networks (a headache of a process).

The above is my simple view of this test question. There are surely many problems with it, and my conjectures may differ greatly from reality; but it does offer an approach and a line of thought toward solving the problem, which may be the main purpose of the experiment.

Over the past few days I have calmed my restless mind. I can study, write programs, observe data, and really do something meaningful instead of fretting about time. Sometimes I daydream about being a scientist, although I have not developed any fanatical views on this problem......

The experiment is like searching for the node I need in an unknown tree, with only depth-first and breadth-first search to choose from. If I insist on going deep, I may get stuck in a dead end, gain nothing after a long search, and finally have to backtrack and start over. If I pursue only breadth, I may fall into endless daydreaming and the experiment makes no progress. The ideal state is a combination of moderate depth and breadth, which requires a thorough understanding of the data and deeper domain knowledge; that knowledge can in turn prune and truncate large parts of the search. Of course, knowledge and search speed can conflict: demanding too much domain knowledge for a small problem may be slower than simply asking the question directly. In short, keen insight is the basis of a successful experiment.

The above is my research experience from these past few days.

 
