Project planning and Practice
The analysis and answers are based on the following individual project requirements:
Individual project link
My PSP2.1 table is as follows. I do not understand the step of calculating the workload; I think it is a step for team work, so I did not fill it in.
PSP2.1                                    Personal Software Process Stages                              Time
Planning                                  Plan
  Estimate                                Estimate how long this task will take                         30
Development                               Development
  Analysis                                Requirements analysis (including learning new technologies)   10
  Design Spec                             Creating a design document                                    10
  Design Review                           Design review                                                 4
  Coding Standard                         Coding standard (suitable for the current development)        1
  Design                                  Detailed design                                               20
  Coding                                  Detailed coding                                               8
  Code Review                             Code review                                                   5
  Test                                    Testing (self-test, modify code, commit changes)              5
Reporting                                 Report
  Test Report                             Test report                                                   0.5
  Size Measurement                        Workload calculation (what is this?)                          ?
  Postmortem & Process Improvement Plan   Postmortem and process improvement plan                       5
Total                                                                                                   68.5
As the planned and actual times in the table above show, my development time was about twice the estimate. The design phase in particular took a long time (requirements analysis + design document + design review + detailed design = 44), mainly because of two big puzzles I ran into while designing:
1. Efficiently constructing expressions that do not repeat under the commutative law, and 2. Withstanding tests with large enough amounts of data.
These two questions puzzled me for a long time, and most of my days went into these two problems, which look simple but are actually quite hard, because they constrain each other.
If we want to withstand testing over a wide range of data, we have to accept that the number of generated expressions becomes large, and that makes the repetition rate rise quickly.
Suppose we generate operands and operators at random. At the outset, the probability of a repeat is very small. For example, if there are about 100,000 possible expressions within a range and only 1,000 are required, the probability that any single draw is a repeat stays below 1%. With random generation the repetition rate is then very low, and in the end not much time is wasted on regenerating.
But what if we are asked to generate 50,000 expressions? Now the probability of collisions between randomly generated expressions becomes very high. We may then fall into a situation where generating the first 80% of the expressions takes no more time than generating the last 20%.
However, there is no way around it: efficiently constructing expressions that do not repeat under the commutative law itself depends on the probability of repetition after randomization being low. Every new expression has to be checked against the ones already generated, so the bottleneck is the time spent on random generation and on how quickly the repetition query can run.
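To make the trade-off above concrete, here is a minimal sketch that estimates how many random draws are needed to collect a given number of distinct expressions. It assumes every possible expression is drawn uniformly and independently, which real generation only approximates; the names `CollisionEstimate` and `ExpectedDraws` are made up for this illustration.

```csharp
using System;

public static class CollisionEstimate
{
    // Expected number of uniform random draws from a pool of `poolSize`
    // distinct expressions needed to collect `wanted` distinct ones.
    // When i distinct expressions are already held, a draw is new with
    // probability (poolSize - i) / poolSize, so finding the next new one
    // takes poolSize / (poolSize - i) draws on average.
    public static double ExpectedDraws(long poolSize, long wanted)
    {
        if (poolSize <= 0 || wanted < 0 || wanted > poolSize)
            throw new ArgumentException("Need 0 <= wanted <= poolSize and poolSize > 0.");

        double draws = 0.0;
        for (long i = 0; i < wanted; i++)
            draws += (double)poolSize / (poolSize - i);
        return draws;
    }
}
```

Under these assumptions, collecting 1,000 expressions out of 100,000 costs only about 1,005 draws, while collecting 50,000 costs roughly 69,000 draws, so the wasted work grows much faster than the target count.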
Here are some of the problems I encountered during the project, along with some of my own tricks. I did not know beforehand about the minimal representation of a tree, that magical method for deciding tree isomorphism, nor do I have as solid an algorithmic foundation as others do, so please excuse my clumsy attempts.
Project Difficulty Analysis
From my point of view, this project has several problems that are hard to solve:
 How can I construct expressions that do not repeat?
 How can I support expression construction over as large a range as possible?
 How do I mix fractions with integers?
 How can I guarantee that subtraction and division results are valid?
 How can I calculate the value of an infix expression?
In my opinion, the first and second questions are the hardest when taken together. Earlier, in the discussion area of Mr. Luo's blog, I confirmed with him that expressions that are the same only under the associative law do not count as repetitions, which was a small relief. But the problem is still difficult to solve. Let me first describe the solutions to the other problems.
Blending of fractions and integers
My first thought was to define extra operator overloads for mixed operands, like
static Fraction operator +(int a, Fraction b)
Or should I add a ParseFrac function to explicitly turn integers into fractions? Such a function would convert a positive integer into a fraction whose denominator is 1 and whose numerator is that integer.
In the end I found a better way to solve this problem: C#'s user-defined implicit conversion! The keyword implicit defines a custom implicit type conversion. Here is a piece of code as an example.
public static implicit operator Fraction(int input)
{
    // implicit means the conversion is applied without an explicit cast
    // code to convert from int to Fraction
    Fraction output = new Fraction(input, 1);
    return output;
}
This custom type conversion is also very convenient to use, as in the following code:
int a = 3;
Fraction b = new Fraction(3, 4); // b = 3/4
Fraction c = a + b; // here 'a' is converted to the Fraction class automatically
Of course, to make the operation above work, we need to overload the operator, which is the second thing I learned.
public static Fraction operator +(Fraction lhs, Fraction rhs) ...
Above is the '+' operator overload for the Fraction class. After overloading, we can put + between two Fraction objects and calculate the formula directly.
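Putting the two pieces together, a minimal sketch of such a type might look like the following. This is not the project's actual class, which is only shown in fragments; the `long` fields, the zero-denominator check and the unreduced sum are assumptions made for this illustration.

```csharp
using System;

// Hypothetical minimal Fraction type for illustration only.
public readonly struct Fraction
{
    public long Numerator { get; }
    public long Denominator { get; }

    public Fraction(long numerator, long denominator)
    {
        if (denominator == 0)
            throw new DivideByZeroException("Denominator must not be zero.");
        Numerator = numerator;
        Denominator = denominator;
    }

    // Implicit conversion: an int n becomes the fraction n/1 without a cast.
    public static implicit operator Fraction(int input) => new Fraction(input, 1);

    // Addition over a common denominator; the result is not reduced here.
    public static Fraction operator +(Fraction lhs, Fraction rhs) =>
        new Fraction(
            lhs.Numerator * rhs.Denominator + rhs.Numerator * lhs.Denominator,
            lhs.Denominator * rhs.Denominator);

    public override string ToString() => $"{Numerator}/{Denominator}";
}
```

With this sketch, `Fraction c = 3 + new Fraction(3, 4);` compiles because the int is converted first, and `c` holds 15/4.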
Of course, to keep fraction arithmetic simple, the calculations in my program follow these points:
1. Mixed fractions are not used as the unit of calculation. They appear only when printing; the rest of the time improper fractions are used to make calculation easier.
2. Proper fractions, which have no integer part, are printed as they are.
3. Integers are uniformly represented as improper fractions with a denominator of 1.
Of course, we also need to consider that the numerator and denominator may share a common divisor, which matters in printing, in calculation, and most of all in equality comparison.
So fractions need to be reduced. My original way was to start from the smaller of the numerator and denominator and count downward until reaching the greatest common divisor of the two numbers, then divide both by it to get the simplest form. But there is a big problem with this: reduction is needed after every operation, because every operation can leave the numerator and denominator with a common divisor, as in the following examples:
 1/2 - 1/6 = 2/6
 1/9 + 2/9 = 3/9
 1/9 * 3/4 = 3/36
 1/9 ÷ 1/3 = 3/9
A further problem arises as the range (r) expands: since a numerator lies in (0, r^2), the largest number actually produced in a fractional formula may be as high as r^4, or even r^8! When the range is 100, our numerators and denominators can reach the hundreds of millions or more.
In that case reduction becomes extremely time-consuming, especially when the numerator and denominator are coprime, and doing it after every operation slows the program down a great deal.
So I later replaced the greatest-common-divisor search with Euclid's algorithm, and sure enough, the optimized code performed much better as the range grew.
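For reference, Euclid's algorithm and the reduction step can be sketched as follows. The helper names `Gcd` and `Reduce` and the tuple return type are assumptions for this illustration, not the project's actual code.

```csharp
using System;

public static class FractionMath
{
    // Euclid's algorithm: gcd(a, b) = gcd(b, a mod b), ending when b reaches 0.
    public static long Gcd(long a, long b)
    {
        a = Math.Abs(a);
        b = Math.Abs(b);
        while (b != 0)
        {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Reduces numerator/denominator to lowest terms.
    public static (long Numerator, long Denominator) Reduce(long numerator, long denominator)
    {
        if (denominator == 0)
            throw new DivideByZeroException("Denominator must not be zero.");
        long g = Gcd(numerator, denominator); // never 0 because denominator != 0
        return (numerator / g, denominator / g);
    }
}
```

The number of loop iterations grows with the number of digits of the inputs rather than with their size, which is why it copes with coprime values near r^4 far better than counting down from the smaller number.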
But there is another problem, and it is just as frightening.
As just mentioned, expanding the range has a very large effect on the maximum value in the final result, which can be as high as r^8. For example, if our range is 20:
+ 'one/4'/6 ' 1/2' 1/
There is also a pitfall here, namely the upper limit of the range. The claim that a numerator can reach r^8 is based on a calculation. Imagine a number whose numerator is close to r^2 and whose denominator is close to r, with the numerator and denominator coprime. Call this number x; then x * x * x * x can reach the order of magnitude of r^8. By this calculation, if the numerator and denominator are defined as type int, then once r reaches a certain size, values can exceed the range of int and produce a negative result.
To solve this problem, we first need to define the numerator and denominator as type long. A rough calculation then shows that the range can go up to the 8th root of 2^63 - 1, that is, a maximum of about 200, which is sufficient.
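The bound above can be checked with a small sketch that computes r^8 in a `checked` block, so that overflow raises an exception instead of silently wrapping to a negative number. The name `TryEighthPower` is made up for this illustration.

```csharp
using System;

public static class RangeCheck
{
    // Computes r^8 by repeated squaring. Returns false when the result
    // does not fit in a signed 64-bit integer.
    public static bool TryEighthPower(long r, out long result)
    {
        try
        {
            checked
            {
                long r2 = r * r;
                long r4 = r2 * r2;
                result = r4 * r4;
                return true;
            }
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }
}
```

For r = 200 the call succeeds, while for r = 300 it reports an overflow, which agrees with the rough estimate of about 200 as a safe range for `long`.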
But here comes another pitfall: the Random class in C# does not support generating values of type long. Fortunately, this question has already been answered on StackOverflow.
So I took that answer's advice and ended up with a Random function for long:
static long LongRandom(long min, long max, Random rand)
{
    // fill 8 random bytes and read them as one 64-bit integer
    byte[] buf = new byte[8];
    rand.NextBytes(buf);
    long longRand = BitConverter.ToInt64(buf, 0);
    // map the value into [min, max)
    return (Math.Abs(longRand % (max - min)) + min);
}
Finally, the problem was solved successfully.
Ensuring valid subtraction and division results
According to the assignment requirements, the result of a subtraction must not be negative. I thought of three algorithms:
1. Randomly generate two numbers, and if their difference is less than 0, discard them. (The probability that the second number is larger than the first is (1 - 1/n)/2.)
2. Randomly generate a number as the minuend, then use it as the upper bound of a new range from which the subtrahend is randomly generated. This wastes some time on producing the extra random number.
3. If a - b < 0, swap a and b. This algorithm only exchanges the positions of a and b, so it is very simple and convenient!
In the end, I used the third algorithm to ensure valid subtraction results.
For division, if the random divisor is 0, 1 is added to it so that a valid divisor results.
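The swap for subtraction and the divisor adjustment are small enough to sketch directly. Plain `long` operands are used here as a simplifying assumption (the project works with fractions), and the helper names are hypothetical.

```csharp
using System;

public static class OperandFixes
{
    // Third option from the list: if a - b would be negative, swap the operands.
    public static (long A, long B) OrderForSubtraction(long a, long b) =>
        a - b < 0 ? (b, a) : (a, b);

    // A zero divisor has 1 added to it so the division is always defined.
    public static long FixDivisor(long divisor) =>
        divisor == 0 ? divisor + 1 : divisor;
}
```

Neither fix needs a second random draw, which is the main advantage over the first two options.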
Computing the value of an infix expression
In my sophomore data structures class, I learned and practiced the algorithm for converting an infix expression to a suffix (postfix) expression, so computing the value of an infix expression posed little difficulty.
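As a reminder of how that classroom algorithm works, here is a minimal sketch. Instead of building the suffix string first, it evaluates directly with an operand stack and an operator stack, using the same precedence rules as the infix-to-suffix conversion. It assumes well-formed input made of non-negative integers, + - * / and parentheses, and uses double in place of the project's Fraction type to stay short.

```csharp
using System;
using System.Collections.Generic;

public static class InfixEvaluator
{
    private static int Precedence(char op) => (op == '+' || op == '-') ? 1 : 2;

    private static bool IsDigit(char c) => c >= '0' && c <= '9';

    // Pops one operator and two operands, applies it, and pushes the result.
    private static void ApplyTop(Stack<double> values, Stack<char> ops)
    {
        char op = ops.Pop();
        double right = values.Pop();
        double left = values.Pop();
        switch (op)
        {
            case '+': values.Push(left + right); break;
            case '-': values.Push(left - right); break;
            case '*': values.Push(left * right); break;
            case '/': values.Push(left / right); break;
            default: throw new FormatException($"Unexpected operator '{op}'.");
        }
    }

    public static double Evaluate(string expression)
    {
        var values = new Stack<double>();
        var ops = new Stack<char>();
        int i = 0;
        while (i < expression.Length)
        {
            char c = expression[i];
            if (char.IsWhiteSpace(c)) { i++; }
            else if (IsDigit(c))
            {
                long number = 0;
                while (i < expression.Length && IsDigit(expression[i]))
                {
                    number = number * 10 + (expression[i] - '0');
                    i++;
                }
                values.Push(number);
            }
            else if (c == '(') { ops.Push(c); i++; }
            else if (c == ')')
            {
                while (ops.Peek() != '(') ApplyTop(values, ops);
                ops.Pop(); // discard the matching '('
                i++;
            }
            else if (c == '+' || c == '-' || c == '*' || c == '/')
            {
                // Left-associative: apply pending operators of equal or higher precedence.
                while (ops.Count > 0 && ops.Peek() != '(' &&
                       Precedence(ops.Peek()) >= Precedence(c))
                    ApplyTop(values, ops);
                ops.Push(c);
                i++;
            }
            else throw new FormatException($"Unexpected character '{c}'.");
        }
        while (ops.Count > 0) ApplyTop(values, ops);
        return values.Pop();
    }
}
```

For example, `InfixEvaluator.Evaluate("(2+3)*(8*7)")` yields 280.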
Ways to construct an expression
My original idea was plain and simple: construct an expression by a random process. That is, operands and operators are generated at random and written into a string, parentheses are then inserted into the string at random, and finally the string expression is evaluated.
However, if parentheses are inserted into the string at random, matching the opening and closing parentheses becomes cumbersome. So I had two ideas:
1. Suffix expression -> infix expression
2. Infix expression -> suffix expression
That is, either construct the suffix expression, calculate its value, and then convert it to an infix expression for display,
or construct the infix expression directly, display it, and then convert it to a suffix expression for the computation.
At first, so that the parentheses would be matched and reasonable (that is, only necessary parentheses appear), I planned to construct the suffix expression at random. But the randomness of expressions built this way is too low, so generating a large number of them leads to serious duplication. In addition, random numbers easily repeat when the interval between generating two of them is too short.
So I decided to construct a valid infix expression directly.
So how can infix expressions be generated quickly and in sufficient quantity? After thinking it over again and again, I came up with a splitting method, which works as follows:
1. Generate an operand.
2. Split the operand into two operands and an operator.
3. Randomly pick an operand and keep splitting until the expected number of operators is reached (this number is also randomly generated).
In fact, such an algorithm fits the real needs of elementary school students, because it not only generates truly random expressions but also keeps the value of every expression within the range of numbers that pupils know.
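The splitting steps can be sketched as follows. This is a simplified illustration under several assumptions: operands are plain integers in [0, range), only + and - are handled (splitting for * and / would additionally need divisor logic), and the names `SplitNode` and `SplitGenerator` are invented here.

```csharp
using System;
using System.Collections.Generic;

public sealed class SplitNode
{
    public int Value;           // value of this subtree
    public char Op;             // '\0' for a leaf operand
    public SplitNode Left, Right;

    public override string ToString() =>
        Op == '\0' ? Value.ToString() : $"({Left} {Op} {Right})";
}

public static class SplitGenerator
{
    // Starts from one operand and repeatedly splits a random leaf into
    // "a op b" with the same value, keeping every operand in [0, range).
    public static SplitNode Generate(int range, int operatorCount, Random rand)
    {
        if (range < 1) throw new ArgumentOutOfRangeException(nameof(range));

        var root = new SplitNode { Value = rand.Next(range) };
        var leaves = new List<SplitNode> { root };
        for (int k = 0; k < operatorCount; k++)
        {
            int index = rand.Next(leaves.Count);
            SplitNode leaf = leaves[index];
            int v = leaf.Value;
            int a, b;
            if (rand.Next(2) == 0)
            {
                leaf.Op = '+';
                a = rand.Next(v + 1);          // a in [0, v]
                b = v - a;
            }
            else
            {
                leaf.Op = '-';
                a = v + rand.Next(range - v);  // a in [v, range - 1]
                b = a - v;
            }
            leaf.Left = new SplitNode { Value = a };
            leaf.Right = new SplitNode { Value = b };
            leaves[index] = leaf.Left;         // the split node is no longer a leaf
            leaves.Add(leaf.Right);
        }
        return root;
    }

    // Recomputes the value from the leaves, to confirm splitting preserved it.
    public static int Evaluate(SplitNode n) =>
        n.Op == '\0' ? n.Value
        : n.Op == '+' ? Evaluate(n.Left) + Evaluate(n.Right)
        : Evaluate(n.Left) - Evaluate(n.Right);
}
```

Because every split preserves the value of the node it replaces, the whole expression always evaluates to the first operand, and every intermediate result stays inside the range.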
But here are the problems:
1. Operands and operators are generated from an operand, which amounts to working backward from the result. So what should the first operand be set to? If it must lie within the range, isn't that too limiting? And if it does not lie within the range, won't it be hard to keep the expression legal?
2. Since the operands and operators come from decomposing an operand, the freedom of the operations is inevitably reduced.
Moreover, this is still very inefficient, because the expression value limits the number of formulas that can be generated: the repetition rate becomes high, the cost is too great, and the probability of generating a legal formula (that is, one in which all numbers are within the predetermined range) is not high.
So this algorithm was abandoned. Below is the algorithm I am using now. Of course it has a number of drawbacks: it does not generalize well (it cannot be applied to formulas with more operators), and it cannot produce a large share of the possible arithmetic expressions. But it has one big advantage: the construction process is simple and highly efficient.
Repetition detection and avoidance
Note: After asking Mr. Luo, I got a strict definition of repetition. It covers only expressions that differ under the commutative law, so those are the only differences we need to consider.
With regard to repetition detection, my earlier thinking was:
Each formula corresponds to a binary tree. We only need to construct a unique code for the binary tree of each formula; for every parent node that is + or *, swap its subtrees to construct the symmetric tree; and then add the codes of all these symmetric trees to a hash table to detect duplication.
However, this has a fairly large drawback: a hash signature must be computed for every binary tree, and when the range is large the hash code becomes very large, which drags down performance. So in the end I rejected the idea. (But later I realized that even with ulong the range could be extended to about 200, so this is actually feasible. Especially after reading Shin's blog on the minimal representation of a tree, I see that method is more adaptable. This was a design mistake on my part: I rejected a scheme without trying it in practice.)
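For comparison, here is a minimal sketch of the canonical-form idea: instead of storing the codes of every symmetric tree, each tree is normalized so that the children of + and * nodes always appear in a fixed order, and only that one key is stored. This is a string-based illustration in the spirit of the minimal representation, not the numeric hash described above and not necessarily the method from the blog mentioned; the names `ExprNode` and `Canonical` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class ExprNode
{
    public string Operand;      // set for leaves, e.g. "3" or "1/2"
    public char Op;             // '+', '-', '*', '/' for inner nodes
    public ExprNode Left, Right;

    public static ExprNode Leaf(string operand) => new ExprNode { Operand = operand };

    public static ExprNode Binary(char op, ExprNode left, ExprNode right) =>
        new ExprNode { Op = op, Left = left, Right = right };
}

public static class Canonical
{
    // Returns a key that is identical for two trees differing only by
    // swapped children of + or * nodes (the commutative law).
    public static string Key(ExprNode node)
    {
        if (node.Left == null) return node.Operand;

        string l = Key(node.Left);
        string r = Key(node.Right);
        bool commutative = node.Op == '+' || node.Op == '*';
        if (commutative && string.CompareOrdinal(l, r) > 0)
            (l, r) = (r, l);    // always put the smaller key first
        return "(" + l + node.Op + r + ")";
    }
}
```

A `HashSet<string>` of keys then detects repeats: `seen.Add(Canonical.Key(expr))` returns false when an equivalent expression was already generated.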
So I began to think about an idealized approach: how to avoid repetition at construction time.
My thought was: since the upper limit is three operators, we can simply use all three operators in every expression.
So the algorithm I use now is a simple one:
 First, a large number of binary expressions (two operands, one operator) are generated, with roughly the same number for each of the four operations (in the end I used 1/20 of the target count as the number of binary expressions to generate).
The logic for generating binary expressions is as follows:
1. + generation: the generated binary expressions are placed into the Add array, and each newly generated formula is checked against the earlier ones for repetition. The comparison requires overloading the == operator.
2. - generation: the generated binary expressions are placed into the Sub array. To ensure that the result of the subtraction is not negative, if the first number is smaller than the second, the two numbers are swapped.
3. * generation: the generated binary expressions are placed into the Mult array.
4. / generation: the generated binary expressions are placed into the Div array. To ensure that the divisor is not 0, a divisor of 0 is changed to 1.
 Next, binary expressions are combined into four-operand expressions. For example, in (2+3) * (8*7), both 2+3 and 8*7 are binary expressions. To avoid repetition in the sense of the commutative law, the following rules apply when generating a four-operand expression:
1. If the operator is * or +, the position of the second binary expression in the array must not be smaller than that of the first.
2. If the operator is - and the result is negative, the order of the two binary expressions is reversed.
3. If the operator is / and the divisor is 0, the binary expression used as the divisor is reselected.
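The three rules above can be sketched as a single combination step. Several things here are assumptions for illustration: values are held as double rather than the project's Fraction type, all binary expressions sit in one indexed pool, and rule 1 is enforced by swapping the indices (the original might instead restrict the random choice). The names `BinaryExpr` and `FourOperandBuilder` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of a binary expression: its text and its value.
public sealed class BinaryExpr
{
    public string Text;
    public double Value;

    public BinaryExpr(string text, double value)
    {
        Text = text;
        Value = value;
    }
}

public static class FourOperandBuilder
{
    // Combines pool[i] and pool[j] with `op` according to the three rules.
    // Returns null when rule 3 requires the caller to pick another divisor.
    public static string Combine(IReadOnlyList<BinaryExpr> pool, int i, int j, char op)
    {
        if ((op == '+' || op == '*') && j < i)
            (i, j) = (j, i);    // rule 1: the second index is never the smaller one
        if (op == '-' && pool[i].Value - pool[j].Value < 0)
            (i, j) = (j, i);    // rule 2: keep the difference non-negative
        if (op == '/' && pool[j].Value == 0)
            return null;        // rule 3: divisor is 0, reselect
        return $"({pool[i].Text}) {op} ({pool[j].Text})";
    }
}
```

Because + and * only ever see index pairs with i <= j, the pair (2+3, 8*7) can appear in one order only, so its commutative twin is never produced and no lookup is needed for this kind of repeat.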
In fact, this algorithm could also produce more kinds of expressions, for example by combining a binary expression with an operand to make a three-operand expression, and then combining a three-operand expression with another operand to make a four-operand one. But since our assignment does not actually demand a very large number of expressions, I did not implement this. A calculation shows that four-operand formulas account (in most cases) for (r^2 - 1)/r^2 of the total number of expressions, so the four-operand formulas alone are enough to meet the needs.
Project Test and summary
Arithmetic Individual Project Reflection summary