Design and implementation of expression computing: a case study


Author Profile

Liu Yuan, male, is a software engineer. You can contact the author at yliu@guanghua.sh.cn.

Origin of the problem


In a data collection system I built earlier for network devices and hosts, some collected data had to be computed before being saved to the database, rather than stored as raw values. To give users as much flexibility as possible, I envisioned a user interface in which they could enter computing expressions (calculation formulas). Users would then only need to follow a small number of rules and would otherwise be free to define the computation themselves.

What characterizes such expressions? They are generally not pure expressions that can be computed immediately (for example, 1 + 2 * 3 - 4). They contain elements called variables, which usually have a special internal syntax. For example, "@totalmemory" might denote the total physical memory of a device or host (hereinafter simply "device"), and the expression "(@totalmemory - @freememory) / @totalmemory * 100" would then give the device's current memory usage as a percentage. Tied to an alarm system that issues a warning when the value exceeds 80, this becomes genuinely useful. Different kinds of collected data may need different computations before they are stored. In every case, however, the variables must be replaced with concrete values (the values actually collected) before the final result is evaluated; otherwise the expression cannot be computed. This substitution happens at run time.

A general problem
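As a rough illustration of that run-time substitution, the sketch below replaces variable tokens with collected values in an already tokenized expression. It assumes that variables are represented as strings beginning with "@" and that collected values arrive in a map; the class and method names are hypothetical and are not part of the framework described in this article.

```java
import java.util.Map;

// Hypothetical helper: replaces variable tokens with collected values.
class VariableSubstitution {

    // Returns a copy of expr in which every String token starting with "@"
    // is replaced by its value from the map. Other tokens are copied as-is.
    static Object[] substitute(Object[] expr, Map<String, Number> values) {
        Object[] result = new Object[expr.length];
        for (int i = 0; i < expr.length; i++) {
            Object element = expr[i];
            if (element instanceof String && ((String) element).startsWith("@")) {
                Number value = values.get(element);
                if (value == null) {
                    throw new IllegalStateException("No value collected for " + element);
                }
                result[i] = value;
            } else {
                result[i] = element;
            }
        }
        return result;
    }

    // The memory usage formula from the text, computed directly for comparison.
    static double memoryUsagePercent(double total, double free) {
        return (total - free) / total * 100;
    }
}
```

With 8192 units of total memory and 2048 free, the formula gives 75, which is below the alarm threshold of 80 mentioned above.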


I think expression computing is a general topic, and not a new one. We run into it in many places. I wrote an expression conversion and computing program when I was a student. I have also seen report systems, whether standalone or embedded in MIS or financial software, and many of them support calculation formulas. I suspect the formulas in those systems are much like the ones I was dealing with. I met this problem in the data collection project and may well meet it again in other projects. Since it comes up more than once, I wanted to find a general solution.

Some existing designs and implementations cannot meet the requirements


I was not satisfied with the design and implementation of my first version, so I spent some time searching the Internet (keywords: "expression postfix" or "expression infix postfix"). Disappointingly, the programs I found for expression conversion and calculation did not meet my requirements. Many of them support only pure expressions that can be computed immediately and do not support variables. In addition, parsing and calculation are tightly coupled, which makes them hard to extend: adding a new operator (or a new variable syntax) almost always means changing the source code. In my opinion this is the biggest defect (in fact, proud as I once was of the conversion and computing program I wrote as a student, I now see that it has the same defect). On the other hand, expression conversion and evaluation have mature, classic stack-based algorithms that are covered in many computer science books and textbooks. The expressions people naturally write are in infix form. The classic process first converts the infix expression into a postfix (suffix) expression and then evaluates the postfix expression. I decided to keep this classic process and its algorithms.

My design ideas and goals


Since the core algorithms for expression conversion and evaluation are mature, I was eager to extract them and strip away their coupling to parsing. When something has a reasonably complete meaning of its own and is independent, there is a natural need to separate it out, and that separation can be expressed in the form of the design. This process is inseparable from abstraction. I soon realized that I was actually designing a small framework for expression computing.

The expression must support the four basic operators: addition, subtraction, multiplication, and division. It may also need to support squaring, square root (sqrt), and trigonometric operators such as sin and cos. What about other operations, including custom operators? Can I be sure I have thought of everything? Custom operators are a real and entirely reasonable requirement. In the data collection system, I once considered introducing a diff operator to express the difference between the two most recent collections of the same cumulative quantity (that is, the change over one collection cycle). This line of thinking led me to decide that the operator design must be open: users (here meaning user programmers, as in the rest of this article) can extend it by adding new operators.

The expression can contain variables, and variables are supported throughout parsing, conversion, and calculation. In the parsing phase, users should be allowed to use whatever variable syntax suits them. I should not hard-wire variable recognition to one specific syntax in advance.

Operators are extensible and the variable syntax is unknown in advance. Even basic numeric literals (such as 123, 12.3456, or 1.21e17) could in theory map to various types and precisions (Integer, Long, Float, Double, BigInteger, BigDecimal). Together, these facts mean that no single fixed parsing method can be provided, so expression parsing also has to be extensible. The ideal result would be an easy-to-use, extensible parsing framework on which support for new operators and variable syntaxes can be built.

From an abstract perspective, the expressions I intend to support are made up of only four kinds of elements: brackets (left and right), operators, values, and variables. An expression string supplied by an end user is parsed into an internal representation that is easier to process, and every element of that representation is one of these four kinds.

Value


At first, I wrote a class to represent numeric values, called Numeral. I agonized over whether Numeral should represent an integer, a single-precision floating-point number, or a double-precision one. Vaguely, I wanted it to be able to hold a value of any of these types and precisions, yet also to state the specific type and precision clearly when needed. I even felt that Numeral should be able to represent BigInteger and BigDecimal values (imagine having to parse and compute an expression whose values need a range or precision that long or double cannot accommodate); otherwise we would be in trouble in special circumstances. As for extensibility, the numeric class is unlike the operator class: it should be mature and should almost never need to be extended.

After repeated attempts and some confusion (Numeral was modified and even rewritten several times), I found a sensible way out: use java.lang.Number (hereinafter Number) directly as the numeric class. Number is actually an abstract class, and fortunately Integer, Long, Float, Double, BigInteger, BigDecimal, and the other Java numeric classes all extend it. Users choose the concrete Number subclass, and therefore the type and precision, and that choice is theirs to control; I should not fix the numeric type and precision in advance. Representing values as Number seemed the best option: it costs the least and keeps considerable flexibility. As a by-product, the Numeral class was discarded.
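To make the point concrete, here is a minimal sketch of code that handles values only through the Number type while the caller decides the concrete type and precision. The class and method names are hypothetical; only the java.lang and java.math classes are real.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo: values of different types handled uniformly as Number.
class NumberDemo {

    // Describes a value using nothing but its Number view.
    static String describe(Number n) {
        return n.getClass().getSimpleName() + " " + n;
    }

    // Widens any supported Number to BigDecimal when extra range or precision is needed.
    static BigDecimal toBigDecimal(Number n) {
        if (n instanceof BigDecimal) {
            return (BigDecimal) n;
        }
        if (n instanceof BigInteger) {
            return new BigDecimal((BigInteger) n);
        }
        if (n instanceof Integer || n instanceof Long) {
            return BigDecimal.valueOf(n.longValue());
        }
        // Float, Double and any other subclass go through their double value.
        return BigDecimal.valueOf(n.doubleValue());
    }
}
```

The framework itself never needs to know which subclass it is holding; only the operators that compute with the values do.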

Brackets


The role of brackets in an expression cannot be ignored: they override the natural precedence of operators so that the computation follows the order the user wants. I use the Bracket class to represent them. This class can be regarded as final because it does not need to be extended. Brackets are either left or right brackets, which I model as two static final instances of the Bracket class (the only two instances of the class, since the constructor is private).

public class Bracket
{
    private String name;

    // Private constructor: the two constants below are the only instances.
    private Bracket(String name) {
        this.name = name;
    }

    public static final Bracket
        LEFT_BRACKET = new Bracket("("),
        RIGHT_BRACKET = new Bracket(")");

    public String toString() {
        return name;
    }
}


Operator


The operator design has to be open, which almost immediately means that operators must be abstract. I hesitated between defining Operator as an interface or as an abstract class, and in the end chose an abstract class.

public abstract class Operator
{
    private String name;

    protected Operator(String name) {
        this.name = name;
    }

    // Number of operands the operator requires (1 = unary, 2 = binary, ...).
    public abstract int getDimension();

    // Computes the result from the operands, starting at operands[offset].
    public abstract Number eval(Number[] operands, int offset);  // throws ArithmeticException ?

    public Number eval(Number[] operands) {
        return eval(operands, 0);
    }

    public String toString() {
        return name;
    }
}


The Operator class has two main interface methods. getDimension() reports the operator's arity, that is, how many operands it requires. The most common operators are obviously unary and binary. This method also appears to allow operators with more than two operands, but I have not looked closely at such operators, and I am not sure whether the stack-based conversion and evaluation algorithms fully support them. Despite this concern, I have kept the method as it is.

The principal interface method is eval(), which is the operator's computing interface and reflects what an operator essentially is. All the operands it needs are passed to this method; an n-ary operator needs exactly n operands. The method then performs the computation that matches the operator's meaning and returns the result. To add a new operator, you extend Operator and implement these methods.
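A minimal sketch of what two concrete operators might look like follows. AddOperator and SqrtOperator are hypothetical names, they compute on double values only for brevity, and the Operator class is repeated in reduced form just so that the example compiles on its own.

```java
// Reduced copy of the abstract class described in the article.
abstract class Operator {
    private final String name;

    protected Operator(String name) {
        this.name = name;
    }

    public abstract int getDimension();

    public abstract Number eval(Number[] operands, int offset);

    public Number eval(Number[] operands) {
        return eval(operands, 0);
    }

    public String toString() {
        return name;
    }
}

// Hypothetical binary operator: addition on double values.
class AddOperator extends Operator {
    AddOperator() {
        super("+");
    }

    public int getDimension() {
        return 2;
    }

    public Number eval(Number[] operands, int offset) {
        return Double.valueOf(operands[offset].doubleValue() + operands[offset + 1].doubleValue());
    }
}

// Hypothetical unary operator: square root.
class SqrtOperator extends Operator {
    SqrtOperator() {
        super("sqrt");
    }

    public int getDimension() {
        return 1;
    }

    public Number eval(Number[] operands, int offset) {
        return Double.valueOf(Math.sqrt(operands[offset].doubleValue()));
    }
}
```

Neither the conversion nor the evaluation step has to change when such a class is added; they only see getDimension() and eval().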

Variable

In a sense, a variable is a "value to be determined". Should I design a Variable class (or interface)? At first I did. But when is a variable replaced by a concrete value, and how? I cannot know, and that process should be left to users. Since I know almost nothing about variables, a Variable class would carry little meaning. Keeping such a class or interface would also impose a restriction on users, who would have to extend or implement it, so I soon discarded it. I simply state and stick to this rule: in an expression, any element that is not a bracket, a value, or an operator is treated as a variable.

Expression parsing interface


The basic problem expression parsing has to solve is this: given the expression string supplied by the user, identify the values, operators, brackets, and variables in it, and convert it into an internal form that is easy to process in later steps. I provide a general expression parsing interface, shown below.

public interface Parser
{
    Object[] parse(String expr) throws IllegalExpressionException;
}


This interface defines a single method, parse(). It takes the expression string as input and returns an Object[] array as the parsing result. If parsing succeeds, each element of the array is a Number, an Operator, or a Bracket; any element that is none of these three is treated as a variable.

Perhaps this parsing interface is too general. It does not look very helpful to users because it sidesteps the key question of how to parse. In my opinion, parsing expressions is a complex and even difficult problem.

The main difficulty is that no ready-made parsing implementation can suit every kind of expression. Users may add new operators, introduce new variable syntax, and even need numeric values of different types and precision. As mentioned above, an expression parsing framework that users could easily extend would be ideal, but I have not fully achieved this. A default parser implementation (SimpleParser) is described later. It attempts to establish such a framework, though I think it has some limitations.

Converting an infix expression to postfix


This is done by the Converter class. I was able to isolate the conversion algorithm (and the calculation algorithm that follows) so that it does not depend on how operators or variables are extended, thanks to the groundwork described above: the analysis and abstraction of the expression elements (values, brackets, operators, and variables).

The basic algorithm is as follows (it can be found online or in textbooks; I changed it slightly to support variables). Create a working stack and an output queue, and read the expression from left to right. When a value or variable is read, send it straight to the output queue. When an operator T is read, pop every operator on the stack whose priority is higher than or equal to that of T (stopping at a left bracket) to the output queue, then push T onto the stack. When a left bracket is read, always push it onto the stack. When a right bracket is read, pop the operators above the nearest left bracket on the stack to the output queue in order, then discard that left bracket. When the input is exhausted, pop any operators remaining on the stack to the output queue. In the Converter class, the core method convert() carries out this algorithm: its input is an infix expression and its output is a postfix expression.

import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

public abstract class Converter
{
    // Compares the priority of two operators: negative, zero, or positive
    // when op1 has lower, equal, or higher priority than op2.
    public abstract int precedenceCompare(Operator op1, Operator op2)
            throws UnknownOperatorException;

    public Object[] convert(Object[] infixExpr)
            throws IllegalExpressionException, UnknownOperatorException
    {
        return convert(infixExpr, 0, infixExpr.length);
    }

    public Object[] convert(Object[] infixExpr, int offset, int len)
            throws IllegalExpressionException, UnknownOperatorException
    {
        if (infixExpr.length - offset < len)
            throw new IllegalArgumentException();

        // Output queue that stores the resulting postfix expression
        List<Object> output = new ArrayList<Object>();
        // Working stack
        Stack<Object> stack = new Stack<Object>();

        int currInputPosition = offset;  // current position in the input
        System.out.println("----------- Begin conversion procedure --------------");  // temp!

        while (currInputPosition < offset + len)
        {
            Object currInputElement = infixExpr[currInputPosition++];

            if (currInputElement instanceof Number)
            {
                // A value goes straight to the output
                output.add(currInputElement);
                System.out.println("number:" + currInputElement);  // temp!
            }
            else if (currInputElement instanceof Bracket)
            {
                // A bracket is either pushed or matched
                Bracket currInputBracket = (Bracket) currInputElement;
                if (currInputBracket.equals(Bracket.LEFT_BRACKET))
                {
                    // A left bracket is always pushed
                    stack.push(currInputElement);
                }
                else
                {
                    // Right bracket: pop operators until the matching left bracket
                    Object stackElement;
                    do
                    {
                        if (!stack.empty())
                            stackElement = stack.pop();
                        else
                            throw new IllegalExpressionException("bracket(s) mismatch");
                        if (stackElement instanceof Bracket)
                            break;
                        output.add(stackElement);
                        System.out.println("operator popup:" + stackElement);  // temp!
                    } while (true);
                }
            }
            else if (currInputElement instanceof Operator)
            {
                Operator currInputOperator = (Operator) currInputElement;
                // Pop every operator whose priority is higher than or equal to
                // the current one, until the stack is empty or a left bracket is on top
                while (!stack.empty())
                {
                    Object stackElement = stack.peek();
                    if (stackElement instanceof Bracket)
                    {
                        break;  // must be a left bracket; it is not popped
                    }
                    Operator stackOperator = (Operator) stackElement;
                    if (precedenceCompare(stackOperator, currInputOperator) >= 0)
                    {
                        // Higher or equal priority: pop to the output queue
                        stack.pop();
                        output.add(stackElement);
                        System.out.println("OPERATOR:" + stackElement);  // temp!
                    }
                    else
                    {
                        break;  // lower priority: stop popping
                    }
                }
                // Push the current operator
                stack.push(currInputElement);
            }
            else  // if (currInputElement instanceof Variable)
            {
                // Anything else is treated as a variable and goes straight to the output
                output.add(currInputElement);
                System.out.println("variable:" + currInputElement);  // temp!
            }
        }

        // Move the elements left on the stack to the output queue
        while (!stack.empty())
        {
            Object stackElement = stack.pop();
            // Fix: a left bracket remaining here was never closed
            if (stackElement instanceof Bracket)
                throw new IllegalExpressionException("bracket(s) mismatch");
            output.add(stackElement);
            System.out.println("Left stack OPERATOR:" + stackElement);  // temp!
        }
        System.out.println("------------ end conversion procedure --------------");  // temp!

        return output.toArray();
    }
}
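For experimenting with the algorithm without the rest of the framework, the following sketch restates it on plain string tokens. It is a simplification, not the Converter class itself: tokens are strings, the precedence levels of the four basic operators are hard-coded as an assumption, and the class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-alone demo of infix-to-postfix conversion on string tokens.
class InfixToPostfixDemo {

    // Assumed precedence levels; -1 means "not an operator".
    static int level(String token) {
        switch (token) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            default: return -1;
        }
    }

    static List<String> convert(String[] infix) {
        List<String> output = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String token : infix) {
            if (token.equals("(")) {
                stack.push(token);
            } else if (token.equals(")")) {
                // Pop operators until the matching left bracket is found.
                while (!stack.isEmpty() && !stack.peek().equals("(")) {
                    output.add(stack.pop());
                }
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("bracket(s) mismatch");
                }
                stack.pop(); // discard the left bracket
            } else if (level(token) > 0) {
                // Pop operators of higher or equal priority; "(" has level -1 and stops the loop.
                while (!stack.isEmpty() && level(stack.peek()) >= level(token)) {
                    output.add(stack.pop());
                }
                stack.push(token);
            } else {
                output.add(token); // value or variable
            }
        }
        while (!stack.isEmpty()) {
            String top = stack.pop();
            if (top.equals("(")) {
                throw new IllegalArgumentException("bracket(s) mismatch");
            }
            output.add(top);
        }
        return output;
    }
}
```

For the input tokens 1 + 2 * 3 - 4 this yields 1 2 3 * + 4 -, and the memory usage expression from the beginning of the article becomes @totalmemory @freememory - @totalmemory / 100 *.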


Readers may quickly notice that Converter is not a concrete class. If the algorithm is mature and stable and has been made independent, why is Converter not a stable concrete class? During conversion, operator precedence is a problem that cannot be avoided. By convention, where brackets do not explicitly fix the order of calculation, the order is determined by comparing the precedence of the operators. But I cannot know how large a user's operator set is, or what the precedence order is between any two of its operators. Only the user can supply this knowledge, or more precisely, supply it to the Converter class. Therefore Converter declares an abstract operator comparison method, precedenceCompare(), which the user implements.

For a while I was puzzled about how to verify that an expression is valid. I realized that a successful conversion does not mean the expression is syntactically valid, and even if the resulting postfix expression is then evaluated successfully, that still does not prove the original expression was valid. Of course, when conversion or calculation fails, for example because the number of operands does not match what an operator requires or because left and right brackets do not match, the original expression is certainly invalid. Proving that an expression is valid, however, requires much stricter conditions, and unfortunately I did not find a theoretical basis for such verification.

Evaluating the postfix expression
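One simple way a user might supply that knowledge is a table of precedence levels keyed by operator name. The sketch below is only an assumption about how precedenceCompare() could be backed: PrecedenceTable is a hypothetical class, it identifies operators by their toString() value, and UnknownOperatorException is redeclared here so that the example compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the exception named in the article, declared so the sketch is self-contained.
class UnknownOperatorException extends Exception {
    UnknownOperatorException(String message) {
        super(message);
    }
}

// Hypothetical precedence table keyed by operator name.
class PrecedenceTable {
    private final Map<String, Integer> levels = new HashMap<>();

    // Registers an operator name with a level; a higher level binds tighter.
    PrecedenceTable register(String operatorName, int level) {
        levels.put(operatorName, level);
        return this;
    }

    // Negative, zero, or positive when op1 has lower, equal, or higher priority than op2.
    int compare(Object op1, Object op2) throws UnknownOperatorException {
        return Integer.compare(levelOf(op1), levelOf(op2));
    }

    private int levelOf(Object op) throws UnknownOperatorException {
        Integer level = levels.get(op.toString());
        if (level == null) {
            throw new UnknownOperatorException("Unknown operator: " + op);
        }
        return level;
    }
}
```

A Converter subclass could then implement precedenceCompare(op1, op2) by delegating to compare(op1, op2) on such a table, and registering a new operator would not require touching the conversion code.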


This is done by the Calculator class. Its core method is eval(), and the argument passed to it must be a postfix expression. Any variables in the expression must be replaced with their values before this method is called; otherwise the expression cannot be calculated and an IncalculableExpressionException is thrown. The basic algorithm is as follows: create a working stack and read the expression from left to right. When a value is read, push it onto the stack. When an operator is read, pop n values from the stack, where n is the operator's arity, compute the result, and push the result back onto the stack. Repeat until the input is exhausted; the single value left on the stack is the result.

import java.util.Stack;

public class Calculator
{
    public Number eval(Object[] postfixExpr) throws IncalculableExpressionException
    {
        return eval(postfixExpr, 0, postfixExpr.length);
    }

    public Number eval(Object[] postfixExpr, int offset, int len)
            throws IncalculableExpressionException
    {
        if (postfixExpr.length - offset < len)
            throw new IllegalArgumentException();

        Stack<Object> stack = new Stack<Object>();
        int currPosition = offset;
        while (currPosition < offset + len)
        {
            Object element = postfixExpr[currPosition++];
            if (element instanceof Number)
            {
                // A value is pushed onto the stack
                stack.push(element);
            }
            else if (element instanceof Operator)
            {
                Operator op = (Operator) element;
                int dimensions = op.getDimension();
                if (dimensions < 1 || stack.size() < dimensions)
                    throw new IncalculableExpressionException(
                        "lack operand(s) for operator '" + op + "'");

                // Pop the operands in reverse so that operands[0] is the leftmost one
                Number[] operands = new Number[dimensions];
                for (int j = dimensions - 1; j >= 0; j--)
                {
                    operands[j] = (Number) stack.pop();
                }
                stack.push(op.eval(operands));
            }
            else
            {
                // Typically a variable that was not replaced by a value
                throw new IncalculableExpressionException("Unknown element: " + element);
            }
        }

        if (stack.size() != 1)
            throw new IncalculableExpressionException("redundant operand(s)");

        return (Number) stack.pop();
    }
}
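The same evaluation loop can be tried in isolation with the simplified sketch below. It is not the Calculator class: it works on string tokens, supports only the four basic binary operators on double values, and uses a hypothetical class name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-alone demo of postfix evaluation on string tokens.
class PostfixEvalDemo {

    static double eval(String[] postfix) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : postfix) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    if (stack.size() < 2) {
                        throw new IllegalArgumentException(
                                "lack operand(s) for operator '" + token + "'");
                    }
                    double right = stack.pop(); // popped first, so it is the right operand
                    double left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    // Anything else must be a numeric literal in this sketch.
                    stack.push(Double.parseDouble(token));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("redundant operand(s)");
        }
        return stack.pop();
    }

    static double apply(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;
        }
    }
}
```

Evaluating 1 2 3 * + 4 - gives 3, the same value as the infix expression 1 + 2 * 3 - 4. With 8192 and 2048 substituted for the memory variables, 8192 2048 - 8192 / 100 * gives 75.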


Default implementation


That completes the design of expression computing. A good design and implementation usually ships with some default implementations. Here I provide basic implementations of the four arithmetic operators and a default parser implementation (SimpleParser).

Operator


The default implementation provides the four basic operators: addition, subtraction, multiplication, and division.

Note that each of these default operators currently supports only operands that are Integer, Long, Float, or Double. Another point to watch is how the type and precision of the result are determined when the operands differ in type and precision. The default implementation handles this to some extent.
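The article does not show how the default operators choose the result type. One plausible rule, sketched below purely as an assumption, is to follow Java's own binary numeric promotion: the result takes the wider of the two operand types. The class name is hypothetical.

```java
// Hypothetical sketch of result-type selection for mixed operand types.
class NumericPromotion {

    // Adds two numbers; the result type is the wider of the two operand types
    // (Double over Float over Long over Integer), as in Java's numeric promotion.
    static Number add(Number a, Number b) {
        if (a instanceof Double || b instanceof Double) {
            return Double.valueOf(a.doubleValue() + b.doubleValue());
        }
        if (a instanceof Float || b instanceof Float) {
            return Float.valueOf(a.floatValue() + b.floatValue());
        }
        if (a instanceof Long || b instanceof Long) {
            return Long.valueOf(a.longValue() + b.longValue());
        }
        if (a instanceof Integer && b instanceof Integer) {
            return Integer.valueOf(a.intValue() + b.intValue());
        }
        // BigInteger, BigDecimal and other subclasses are outside this sketch.
        throw new IllegalArgumentException("Unsupported operand types");
    }
}
```

Under this rule an Integer plus a Long yields a Long, and any operation involving a Double yields a Double. Other policies are possible, for example always computing in double.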

Parser


The default parser implementation is simple, hence the name SimpleParser. Its basic idea is to treat an expression as a sequence of brackets, values, operators, and variables, each of which can be parsed relatively independently. To this end, an expression element parser interface (ElementParser) is provided. SimpleParser calls four element parsers to do all the parsing work.

ElementParser is the element-level parsing interface. The four default element parser classes, BasicNumberParser, BasicOperatorParser, DefaultBracketParser, and DefaultVariableParser, all implement it.

public interface ElementParser
{
    Object[] parse(char[] expr, int off);
}


The parameters of parse() are the characters to be parsed and the starting offset. The returned result contains the element obtained by this parse (a Number, an Operator, a Bracket, or, for a variable, any other Object) together with the offset at which the parse ended. Ignoring whitespace, that ending offset is normally the starting offset of the next parse.

So when does SimpleParser call each element parser during parsing? My solution is to call the element parsers one after another at each position, a trial-and-error strategy. The order of the trials is deliberate: variable parser, then operator parser, then number parser, then bracket parser.

Why this order? It reflects a deep-seated worry of mine: parsing expressions can be quite complex. Some expressions cannot be parsed purely by "divide and conquer", because part of the expression has to be parsed as a whole. Consider a substring such as "Diff(@totalbytesreceived)". The user may mean the difference between the two most recent collected values of totalbytesreceived. Diff cannot really be understood as an operator in the traditional sense, and the most reasonable choice is probably to parse and handle "Diff(@totalbytesreceived)" as a single variable. Splitting it into "Diff", "(", "@totalbytesreceived", and ")" would be meaningless and wrong.

This is why the variable parser is called first: it lets the user intercept the input and define a parsing approach that goes beyond the conventional one when actual needs require it. More generally, I arranged for the parser of the part most likely to change (variables) to be called first and the parser of the part least likely to change (brackets) to be called last. At each position, once one parser succeeds, the remaining parsers are not called. If none of the element parsers can parse the expression string at some position, the expression cannot be parsed and an IllegalExpressionException is thrown.
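A minimal sketch of one element parser may make this contract easier to picture. The article does not specify the layout of the returned array or how failure is reported, so two assumptions are made here: the result is a two-element array holding the element and the ending offset, and null means that no element of this kind starts at the given offset. SimpleNumberParser is a hypothetical class, not the BasicNumberParser of the default implementation.

```java
// Copy of the interface from the article, so the sketch compiles on its own.
interface ElementParser {
    Object[] parse(char[] expr, int off);
}

// Hypothetical parser for unsigned decimal literals such as 123 or 12.5.
class SimpleNumberParser implements ElementParser {

    // Returns {Number value, Integer endOffset}, or null if no number starts at off.
    public Object[] parse(char[] expr, int off) {
        int pos = off;
        boolean seenDot = false;
        while (pos < expr.length) {
            char c = expr[pos];
            if (c >= '0' && c <= '9') {
                pos++;
            } else if (c == '.' && !seenDot) {
                seenDot = true;
                pos++;
            } else {
                break;
            }
        }
        String text = new String(expr, off, pos - off);
        if (text.isEmpty() || text.equals(".")) {
            return null;
        }
        // Literals with a decimal point become Double, the others Long (an arbitrary choice).
        Number value = seenDot ? (Number) Double.valueOf(text) : (Number) Long.valueOf(text);
        return new Object[] { value, pos };
    }
}
```

For the input 12.5+3, a call at offset 0 returns the Double 12.5 with ending offset 4, a call at offset 4 returns null because a plus sign is not a number, and a call at offset 5 returns the Long 3.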
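As an illustration of such whole-unit parsing, the sketch below recognizes either Diff(@name) or a plain @name as one variable. The variable syntax, the class name, the use of a regular expression, and the convention of returning a two-element array (or null on failure) are all assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Copy of the interface from the article, so the sketch compiles on its own.
interface ElementParser {
    Object[] parse(char[] expr, int off);
}

// Hypothetical variable parser that treats Diff(@name) as a single variable.
class DiffAwareVariableParser implements ElementParser {

    // Diff(@name) is tried before the plain @name form.
    private static final Pattern VARIABLE =
            Pattern.compile("Diff\\(\\s*@\\w+\\s*\\)|@\\w+");

    // Returns {String variable, Integer endOffset}, or null if no variable starts at off.
    public Object[] parse(char[] expr, int off) {
        Matcher matcher = VARIABLE.matcher(new String(expr));
        matcher.region(off, expr.length);
        if (!matcher.lookingAt()) {
            return null;
        }
        // end() is an index into the whole input, not relative to the region.
        return new Object[] { matcher.group(), matcher.end() };
    }
}
```

Because this parser consumes the brackets inside Diff(...) itself, the operator and bracket parsers never see them.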
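The trial strategy can be sketched as follows. This is not SimpleParser itself: operators and brackets are represented as plain strings, a failed attempt is signalled by null, IllegalArgumentException stands in for IllegalExpressionException, and all class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Copy of the interface from the article, so the sketch compiles on its own.
interface ElementParser {
    Object[] parse(char[] expr, int off);
}

// Hypothetical parser that tries its element parsers in registration order.
class ChainedParserSketch {
    private final List<ElementParser> parsers = new ArrayList<>();

    ChainedParserSketch add(ElementParser parser) {
        parsers.add(parser);
        return this;
    }

    Object[] parse(String expr) {
        char[] chars = expr.toCharArray();
        List<Object> elements = new ArrayList<>();
        int pos = 0;
        while (pos < chars.length) {
            if (Character.isWhitespace(chars[pos])) {
                pos++;
                continue;
            }
            Object[] result = null;
            for (ElementParser parser : parsers) {
                result = parser.parse(chars, pos);
                if (result != null) {
                    break; // first success wins; later parsers are not called
                }
            }
            if (result == null) {
                throw new IllegalArgumentException("Cannot parse at offset " + pos);
            }
            int next = (Integer) result[1];
            if (next <= pos) {
                throw new IllegalStateException("Element parser made no progress at " + pos);
            }
            elements.add(result[0]);
            pos = next;
        }
        return elements.toArray();
    }

    // Matches any single character contained in chars and returns it as a String.
    static ElementParser oneOf(String chars) {
        return (expr, off) -> chars.indexOf(expr[off]) >= 0
                ? new Object[] { String.valueOf(expr[off]), off + 1 } : null;
    }

    // Matches '@' followed by at least one letter or digit.
    static ElementParser variable() {
        return (expr, off) -> {
            if (expr[off] != '@') return null;
            int pos = off + 1;
            while (pos < expr.length && Character.isLetterOrDigit(expr[pos])) pos++;
            return pos == off + 1 ? null : new Object[] { new String(expr, off, pos - off), pos };
        };
    }

    // Matches a run of ASCII digits and returns it as a Long.
    static ElementParser digits() {
        return (expr, off) -> {
            int pos = off;
            while (pos < expr.length && expr[pos] >= '0' && expr[pos] <= '9') pos++;
            return pos == off ? null
                    : new Object[] { Long.valueOf(new String(expr, off, pos - off)), pos };
        };
    }
}
```

Registering the parsers in the order variable(), oneOf("+-*/"), digits(), oneOf("()") mirrors the sequence described above, and a user who needs a special syntax only has to replace the first parser in the chain.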

Extension implementation


For reasons of length, the extension implementation is not discussed here, but that does not mean none exists. In the data collection project mentioned above, whose basic aim was to support variables with a special syntax, I implemented a variable extension and also added some operators beyond the basic four. I believe this work demonstrates that the design is extensible, mainly in its operators and variables.

Summary


The requirements I set for expression computing are challenging, but not excessive. Even so, getting close to this goal took a great deal of design effort, and the design went through several rewrites. As mentioned above, I discarded the Numeral class and the Variable class, and there was more. I had also designed an Element class, with the expression represented internally as an Element[] array. The Element class used an enumeration-style field to indicate the kind of element it held (value, bracket, operator, or variable). I found this approach neither clear nor natural enough (at root, it was not object-oriented), so that class was discarded too, and Element[] was replaced by the more direct Object[].

What drove these repeated improvements was the wish to keep the design concise while still pursuing the other goals. Note that concise does not mean oversimplified. I hope my efforts have largely achieved this. I removed the major coupling, made the relatively stable part (expression conversion and calculation) independent, and opened up the parts that change (operators and variables). I still have regrets about expression parsing: the general parsing interface is too broad to give substantial help to users who want to extend it. Fortunately, the default parser implementation compensates for this to some extent.

Finally, I hope that this design and implementation of expression computing can be used and extended by others. I would like to see it stand the test of practice.
