Workflow Modeling: Failure and Exception Handling

Summary
Workflow modeling covers the representation of the processes in a workflow, their execution order and relationships, the agents involved in executing them, and the resources used during execution. Several techniques are used for workflow modeling. One of them uses the temporal object-oriented data model TF-ORM to represent a workflow model [20]. Failures and exceptions can cause serious problems during workflow execution, especially in applications with critical tasks. This paper shows how TF-ORM is used for workflow modeling, and how it allows several kinds of exceptions to be expressed and the information needed for system recovery to be defined.

1 Overview
A workflow, also called a business process, focuses on the coordination of activities in a business environment [26]. A workflow model is a logical representation of the work performed by an enterprise in order to achieve a specific goal. Each process that makes up a workflow may consist of several activities, with a specific execution order, different executing agents, and different locations. The representation of all this information is called workflow modeling. The model helps in understanding the entire process and makes it possible to identify likely problems [29]. Workflow modeling is not only applied in commercial enterprises; it can also be used in education, for example to implement a web-based course [22].
In some applications, such as those in health care, insurance, banking and e-commerce, understanding and implementing workflows is especially important because these applications are critical and cannot tolerate failure: they involve security or potentially life-threatening situations. Such characteristics require special attention to ensure (or at least improve) their safety. Analyzing the workflow makes it possible to identify likely exceptions and failures, and modeling them improves the safety of the whole system.
Several workflow modeling techniques have been proposed in recent studies, using different modeling paradigms: the proposal of the Workflow Management Coalition [30], the model of Casati, Ceri, Pernici and Pozzi [4], Joosten's Switch Model [17], the WAMO activity model [9], object-oriented models [19], and Petri nets [1, 12].
A workflow management system, also known as a workflow engine, controls the execution of all activities. The importance of workflow management is shown by the amount of recent research in this field, by a number of academic workflow engines [2, 14, 16, 27, 28], and by many commercial tools (such as Process Builder from Action Technologies and Lotus Notes).
Like any computing system, a workflow engine may run into problems during execution because of abnormal events or system faults. Experience shows that "the exception is not an exception: it occurs all the time". Recent research confirms that this issue is very important for workflows [10, 11, 13, 15, 23, 24, 25, 26]. If predictable exceptions are represented in the workflow model, the workflow engine can be written to handle them.
Because solving this problem is so important, the workflow engine needs specific information in order to provide failure recovery. When choosing a modeling tool, the ability to describe this recovery information in the workflow model is therefore an essential feature.
An alternative workflow modeling technique was proposed in [20], using the temporal object-oriented model TF-ORM [5, 6]. The behavior of a workflow process is defined as a class, which is instantiated each time the process is executed. The behavior is given by a finite state machine whose state transitions can be constrained by temporal logic expressions. Compared with traditional object-oriented models, the main difference of this approach is that roles are used to represent the activities that make up a process, and that these roles are also given conditional state transitions. The possible relationships between processes and activities can also be described explicitly.
The goal of this article is to show how the TF-ORM formalism allows possible exceptions, and the actions the workflow engine should take in response, to be represented at the model level. It also shows how the information described in a TF-ORM model can be exploited by recovery mechanisms.
The article is structured as follows: Section 1 gives a brief introduction to exceptions and failures in workflow systems. The formalism is then briefly presented in Section 2, followed by the representation of exceptions in TF-ORM (Section 3) and the information needed for failure recovery (Section 4).

1. Exceptions and failures
Two different kinds of problem can occur during workflow execution: exceptions and failures. Each of them requires different kinds of information for system recovery.
An exception is a semantic failure that may be caused by a system fault or by a new situation introduced by the external environment [11]. Examples of exceptions are a change of the agent responsible for an activity and an ad hoc change to the structure of a process [26]. In general, the workflow engine can handle an exception by executing a predefined behavior. However, it is impossible to identify in advance all exceptions that can occur during workflow execution.
A failure is a system fault caused by an error in a device, a communication facility, or a program. Because failures are unpredictable, they cannot be modeled. When such a problem occurs during the execution of an activity, its resolution is assigned to the agent responsible for the activity or for the whole process. To provide system recovery, information about current activities and past situations (logs) is required.
One of the main characteristics of a workflow system is the time it takes to execute a workflow, from the start of the first activity to the end of the last one. This interval may be quite long. When one activity of a workflow fails, the whole workflow should therefore not be aborted; instead, system recovery should be provided to preserve the work that has already been done.
Several possible workflow exceptions can be identified in the analysis phase, and the activities to be executed for system recovery can be expressed in the workflow modeling phase. However, representing every possible exception would produce a very complex workflow model. There is always a compromise between fidelity to reality and simplicity of the model: on the one hand we want the description to be as faithful as possible to reality (here, the workflow); on the other hand, less complex models are easier to understand and implement. A large number of represented exceptions, together with their recovery actions, yields a complex and incomprehensible model. It is sometimes best to describe only the important exceptions (those that have a major impact on workflow execution) and to provide information for failure recovery mechanisms to cover the rest.
Typical exceptions that can occur during workflow execution include:
For personal or professional reasons, the agent responsible for an activity cannot perform his role, and no replacement has been foreseen;
Because of some important fact identified during the process, the agent responsible for the whole process decides to change the workflow;
An activity is blocked, waiting for unavailable resources;
Loops (cyclic events) during workflow execution;
Deadlock: two activities each wait for an event that must be caused by the other. A deadlock between two activities may be direct or indirect (the latter is more difficult to detect);
A message sent to another activity is not acknowledged by its recipient; the message may have been lost without the sending activity knowing.

1.1. Exceptions and modeling errors
Because workflow representation methods have shortcomings, an important distinction must be made between exceptions and modeling errors that show up during workflow execution. To illustrate the difference, consider the following example, based on the "traveluck example" published in [18]. It concerns a travel agency. The workflow is executed when a customer of the agency wants to book a trip that includes plane tickets, a hotel reservation, and a rental car. Figure 1 shows the activities of this application as an activity graph, a notation often used to describe workflows: circles represent activities and arrows determine the order in which activities are executed. The workflow starts with the travel definition activity, in which the customer and the travel agent choose an itinerary. Once the itinerary is fixed, three parallel activities are started: flight booking, hotel booking, and car rental booking.
Each of these three activities has two mutually exclusive outcomes: the reservation is either confirmed or rejected. If any one of the three reservations is not confirmed, the travel agency contacts the customer (the OR operator denotes this combination of conditions). The next activity is activated only when all three reservations are confirmed (the AND operator): the travel itinerary is printed, and after payment the tickets and vouchers are issued, which ends the workflow.
In this example, the workflow contains a modeling error: when one of the reservation activities is not confirmed, the others may already have been confirmed and are not cancelled. Once the customer has decided how to resolve the booking problem, the workflow starts again and the already confirmed reservations are made a second time. The workflow should check whether any other reservation has already been made in this case.
If the customer does not pay, the workflow waits for the payment indefinitely. This is an exception that can be anticipated in this example, and the workflow model can provide some handling to avoid it.

2. Workflow modeling using TF-ORM
The workflow modeling method used in this article is TF-ORM (Temporal Functionality in Objects with Roles Model) [5, 6], a temporal object-oriented data model [20]. TF-ORM has proven to be an effective tool for workflow modeling for the following reasons:
It is a formal model that allows a complete representation of the workflow;
It allows structured and unstructured information (such as decisions) to be represented using the same formal model;
It can represent all possible synchronizations between activities;
It can represent communication between processes and activities;
The role concept allows an agent to assume different roles.
With this model, a workflow can be described by identifying the processes, agents and resources involved, each with its own static and dynamic properties. The methods of these objects are represented by the messages that each class can send and receive. In addition, the states that each object can assume are defined, and state transition rules describe the evolution of objects. These rules may be constrained by conditions written in temporal logic, which is an important feature of this modeling tool.
An important difference between this model and other object-oriented models is that TF-ORM classes can represent different behaviors by means of roles. The role concept complements the object-oriented paradigm [8]. It can represent the evolution of objects over time: an object can exhibit more than one behavior at the same time, each evolving independently, and there can also be multiple instances of the same behavior, which likewise evolve independently.
To create a workflow model with TF-ORM, each process is described by a process class. Each activity of a process is described as a role of the corresponding process class. The agents responsible for activities and processes are described as agent classes, whose roles represent the different parts each agent plays in processes and activities. Resources involved in execution are described as resource classes. All interaction between an agent and a resource takes place through an activity.
Object behavior is described by the state transition rules defined in each role. Each role has a set of states that objects in this role can assume, and a set of state transition rules that govern the evolution between these states.
Synchronization between the activities of a workflow (the roles of a process class) is also represented in TF-ORM through the state transition rules of the roles. An activity is activated by messages sent from other activities. Activation can mean not only the start of an activity but also the resumption of an activity that was suspended until some event allowed it to continue. Synchronization is represented by state transition rules of the following basic form:
ST(S1), MSG(m1) => MSG(m2), ST(SF);
    <transition condition>
This rule specifies the transition of activity A1 from the initial state S1 to the final state SF. The transition takes place only when A1 is in state S1, message m1 is received, and the transition condition is true; message m2 is then sent to the same activity, to another activity of the same process, or to an activity of another process. The last form can represent interactions between different processes, which occur frequently in workflows but which traditional workflow modeling tools usually cannot express.
The following variations of state transition rules are allowed:
(1) When the initial state is not defined, the transition applies to any state of the activity;
(2) When several input messages are listed, the transition is executed only when all of them have arrived;
(3) Several messages can be sent at the same time;
(4) When the final state is not defined, the transition only specifies input and output messages and does not change the state of the activity;
(5) The transition condition is optional: when it is not defined, the transition depends only on the initial state and the input messages.
The appendix shows the example above modeled with TF-ORM.
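The rule semantics listed above can be illustrated with a minimal Python sketch. This is not TF-ORM syntax or an implementation of any TF-ORM tool; the names Rule, Role, receive, inbox and sent are hypothetical and chosen only for this illustration. The sketch assumes that a rule fires as soon as its initial state matches, all of its input messages have arrived, and its optional condition holds.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """One state transition rule: ST(S1), MSG(in...) => MSG(out...), ST(SF)."""
    inputs: frozenset                  # input messages; all must have arrived (2)
    outputs: tuple = ()                # messages sent when the rule fires (3)
    initial: Optional[str] = None      # None: the rule applies in any state (1)
    final: Optional[str] = None        # None: the state is left unchanged (4)
    condition: Optional[Callable[["Role"], bool]] = None  # optional (5)


class Role:
    """A role (activity) with a current state and a list of transition rules."""

    def __init__(self, state, rules):
        self.state = state
        self.rules = list(rules)
        self.inbox = set()   # messages received but not yet consumed
        self.sent = []       # messages emitted so far, in order

    def receive(self, message):
        """Record an incoming message and fire the first enabled rule, if any."""
        self.inbox.add(message)
        for rule in self.rules:
            if self._enabled(rule):
                self.inbox -= rule.inputs
                self.sent.extend(rule.outputs)
                if rule.final is not None:
                    self.state = rule.final
                return True
        return False

    def _enabled(self, rule):
        if rule.initial is not None and rule.initial != self.state:
            return False
        if not rule.inputs <= self.inbox:
            return False
        return rule.condition is None or rule.condition(self)


# ST(S1), MSG(m1) => MSG(m2), ST(SF);
role = Role("S1", [Rule(inputs=frozenset({"m1"}), outputs=("m2",),
                        initial="S1", final="SF")])
fired = role.receive("m1")
```

After the call, the role is in state SF and m2 has been recorded as sent. Keeping unconsumed messages in an inbox is a design choice of this sketch so that rules with several input messages can wait for the remaining ones.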

2.1. Synchronization between activities
The activities of the processes that make up a workflow can be synchronized in several ways, which are represented in the TF-ORM model through state transition rules. Activities can be executed sequentially or in parallel, independently or in coordination.
Two activities are sequential when the second can be executed only after the first has completed. Figure 2 shows an activity A1 that moves from state S1 to state SF on receiving message m1. This transition triggers another activity, A2, by sending a message that creates a new instance of that activity (the add_role message). No transition condition is specified. If SF is the final state of the first activity, the two activities are sequential.
In the same example, if SF is not the final state of A1, the first activity continues to execute. It merely starts the second activity, and from then on the two activities run concurrently.
A similar case arises when an activity is temporarily suspended and resumes execution on receiving a specific message. In the example of Figure 2, the second activity would then be in a waiting state, and the output message of A1 would not be add_role but the message that the second activity is waiting for.
The execution of an activity may depend on a set of input messages sent by different activities. This is usually called a join, because it describes the meeting point of the incoming messages. In TF-ORM it is described by a single state transition rule with a set of input messages (see the example in the appendix). Two situations can occur for a set of possible input messages: (1) all messages must arrive for the transition to take place (total join), or (2) only a subset of the messages needs to arrive (partial join). The second form is represented in TF-ORM by specifying, before the list of input messages, the number of messages that must arrive.
Conversely, an activity may activate the execution of several other activities, which is known as a fork. This is described by a state transition rule that sends several messages (see the appendix example).
All of these forms may be constrained by transition conditions, which can test current and past values of attributes and states to decide whether a transition is performed.
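The difference between a total and a partial join, and the fork in the opposite direction, can be sketched in a few lines of Python. The class Join, the function fork, and the message names flight_ok, hotel_ok and car_ok are hypothetical names introduced only for this illustration, loosely following the travel agency example; the sketch assumes a join fires exactly once.

```python
class Join:
    """Fires once when at least `required` of the `expected` messages have arrived."""

    def __init__(self, expected, required=None):
        self.expected = set(expected)
        # required=None means a total join: every expected message must arrive.
        self.required = len(self.expected) if required is None else required
        self.arrived = set()
        self.fired = False

    def receive(self, message):
        """Record a message; return True exactly once, when the join fires."""
        if self.fired or message not in self.expected:
            return False
        self.arrived.add(message)
        if len(self.arrived) >= self.required:
            self.fired = True
        return self.fired


def fork(targets, message):
    """A fork: one transition sends an activation message to several activities."""
    return [(target, message) for target in targets]


bookings = ("flight_ok", "hotel_ok", "car_ok")

total = Join(bookings)
total_results = [total.receive(m) for m in bookings]

partial = Join(bookings, required=2)
partial_results = [partial.receive(m) for m in bookings]

activations = fork(["flight", "hotel", "car"], "add_role")
```

With these inputs the total join fires on the third message and the partial join on the second; the third message reaching the partial join is ignored because the join has already fired.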

3. Representing exceptions in TF-ORM
With the TF-ORM formalism, exceptions can be represented in the workflow model together with the actions to be executed in response. The following sections identify some possible exceptions and explain how they are represented in TF-ORM.

3.1. Unavailability of the agent responsible for an activity
Each workflow activity has a responsible agent who controls execution and resolves possible problems. When TF-ORM is used for workflow modeling, a responsible agent must be defined for each activity. Only the agent role is specified in the model; the specific person is determined when the activity instance is created.
When the current responsible agent becomes unavailable, an exception occurs and responsibility for the activity must be transferred to someone else. To prevent this exception, the TF-ORM model allows a group of agents who may take responsibility for an activity to be specified. These agents are also identified when the activity is instantiated.
Even when several agents may be assigned responsibility for an activity, the exception still occurs if the entire group is unavailable. To cover this case, the TF-ORM model requires an agent responsible for the whole process to be defined, and this agent decides who becomes the new agent responsible for the activity.

3.2. Changing the activity flow during execution
Changing the sequence of activities during workflow execution can cause serious problems. To avoid them, TF-ORM requires an agent responsible for each process to be defined. Only this agent can change the execution sequence.
To make this intervention possible, every activity has two predefined input messages that can be sent by the agent responsible for the process in which the activity is defined. These messages can be received at any time, regardless of the current state of the activity. The agent responsible for the process sends them to all activities affected by the planned change. The first message changes the state of the activity; the new state, chosen by the agent, is passed as a parameter. The second message concerns property values: if some property values need to be adjusted, this message is sent with the names and new values of those properties.
As an example, consider the travel agency example introduced earlier. Assume that the responsible agent wants to suspend an ongoing hotel booking activity and cancel any reservation already made. The agent sends the following messages to this activity:
MSG(agent_interfer(pai_reserve, suspended)),
MSG(values_interfer(partition_name, null)),
MSG(values_interfer(pai_reservation, NOK))
The effect of these messages is as if a rule like the following were defined for every state of the hotel booking role (shown here for the reserving state):
R1: ST(reserving),
    MSG(agent_interfer(pai_reserve, suspended)),
    MSG(values_interfer(partition_name, null)),
    MSG(values_interfer(pai_reservation, NOK))
    => ST(suspended)
It is important to remember that the agent who makes the change is responsible for the altered sequence of activity execution, and it is his task to adapt all related processes and activities to the new execution order. The modifications may need to account for attribute values changed by previously executed activities.

3.3. Waiting for unavailable resources
An exception can occur when an activity is suspended waiting for a resource that is unavailable. In TF-ORM, the availability of a resource is represented as a message sent to the activity by the corresponding resource class.
One way to avoid this problem is to model the interaction between the process and the resource class as a request followed by a response in which the requested resource either confirms or denies availability. Both responses are taken into account in the transition rules of the activity, and an alternative is provided for the case in which the resource is unavailable. If there is no alternative, the activity sends a predefined message to the agent responsible for the activity (represented as a message sent to the same role), and the request is aborted. This message has the following form:
MSG(resource_interfer) to itself

3.4. Loops
Loops are one of the most important problems that can occur during workflow execution. A loop occurs when an activity A1 activates another activity A2, and A2 directly or indirectly activates A1 again. This situation can be avoided only by rigorous analysis of the workflow that considers all possible evolutions.
However, to reduce the likelihood of loops, the TF-ORM model provides a predefined attribute, defined in the base role of every class (a role that exists in all classes and holds the global attributes of all objects of the class). This attribute is called cycle_alert. Its purpose is to monitor for loops, detect possible cyclic behavior, and alert the corresponding agent. The domain of this attribute is a set of pairs consisting of a role name and the number of instances of that role (activity). The agent responsible for the process monitors this attribute and can discover possible loops in this way.
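The bookkeeping behind cycle_alert can be sketched in Python as a counter of instances per role. The text does not say how the responsible agent decides that a count indicates a loop, so the threshold used here is purely an assumption of this sketch, as are the names CycleAlert, add_role and suspicious.

```python
from collections import Counter


class CycleAlert:
    """Tracks (role name, number of instances) pairs, as cycle_alert does."""

    def __init__(self, threshold):
        # threshold is an assumed cut-off; the model leaves the judgement
        # to the agent responsible for the process.
        self.threshold = threshold
        self.instances = Counter()   # role name -> number of instances created

    def add_role(self, role_name):
        """Record a new instance of a role; return True if it looks like a loop."""
        self.instances[role_name] += 1
        return self.instances[role_name] > self.threshold

    def suspicious(self):
        """Roles whose instance count exceeds the threshold, sorted by name."""
        return sorted(r for r, n in self.instances.items() if n > self.threshold)


alert = CycleAlert(threshold=2)
warnings = [alert.add_role(r) for r in ("A1", "A2", "A1", "A2", "A1")]
```

In this run only the third instance of A1 raises a warning, which is the point at which the responsible agent would be asked to inspect the workflow.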

3.5. Deadlock
Because transition rules can be constrained by temporal logic conditions, it is particularly difficult to write rules that prevent deadlocks. What can be done is to associate a time limit with a state in which deadlock is possible. When the limit expires, another rule is executed to break the deadlock.
To avoid deadlock exceptions, the TF-ORM model is extended with the definition of a time class. This class is predefined in all models and measures elapsed time. An instance of the class is created with the maximum waiting time as a parameter. The time class behaves as follows: once an instance is created, it starts counting elapsed time; when the time given as parameter is reached, an interrupt message is sent to the requesting class.
The complete procedure is as follows. When an activity enters a state in which deadlock is possible, it sends a message to the time class with the maximum time the activity will wait for another event (one that would cause a new state transition). The message format is as follows:
MSG(timing(<class.role>, <state>,
    <waiting time>, <temporal granularity>))
The timer then starts counting. When the waiting time is reached, an interrupt message is sent to the class/role that sent the request (the activity). If the role is still in the same state, a deadlock is assumed and a transition to a recovery state is performed. If the role is in another state, the interrupt message is ignored and the activity is not affected. An example of a set of transition rules for a possible deadlock state dl_state is:
ST(x), MSG(m_in) => MSG(m_out),
    MSG(timing(c.r, dl_state, 5, min)),
    ST(dl_state);
ST(dl_state), MSG(waited_msg) => ST(y);
ST(dl_state), MSG(interrupt) =>
    ST(recovery_state);
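The timeout mechanism can be tried out with a small Python simulation. It uses a simulated clock instead of real time, and the names TimeClass, Activity, timing, advance and on_interrupt are hypothetical; only the state names dl_state, y and recovery_state come from the rules above. The sketch assumes that an interrupt affects an activity only if it is still in the state that was being watched.

```python
class Activity:
    def __init__(self, name, state):
        self.name = name
        self.state = state

    def on_interrupt(self, watched_state):
        """Rule: ST(dl_state), MSG(interrupt) => ST(recovery_state)."""
        if self.state == watched_state:
            self.state = "recovery_state"
            return True
        return False  # the role already moved on: the interrupt has no effect


class TimeClass:
    """Simulated clock that delivers interrupts after a maximum waiting time."""

    def __init__(self):
        self.now = 0
        self.pending = []  # (deadline, activity, watched_state)

    def timing(self, activity, state, waiting_time):
        """Register a request: interrupt `activity` after `waiting_time` units."""
        self.pending.append((self.now + waiting_time, activity, state))

    def advance(self, elapsed):
        """Move the clock forward and deliver every interrupt that is due."""
        self.now += elapsed
        due = [p for p in self.pending if p[0] <= self.now]
        self.pending = [p for p in self.pending if p[0] > self.now]
        return [act.name for _, act, state in due if act.on_interrupt(state)]


clock = TimeClass()
stuck = Activity("a1", "dl_state")
done = Activity("a2", "dl_state")
clock.timing(stuck, "dl_state", 5)
clock.timing(done, "dl_state", 5)

done.state = "y"                 # waited_msg arrived in time for a2
early = clock.advance(4)         # nothing is due yet
recovered = clock.advance(1)     # the 5-unit limit is reached
```

Only a1 is moved to recovery_state; a2 had already left dl_state, so its interrupt is discarded.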

3.6. Waiting for confirmation
A message sent to another activity may fail to be received. When receipt of the message is required, this causes problems for the whole workflow. In a TF-ORM workflow model, a message is sent without any actual acknowledgement of receipt, and a message is received only when the receiver is in a specific state and the transition condition, if any, is satisfied. If either requirement (state or transition condition) is not met, the message is not received.
To ensure that messages are received, the TF-ORM model prescribes the following:
The role that sends the message moves to a waiting state and remains there for a limited period (using the time class) until the acknowledgement message is received;
If the message is received correctly, the receiver sends an acknowledgement to the sender and then proceeds with its own state transition;
If the message is not received, the maximum waiting time expires and the role moves from the waiting state to a state in which the problem can be handled, perhaps by sending a different message.
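The three steps above amount to an acknowledgement protocol with a timeout. The following Python sketch simulates it with discrete time steps; the class names Sender and Receiver and the state names idle, waiting, confirmed, problem and processing are assumptions made for this illustration, and the acknowledgement is delivered immediately rather than as a separate message.

```python
class Receiver:
    def __init__(self, state, accepts_in):
        self.state = state
        self.accepts_in = accepts_in   # the only state in which messages are received

    def deliver(self, message):
        """Return an acknowledgement only if the message can actually be received."""
        if self.state != self.accepts_in:
            return None
        self.state = "processing"      # the receiver proceeds with its own transition
        return ("ack", message)


class Sender:
    def __init__(self, max_wait):
        self.state = "idle"
        self.max_wait = max_wait
        self.waited = 0

    def send(self, receiver, message):
        """Send a message and wait for its acknowledgement."""
        self.state = "waiting"
        self.waited = 0
        if receiver.deliver(message) is not None:
            self.state = "confirmed"

    def tick(self, elapsed=1):
        """Advance simulated time; leave the waiting state when max_wait is used up."""
        if self.state == "waiting":
            self.waited += elapsed
            if self.waited >= self.max_wait:
                self.state = "problem"


ok_sender = Sender(max_wait=3)
ok_sender.send(Receiver("ready", accepts_in="ready"), "m1")

late_sender = Sender(max_wait=3)
late_sender.send(Receiver("busy", accepts_in="ready"), "m1")
states = []
for _ in range(3):
    late_sender.tick()
    states.append(late_sender.state)
```

The first sender is confirmed at once; the second stays in the waiting state for two steps and moves to the problem-handling state on the third, where it could, for instance, send a different message.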

4. Information for failure recovery
Two aspects must be clearly identified in the failure recovery process:
Who (which agent) is responsible for recovery and makes the decisions needed to overcome the failure;
Which information is required in this process.
An agent is needed to recover from unexpected system faults, because this recovery cannot be automated. To ensure that an agent is available when a failure occurs, the TF-ORM model requires the agent responsible for the whole workflow to be defined. Every process identified must have a responsible agent, just as every activity does. When a failure occurs, the agent responsible for the activity tries to resume execution. If the failure cannot be resolved at the activity level, the agent responsible for the process takes over.
After a failure is resolved, activities are resumed from a previous execution point. This process is called rollback. To make it possible, additional information is needed. An efficient form of rollback is available when the past states of activities and processes are stored together with their temporal information (the time at which each state was valid). This is a feature of the TF-ORM model, which is a temporal model. Temporal information (transaction time and valid time) is associated with all information (states and attribute values), and all past information is stored in the temporal database of the workflow model. This enables the recovery agent to analyze the stored information, select a past moment, and delete the information recorded after that moment (based on the transaction time of each stored value), thereby restoring a previous state of a process or activity.
An alternative way of implementing system recovery is to define, in each class, an attribute that stores the activity history of each instance of the class together with the corresponding transaction time. By analyzing the evolution of these instances, the responsible agent chooses how to recover.
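Rollback based on transaction time can be sketched as follows. This is a simplified in-memory illustration in Python, not the temporal database of the model; the class TemporalStore, its methods, and the attribute name hotel_name are hypothetical, and valid time is left out so that only transaction time is tracked.

```python
class TemporalStore:
    """Keeps every recorded value with the transaction time at which it was stored."""

    def __init__(self):
        self.history = []  # (transaction_time, key, value), in insertion order

    def record(self, transaction_time, key, value):
        self.history.append((transaction_time, key, value))

    def current(self):
        """Latest value per key, according to transaction time."""
        snapshot = {}
        for _, key, value in sorted(self.history, key=lambda entry: entry[0]):
            snapshot[key] = value
        return snapshot

    def rollback(self, moment):
        """Delete everything recorded after `moment` and return the restored view."""
        self.history = [entry for entry in self.history if entry[0] <= moment]
        return self.current()


store = TemporalStore()
store.record(1, "state", "reserving")
store.record(2, "hotel_name", "Plaza")
store.record(3, "state", "confirmed")
store.record(4, "state", "failed")      # written just before the failure was detected

before = store.current()
restored = store.rollback(3)            # the recovery agent picks moment 3
```

The recovery agent selects the moment; everything stored after it is discarded, so the activity is back in the confirmed state with the attribute values it had at that time.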

5. Conclusion
Workflow modeling is a very important topic in enterprise analysis. The significant number of studies recently published in this field attests to this. Three dimensions can be identified in a workflow [18]:
(1) what is executed: the activities that make up the workflow and their execution order;
(2) who executes each activity, that is, the agents of the workflow;
(3) with what the activities are executed: the resources involved in execution, including automated tools and support.
Information about all three should be defined in the workflow model. However, most modeling tools can represent only information related to the first. This is the main difference between enterprise modeling and workflow modeling.
A workflow model should include (1) a description of all processes that make up the workflow, with the activities of each process; (2) the definition of the agents responsible for executing each identified activity; (3) the definition of temporal constraints on the execution of activities; and (4) the connections between activities and processes (for information exchange and control).
To achieve this goal, the following strategies can be used:
Create a mathematical model of the workflow;
Formalize the relationships between processes. Temporal logic is used to generate a computation tree of all possible workflow evolutions (in this tree, the children of a node are the possible states reachable from the state represented by that node);
Verify the model by some method, usually through simulated execution of the workflow.
Workflow modeling can be regarded as the first stage of the enterprise restructuring process. To understand an application fully, it is very important to analyze the exceptions that may occur during execution and the consequences of system failures. An exception (also known as a semantic failure) occurs when an activity cannot be executed according to the workflow model or fails to produce the expected results. Failures comprise system problems (program errors, device faults, and communication errors).
Workflow modeling can take different forms. We recommend using a temporal model in order to represent the synchronization of activities. In this paper, the temporal object-oriented data model TF-ORM is used as the workflow modeling tool. The focus of this article is the possibilities this approach offers for exception handling and failure recovery. It also briefly analyzes how recovery is carried out under the responsibility of the corresponding agent.
A complete TF-ORM-based environment is still under development, including a modeling tool, the mapping of TF-ORM models to commercial databases [21], and the use of the TF-ORM query language [7] through a visual interface [3] to query the database.
