Towards Distributed Workflow Process Management
On leave from University of Klagenfurt (firstname.lastname@example.org)
Database Research Department (email@example.com)
AT&T Labs Research
180 Park Avenue
Florham Park, NJ 07932
Today, companies are driven by the rate at which technologies change and move towards global enterprises and virtual organizations in order to stay competitive. In global and virtual enterprises, business processes consist of multiple sub-processes that may span multiple time zones, organizational boundaries, and legal domains. Current workflow technology does not provide the necessary functionality to model, implement, and manage such processes due to its mostly centralized, coupled architecture and limited process management capabilities. In this paper, we present our on-going efforts to address some of these shortcomings. In particular, we outline an event-based workflow infrastructure that supports distributed, heterogeneous, and dynamically changing environments. In addition, we present time-related constructs for addressing the time aspects of processes in such environments.
Today, we experience an ever-increasing number of corporate mergers, acquisitions, and strategic alliances. In this environment, parts of the infrastructures already employed by the member companies and existing business processes have to be integrated. However, integration can be a herculean task due to the heterogeneity of the information and communication systems involved and the fact that the organizational structure of the resulting entity may not follow the traditional, strictly hierarchical delegation model. The latter implies that a decoupled infrastructure, which preserves autonomy and supports dynamic changes, should be employed. Such a decoupled infrastructure would also benefit individual organizations that attempt to combine several of their consumer and business services into a single, bundled offer. Typically, these services are developed and maintained by different business units within the same organization and, usually, the infrastructures employed by these business units differ considerably.
Traditionally, advances in Internet and distributed middleware technologies were not the driving factors in the design of workflow systems. Rather, a centralized, tightly coupled architecture was employed, together with proprietary components and tools. Furthermore, the majority of existing workflow products support only processes that span a single organization. The process of service provisioning for new customers in a telecommunications company is a typical example, where the representative who interacts with the customer (consumer or business) is the exclusive process initiator. Today, these shortcomings have been identified, and advances in distributed middleware (e.g., OMG's CORBA) and Internet standards are influencing workflow technology. Nevertheless, several areas still need to be addressed in order to support dynamic, compound, and distributed processes that may span multiple organizational domains. Among the most important of these areas are information exchange, process modeling, and process management.
Information exchange mostly addresses business-to-business communications, and existing technologies (e.g., e-mail, Internet, groupware, and distributed message queuing and event notification) can be used for it. On the other hand, process modeling in an environment that spans multiple organizations, each having its own processes and ways of modeling them, presents a significant challenge. This is mostly due to the diversity of the workflow systems used by the member organizations, the differences in their available resources and policies, and any autonomy requirements that may exist. These differences and requirements also complicate several aspects of process management. In particular, the aspects that need to be addressed include monitoring the lifecycle of a process across multiple workflow systems; managing worklists, organizational roles, and their bindings in a dynamic way; and specifying, monitoring, and reacting to violations of both intra- and inter-organizational time constraints.
While there have been efforts by the Workflow Management Coalition (WfMC) and the OMG Business Object Domain Task Force to address the above shortcomings, these efforts have not yet fully materialized. In addition, they address only a small fraction of the issues involved. In particular, they primarily focus on the selection, instantiation, and enactment of business processes. They do not address process modeling and distributed exception handling, nor do they provide the appropriate infrastructure for addressing the time aspects of process management. Furthermore, the proposed standards offer limited interoperability support, and they address workflow systems that operate in a coupled mode, i.e., workflow engines have to know each other in order to communicate and jointly execute processes.
The contributions of this position paper include an event-based workflow infrastructure and modeling constructs for addressing the time aspects of process management. In particular, we discuss the functionality an event service should provide in order to support distributed, decoupled, and dynamic workflow executions. In addition, we briefly cover some of the additional components we believe should be part of the overall infrastructure for supporting global and virtual companies as well as e-commerce service providers. Finally, we introduce timing constructs that are required for expressing the various time dependencies that may exist between intra-organizational activities and inter-organizational sub-processes.
According to the Workflow Management Coalition (WfMC) reference model, a Workflow Management System (WFMS) employs a client-server architecture that consists of an engine, application agents, invoked applications, a process definition tool, and administration and monitoring tools. An interoperability component, which is used for selecting, instantiating, and executing remote processes, is also part of the WFMS. The process definition tool is a visual editor that is used to define the schema of workflow processes (i.e., specify the activities that constitute the workflow and the precedence relationships between them). The same schema can be used for creating multiple instances of the same process at a later time. The workflow engine and the various tools communicate with a workflow database to store and update workflow-relevant data, such as schemas, statistical information, and control information required for executing and monitoring the active process instances.
The OMG Business Object Domain Task Force has produced a workflow specification that utilizes events during workflow execution for monitoring the lifecycle of processes. Events are raised when process instances are created and terminated, when work items are created and terminated, when state changes occur, and so on. However, the specified events and the proposed architecture cannot efficiently handle distributed and dynamic workflows. While the IDL-based specification addresses interoperability at the API level, the existing OMG event and notification services [4,5] do not provide the necessary high-level constructs for handling distributed workflow executions. For example, there are no mechanisms for grouping together event producers in order to encapsulate the events that are raised within a particular sub-workflow. In addition, no time management functionality is discussed in the above proposals.
Existing WFMSs maintain audit logs that keep track of information about the status of the various system components, changes to the status of workflow processes, and various statistics about past process executions. This information can be used to generate events and provide real-time status reports about the state of the system and the active workflow process instances, as well as various statistical measurements such as the average completion time of an activity belonging to the particular process schema. However, no standard format exists for these logs, and each vendor uses its own tools for analyzing the logs. Furthermore, support for time management in existing WFMSs is limited to process simulation (to identify process bottlenecks, analyze execution duration of activities, etc.), assignment of activity deadlines, and triggering of process-specific exception handling activities, referred to as escalations, when deadlines are missed at run-time.
In this section, we outline the distributed workflow infrastructure we are currently working on for addressing decoupled, distributed, and dynamic business processes. Figure 1 shows the high-level architecture, with an event-notification service as its key component. In addition, we assume the availability of a distributed storage component that is used for exchanging process-specific data as well as workflow-specific data. Such a distributed storage component is based on Internet technology (similar to  and the WFMS Panta Rhei). In particular, we assume that XML is used for workflow-specific data. For process-specific data, XML or HTML can be used. Furthermore, we assume that a security mechanism is employed to ensure authenticated and secure access to the various system components. Finally, a distributed LDAP-based directory is employed for storing role information together with process information. This directory provides naming and lookup facilities.
Figure 1: Decoupled, distributed workflow architecture
By using an event notification infrastructure that provides decoupled notifications of events (i.e., event suppliers need not be aware of the event consumers, and event consumers need not be aware of the event suppliers), several benefits are realized: member organizations retain their autonomy, suppliers and consumers can be added or removed dynamically, and communication is asynchronous.
An important feature of an event notification service for supporting distributed workflows is the ability to support multiple event domains. Domains may correspond to different organizations, departments within organizations, administrative domains, and so on. This functionality is required for distinguishing the various events that are raised within each domain while maintaining the autonomy of the domain. Here, the goal is to avoid imposing a new naming scheme on all the involved members of a global or virtual organization. Our event-notification service, READY, provides such support and, in addition, uses domain routers for connecting event domains in a hierarchical or peer-to-peer topology. Domain routers can be used for encapsulating event mappings between domains, enforcing access restrictions, and regulating the flow of events between domains.
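To make the role of domain routers concrete, the following is a minimal sketch (not the actual READY implementation; all class, type, and field names are illustrative) of a router that translates event type names between two domains and drops events the target domain is not allowed to receive:

```python
# Hypothetical sketch of a domain router between two event domains.
# It encapsulates an event-type mapping and a simple access restriction.

class DomainRouter:
    def __init__(self, type_map, blocked_types=None):
        self.type_map = type_map                  # source type -> target type
        self.blocked = set(blocked_types or [])   # types never forwarded

    def route(self, event):
        """Return the event as seen by the target domain, or None if blocked."""
        etype = event["type"]
        if etype in self.blocked:
            return None                           # access restriction enforced
        mapped = dict(event)
        mapped["type"] = self.type_map.get(etype, etype)
        return mapped

# Domain A names its events "Order.Done"; domain B expects "WF.Process.End".
router = DomainRouter({"Order.Done": "WF.Process.End"},
                      blocked_types=["Internal.Audit"])
print(router.route({"type": "Order.Done", "id": 42}))
print(router.route({"type": "Internal.Audit"}))
```

Each domain keeps its own naming scheme; only the router needs to know the mapping, which is the point of not imposing a global naming scheme on all members.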
READY clients interact with READY using admin, supplier, and consumer sessions. Admin sessions are used for creating/destroying supplier and consumer sessions and session groups, and for other administrative operations. Supplier sessions are used to supply events to the service, and suppliers must declare the kinds of events that they will supply. Consumer sessions are used to register specifications, which describe both event patterns and actions to take when matches are found for these patterns. Any legal expression from the filter grammar specified for the OMG Notification Service can be used to describe event patterns. The most common action, notify, causes the consumer session to deliver a notification with the matched event(s). Notification behavior is controlled by session properties, including quality of service (QoS) properties that control the reliability of notification delivery and delivery properties that control the delivery order. Suppliers, consumers, and specifications can be grouped together, enabling sharing of specifications, uniform control over QoS and delivery properties, and efficient suspend/resume operations on a large number of specifications.
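The supplier/consumer interaction above can be sketched as follows. This is an illustrative approximation, not the READY API: the OMG Notification Service filter grammar is stood in for by plain Python predicates, and all names are invented for the example.

```python
# Sketch of pattern/action specifications: consumers register a pattern
# predicate plus an action; suppliers push events into the service, and
# matching specifications fire their action (here, notify = append).

class EventService:
    def __init__(self):
        self.specs = []   # list of (pattern predicate, action callback)

    def register(self, pattern, action):      # via a consumer session
        self.specs.append((pattern, action))

    def supply(self, event):                  # via a supplier session
        for pattern, action in self.specs:
            if pattern(event):
                action(event)                 # most common action: notify

received = []
svc = EventService()
# Notify on every process-start event of workflow instance 7.
svc.register(lambda e: e["type"] == "WF.Process.Start" and e["pid"] == 7,
             received.append)

svc.supply({"type": "WF.Process.Start", "pid": 7})   # matches
svc.supply({"type": "WF.Process.Start", "pid": 8})   # filtered out
print(received)
```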
Another important feature of READY is its ability to support self-describing events. A READY event type definition specifies a set of required and optional fields, where each field contains a field name, a type identifier, and a value. In particular, READY events follow the structured event format specified in OMG's Notification Service. READY types can have subtypes, and subtype declarations simply add required or optional field specifications to those of their parent types. Compound type identifiers such as WF.Process.Start can be used with the convention that the structure of the type corresponds to the type hierarchy (this is similar to the publish-subscribe products that offer subject-based matching functionality). In our proposed workflow infrastructure, event types are registered with the LDAP directory. XML is used for describing the structure and semantics of event types.
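Under the convention that compound type identifiers mirror the type hierarchy, subject-based matching reduces to a prefix test, as in this small sketch (the matching rule is our reading of the convention, not READY's exact semantics):

```python
# Subject-based matching on compound type identifiers such as
# "WF.Process.Start": subscribing to a parent type matches all of its
# subtypes, since subtypes only extend the parent's field specifications.

def matches(subscription, event_type):
    """True if event_type equals the subscription or is one of its subtypes."""
    return event_type == subscription or event_type.startswith(subscription + ".")

print(matches("WF.Process", "WF.Process.Start"))   # subtype: matches
print(matches("WF.Process", "WF.Activity.Start"))  # different branch: no match
```

Note the appended "." in the prefix test, which keeps a subscription to WF.Process from accidentally matching an unrelated type like WF.ProcessGroup.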
Event notifications can contain process-specific data as well as pointers to application-specific data. The process-specific data may correspond to results of the execution of a particular step in a workflow as well as specifications regarding the next steps that have to be followed. In addition, notifications can contain access control rights and time-to-live fields. Time-to-live fields are important in the case where the next step of a given workflow needs to be carried out within a given time period. Consequently, the appropriate agent or workflow engine should receive the notification before this deadline expires; otherwise, a time exception is raised. Finally, since event notifications may be delivered in a variety of ways (e.g., email, paging, entry in a worklist, etc.), agents and workflow engines can choose the appropriate medium for receiving them. The medium can be selected based on properties of notifications, such as time-to-live, event source, priority, and workflow process instance id, and it can be modified dynamically.
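The time-to-live behavior described above might look like the following sketch, where the notification carries an absolute expiration timestamp and a late delivery raises a time exception (field names and the exception class are our own, introduced only for illustration):

```python
# Sketch of time-to-live handling: a notification that arrives after its
# expiration deadline triggers a time exception instead of being acted on.

import time

class TimeException(Exception):
    """Raised when a notification expires before it is delivered."""

def deliver(notification, now=None):
    now = time.time() if now is None else now
    if now > notification["expires_at"]:
        raise TimeException("notification expired before delivery")
    return notification["payload"]

note = {"payload": "approve order", "expires_at": 1000.0}
print(deliver(note, now=999.0))        # delivered in time
try:
    deliver(note, now=1001.0)          # too late: time exception
except TimeException as exc:
    print(exc)
```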
Time constraints are crucial in designing and managing distributed business processes and, therefore, time management should be part of the core management functionality provided by workflow systems to control the lifecycle of processes. At build-time, workflow modelers need means to represent time-related aspects of business processes (e.g., activity durations and time constraints between activities) and check their feasibility. At run-time, process managers need pro-active mechanisms for being notified of possible time constraint violations. Workflow participants need information about urgencies of the tasks assigned to them to manage their personal work lists. If a time constraint is violated, the workflow system should be able to trigger exception handling to regain a consistent state of the workflow instance. Business process re-engineers need information about the actual time consumption of workflow executions to improve business processes. Controllers and quality managers need information about when and how long activities of a workflow instance have been performed.
Time constraints belong to two categories: structural and explicit. Structural time constraints follow implicitly from control dependencies and activity durations of a workflow schema. They arise from the fact that an activity (sub-process) can only start when its predecessor activities (sub-processes) have finished. Explicit time constraints are derived from organizational rules, laws, commitments, and so on. Examples of such constraints include: (1) an invitation for a meeting has to be mailed to the participants at least one week before the meeting; (2) after a hardware failure is reported, the service team has to be at the customer's site within 4 hours; (3) vacant positions can be announced at the first Wednesday of each month; (4) loans above USD 1M have to be approved at a regular meeting of the board of directors (i.e., such applications can be approved on dates where a regular meeting of the board of directors is scheduled).
Explicit time constraints are temporal dependencies between events or bindings of events to certain sets of calendar dates. In workflow systems, these events correspond to the start and end of processes and activities. We introduce the following explicit time constraints: lower-bound constraints, which require that at least a given duration elapses between two events; upper-bound constraints, which require that at most a given duration elapses between two events; and fixed-date constraints, which bind an event to a certain set of calendar dates (e.g., the first Wednesday of each month).
An example of a lower-bound constraint comes from chemical process control, where one reaction may be initiated only after a certain amount of time has passed since the start of another reaction. Upper-bound constraints are even more common, e.g., the requirement that a final patent filing be made within a certain time period after the preliminary filing, or time limits for responses to business letters.
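The two constraint kinds reduce to simple checks on event timestamps, sketched below with the meeting-invitation and hardware-failure examples from above (timestamps are plain numbers in arbitrary units; the function names are ours):

```python
# Lower-bound constraint: at least `delta` time units must elapse between
# the source event and the destination event.  Upper-bound: at most `delta`.

def lower_bound_ok(t_source, t_dest, delta):
    return t_dest - t_source >= delta

def upper_bound_ok(t_source, t_dest, delta):
    return t_dest - t_source <= delta

# Invitation mailed on day 0, meeting held on day 9 (at least one week):
print(lower_bound_ok(0, 9, 7))    # satisfied
# Failure reported at hour 0, service team on site at hour 5 (within 4 hours):
print(upper_bound_ok(0, 5, 4))    # violated
```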
We have already developed techniques for addressing some of these constraints at process build-time and instantiation-time, and for taking pre-emptive actions at run-time when potential time exceptions may materialize [8,9,10,12]. Currently, we are extending our techniques to check the satisfiability of these constraints by annotating the workflow graph with time information based on the CPM or PERT techniques. While preliminary checks can be performed at process build-time and when a workflow process is instantiated, we are particularly interested in run-time techniques that monitor (using our event-based infrastructure) the execution progress and changes in the state of the system (e.g., agent availability) and enforce the specified time constraints.
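A CPM/PERT-style annotation of a workflow graph can be sketched as a forward pass computing earliest start times and a backward pass computing latest start times against a deadline; an activity whose latest start precedes its earliest start signals an unsatisfiable constraint. This is a minimal illustration under the assumption that activities are listed in topological order, not our full technique:

```python
# Annotate a workflow graph with earliest and latest start times.
# activities: {name: duration}, given in topological order.
# edges: (predecessor, successor) pairs.

def annotate(activities, edges, deadline):
    preds = {a: [] for a in activities}
    succs = {a: [] for a in activities}
    for p, s in edges:
        preds[s].append(p)
        succs[p].append(s)

    earliest, latest = {}, {}
    for a in activities:                       # forward pass
        earliest[a] = max((earliest[p] + activities[p] for p in preds[a]),
                          default=0)
    for a in reversed(list(activities)):       # backward pass
        finish = min((latest[s] for s in succs[a]), default=deadline)
        latest[a] = finish - activities[a]     # latest allowed start
    return earliest, latest

acts = {"A": 2, "B": 3, "C": 1}                # durations
est, lst = annotate(acts, [("A", "B"), ("B", "C")], deadline=10)
print(est)   # earliest start times
print(lst)   # latest start times; slack of a = lst[a] - est[a]
```

With a deadline of 10 the chain A, B, C has positive slack everywhere; tightening the deadline below the total duration of 6 would drive some latest start below the corresponding earliest start, which is exactly the build-time infeasibility signal.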
In this position paper, we presented our approach to the management of workflow processes in a distributed environment. The core of our approach is an event notification service that provides efficient, asynchronous, and decoupled notification of events. The main goals of this work are to provide an infrastructure that supports distributed workflows, which may span multiple organizations, as well as workflows that are dynamic in nature, and, in addition, to provide timing constructs so that time management is facilitated in such an environment.
We would like to thank Michael Rabinovich for his valuable comments and for helping us to improve the presentation of this position paper.