Cost Estimation in CPAM, an Access Protocol for Remote and Autonomous Services

Position Paper for the Workshop on Cross-Organisational Workflow Management and Co-ordination, WACC'99, San Francisco, Feb. 22, 1999

Dorothea Beringer and Gio Wiederhold
Computer Science Department, Stanford University
{beringer,gio}@db.stanford.edu
http://www-db.stanford.edu/CHAIMS/

Introduction

In the CHAIMS project we have developed an access protocol for distributed services that is specifically targeted at composing services of autonomous modules. We refer to that protocol as CPAM (CHAIMS Protocol for Autonomous Megamodules). Like some related protocols, e.g. SWAP [Swenson:98], it allows the asynchronous invocation of remote services. However, CPAM offers more than invocation primitives.

Because we assume remote services to be autonomous, and assume that the programmer does not have reliable information about their performance, CPAM allows the client to get performance estimates before invocations take place, and to check the status of an ongoing execution. We assume that the services we compose by using CPAM are large, remote, and autonomous. Remote services can be offered by servers within the same organization, or by servers of other organizations. The location of the service is unimportant as long as the servers and the client are connected by some distribution system on which CPAM is implemented. We assume that the services offered by servers in other organizations are autonomous. Autonomy implies that the implementation, execution and maintenance of a service are under the control of the organization owning it. The only control the client has is to select another service, or to negotiate improvements, which is easier when the service is being paid for. Changes to the servers and their services can be made at any time as long as the services still comply with their interface as posted in a generally accessible repository. Autonomy also means that the person doing the composition has no control over the resources made available to the services by the organization providing them. Yet the availability of resources has a great influence on the performance of services.

Of course, we could require that performance characteristics of the services be posted along with the interface definition in a repository. Yet this approach has several weaknesses: 1) performance depends not only on resources but also on the specific input data, 2) static performance information cannot take into account dynamic fluctuations in the performance of services, and it might only give an average or an upper bound for the performance, 3) static performance information can help in deciding whether a certain service is to be used, but offers no way to monitor the execution and performance of the service at run-time. Therefore we have introduced into CPAM the capability to obtain, prior to invocation, performance estimates from the module offering the service.

We also enable the monitoring of the performance of a service during its execution. This allows the client to make educated decisions prior to and during service invocations, and opens up various novel optimization possibilities specific to the composition of large, remote, and autonomous services.

The principal characteristics of the CPAM protocol

Asynchrony: CPAM has several basic primitives for initiating remote computational tasks: SETUP for setting up the connection to, and initializing, a remote server offering the desired services; INVOKE for actually starting remote computational services; and EXTRACT for obtaining the results of those computations. An ongoing service invocation can be monitored with EXAMINE, which also allows a client to determine whether the results are ready. Services offered by different remote servers can easily be invoked in parallel. Results can be extracted either all at once or step by step. In the case of services that provide ongoing results, e.g., monitoring or simulation services, one service invocation can lead to arbitrarily many extractions of the same result parameter, each reflecting the newest monitoring or simulation results.

In order to allow composition by a simple sequential client, all the primitives in CPAM are initiated by the client and are synchronous procedure calls from the client to the modules offering the services. This has the advantage of being a simple and very generally applicable paradigm that can be easily implemented on top of various distribution systems like RMI, CORBA or DCE. Having a simple client structure also simplifies system optimization, e.g., exploiting the inherent parallelism between various services.

A client can terminate an invocation in which it is no longer interested with the TERMINATE primitive. Any results stored by the server for extraction as well as other invocation specific information will be deleted.
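The call sequence described in this section can be pictured with a toy, in-process stand-in for a remote server. Everything in the sketch is an assumption made for illustration: the class ToyMegamodule, the lower-case method names, the integer handles, and the way progress is simulated by counting polls. It is not the CHAIMS implementation, only a picture of how a simple sequential client might drive several invocations in parallel using synchronous calls.

```python
from dataclasses import dataclass
from itertools import count

DONE, NOT_DONE = "DONE", "NOT_DONE"


@dataclass
class _Invocation:
    service: str
    params: dict
    polls_left: int  # toy stand-in for the remaining work


class ToyMegamodule:
    """Hypothetical in-process stand-in for a remote CPAM server."""

    def __init__(self, polls_needed=3):
        self._polls_needed = polls_needed
        self._invocations = {}
        self._ids = count(1)
        self.connected = False

    def setup(self):
        self.connected = True

    def invoke(self, service, params):
        # Returns at once with a handle; the work itself proceeds asynchronously.
        handle = next(self._ids)
        self._invocations[handle] = _Invocation(
            service, dict(params), self._polls_needed
        )
        return handle

    def examine(self, handle):
        inv = self._invocations[handle]
        if inv.polls_left > 0:
            inv.polls_left -= 1  # simulate progress between polls
        return DONE if inv.polls_left == 0 else NOT_DONE

    def extract(self, handle, names):
        inv = self._invocations[handle]
        if inv.polls_left > 0:
            raise RuntimeError("results not ready")
        results = {"total": sum(inv.params.values())}
        return {name: results[name] for name in names}

    def terminate(self, handle):
        # The server drops stored results and invocation-specific state.
        self._invocations.pop(handle, None)


def run_in_parallel(servers, service, params):
    """Invoke one toy service on several servers, then poll until all are done."""
    for server in servers:
        server.setup()
    handles = [(server, server.invoke(service, params)) for server in servers]
    pending = list(handles)
    while pending:
        pending = [(s, h) for s, h in pending if s.examine(h) != DONE]
    results = [s.extract(h, ["total"]) for s, h in handles]
    for s, h in handles:
        s.terminate(h)
    return results
```

Because every primitive is a synchronous call that returns quickly, the client needs no threads or callbacks of its own; the parallelism lives entirely on the server side.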

Presetting of parameters: Input parameters for invocations as well as results are transmitted in CPAM as name-value lists. For each parameter, these lists contain its name as specified by the server providing the service in a generally accessible repository. For a simple data element, the parameter value is a triplet containing the type, descriptive information, and the actual value. Complex data elements are hierarchies of triplets, with each node having type and descriptive information as well as either the actual value or another complex data element.
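One possible in-memory picture of such a name-value list is sketched below. The class name Element, its field names, the parameter names, and the choice of a Python list for the children of a complex element are all assumptions for illustration; the text above does not prescribe a concrete encoding.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One node: type, descriptive information, and a value or child nodes."""
    type: str
    description: str
    value: object  # a simple value, or a list of child Element nodes


def flatten(name, element):
    """Yield (path, type, value) for every simple element below a parameter."""
    if isinstance(element.value, list):
        for index, child in enumerate(element.value):
            yield from flatten(f"{name}[{index}]", child)
    else:
        yield name, element.type, element.value


# A name-value list maps repository parameter names to elements
# (the parameter names here are hypothetical).
params = {
    "origin": Element("string", "departure city", "A"),
    "shipment": Element("record", "goods to move", [
        Element("float", "weight in kg", 120.5),
        Element("integer", "number of crates", 4),
    ]),
}

flat = [row for name, elem in params.items() for row in flatten(name, elem)]
```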

In order to avoid data flow redundancy, INVOKE only needs those input parameters that differ from the default values provided by the service. CPAM has the primitives SETPARAM and GETPARAM that allow a client to preset parameters. Parameters that remain the same for several service invocations by the same client have to be set only once.
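The effect of presetting can be sketched as a small per-client parameter store. The class PresetParams, its method names, the parameter names, and in particular the precedence order (service default, then preset value, then value passed with INVOKE) and the behaviour of getparam are assumptions made for this sketch rather than statements about the protocol definition.

```python
class PresetParams:
    """Toy per-client parameter store kept by a service (hypothetical)."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._preset = {}

    def setparam(self, **values):
        unknown = set(values) - set(self._defaults)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        self._preset.update(values)

    def getparam(self, name):
        # Assumed behaviour: report the preset value, else the service default.
        return self._preset.get(name, self._defaults[name])

    def effective(self, **invoke_args):
        # Precedence assumed here: default < preset < value passed to INVOKE.
        return {**self._defaults, **self._preset, **invoke_args}


store = PresetParams({"currency": "USD", "max_stops": 2, "destination": None})
store.setparam(currency="CHF")          # set once, reused by later invocations
first = store.effective(destination="B")
second = store.effective(destination="C", max_stops=0)
```

Only the values that differ from the defaults travel with each call, which is the redundancy the presetting primitives are meant to avoid.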

Partial result extraction: A client can extract a subset of the results of a service, only including the elements it needs. CPAM also allows progressive extraction: the client can repeatedly extract more accurate values of the same result parameter. Incremental extraction of results is used if the service makes a result available as soon as its computation is completed, even before the computation of the next result is done.

Invocation monitoring: The EXAMINE primitive allows a client to monitor the progress of an invocation. It returns the status of the invocation (e.g. DONE, NOT_DONE), as well as a progress estimate (e.g. 30%). While the first return parameter, the invocation status, is well defined in CPAM, the interpretation of the second parameter, the progress, is service specific. It can be a quantitative measure, denoting the progress of work in time or data volume. In the case of simulation services it can express the quality of the current results.

Cost estimation: EXAMINE only allows a client to monitor an invocation and get progress estimates after a service has been invoked. Yet in many cases, especially for optimization and invocation scheduling, a client would like to get various cost estimates prior to invocation. This is done with the ESTIMATE primitive, which for a specific service and a specific set of preset parameters returns estimates of the execution time of the service, the fee to be paid for the service, and the data volume of the results to be expected.
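The three quantities returned by ESTIMATE can be pictured as a small record that a client compares across candidate services. The record type Estimate, its field names and units, the service names, and the numbers are all hypothetical; the sketch only shows how a client might pick a service once it holds such estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    seconds: float      # estimated execution time
    fee: float          # estimated fee, in some agreed currency
    result_bytes: int   # estimated data volume of the results


def cheapest(estimates, objective):
    """Return the name of the service with the lowest value for one objective."""
    return min(estimates, key=lambda name: getattr(estimates[name], objective))


# Hypothetical answers from two services that produce equivalent results.
estimates = {
    "air_route": Estimate(seconds=40.0, fee=12.0, result_bytes=20_000),
    "rail_route": Estimate(seconds=300.0, fee=3.0, result_bytes=5_000),
}

fastest = cheapest(estimates, "seconds")
least_fee = cheapest(estimates, "fee")
```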

For more information about CPAM see [Melloul:99], or the description of the various CHAIMS components on our web pages (http://www-db.stanford.edu/CHAIMS/).

Optimization with ESTIMATE and EXAMINE

When composing services, optimizing the composition becomes an important issue. The objective is to minimize overall execution time, fees and resources, yet not by optimizing service execution times on the server side but by optimizing the composition of the services on the client side. ESTIMATE and EXAMINE enable sophisticated monitoring and optimization of service invocations. These optimization techniques are model free, i.e. they do not rely on a static cost model of services. Instead, they get the necessary information at run-time directly from the server providing the service. This is especially important as we assume the servers to be autonomous. The client has no influence on or knowledge about the resources made available to the various services. Therefore, having pre-invocation estimates and examination of ongoing processes at run-time is the only way a client can get accurate performance and cost information.

Pre-invocation estimates (ESTIMATE primitive) can be used for various optimization objectives:

Choosing services: In case there are several options for choosing and composing services that all lead to the desired results, a client can choose those services that come at lowest cost. Depending on the situation, lowest cost can mean lowest time, lowest fees or lowest data volume (which of course can translate again into time and fee). The decision about which services to choose can be made dynamically at run-time, depending on the actual availability of resources and the actual requirements.
Scheduling of invocations: The asynchronous nature of the CPAM protocol supports the parallel invocation of several services. A service can be invoked as soon as all input data needed by that service is ready. However, even if all input data is ready, it may not yet be clear whether the results of the service will be needed at all. Delaying the invocation means that overall execution time is lost if the results turn out to be needed, because potential parallelism has not been exploited. Invoking the service as soon as possible carries the potential penalty of wasted server time, and thus of wasted fees, if the results are ultimately not needed. This is not an important issue for short services, but as we assume that most of the services used with CPAM are large and time-consuming, this optimization issue cannot be neglected. Having pre-invocation estimates for the questionable service, as well as for the services that determine whether its results will be needed, allows the client to calculate the minimal overall cost and to make an educated decision.
Scheduling of extractions: Knowing the estimated execution time of a specific invocation allows the client to postpone monitoring the invocation with EXAMINE until the results could be ready.
Pre-scheduling of an invocation plan: Getting pre-invocation estimates of all services needed for a specific task before starting any invocations allows the client to draw up a sophisticated plan of the order of invocations. The invocation schedule can then be refined during the execution of the plan, taking into account new estimates as well as the actual availability of input data.
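The trade-off in scheduling a service whose results may not be needed can be made concrete with a small expected-cost calculation. The model below is an assumption, not part of CPAM: it supposes that the client can attach a probability to the results being needed, that time can be converted into cost at a fixed rate, and that an early invocation incurs the full fee even if its results are discarded. The time and fee inputs stand for values obtained from ESTIMATE.

```python
def expected_costs(p_needed, t_decider, t_service, fee_service, cost_per_second):
    """Return (eager, lazy) expected cost for a possibly unneeded service.

    eager: invoke in parallel with the service that decides whether the
           results are needed; the fee is always paid (assumption).
    lazy:  invoke only after the deciding service has finished.
    """
    eager_time = p_needed * max(t_decider, t_service) + (1 - p_needed) * t_decider
    lazy_time = p_needed * (t_decider + t_service) + (1 - p_needed) * t_decider
    eager = eager_time * cost_per_second + fee_service
    lazy = lazy_time * cost_per_second + p_needed * fee_service
    return eager, lazy


def choose_policy(p_needed, t_decider, t_service, fee_service, cost_per_second):
    eager, lazy = expected_costs(
        p_needed, t_decider, t_service, fee_service, cost_per_second
    )
    return "eager" if eager <= lazy else "lazy"
```

With a deciding service estimated at 100 seconds, a questionable service estimated at 60 seconds and a fee of 10, and time valued at 0.1 per second, this model favours waiting when the results are needed half of the time and favours early invocation when they are needed nine times out of ten.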

Further optimization possibilities are offered by the EXAMINE primitive:

Aborting invocations and rescheduling: Though a pre-invocation estimate should give quite an accurate prognosis of the time needed by a service, it can happen that the actual invocation does not live up to expectations and its progress is too slow. Alternatively, a client might want to invoke two services providing the same results in parallel in order to choose the faster one after a certain time. In both cases, EXAMINE allows the client to get the necessary progress information. Based on this information, invocations that are too slow can be aborted with TERMINATE, and alternative plans for achieving the desired tasks can be determined.
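A client-side abort rule of this kind might look like the following sketch. It assumes that the service-specific progress value can be read as the fraction of work completed and that work proceeds at a roughly constant rate; neither assumption is guaranteed by CPAM, and the function names and the tolerance factor are hypothetical.

```python
def projected_total_seconds(elapsed, progress):
    """Extrapolate total run time from elapsed time and fractional progress."""
    if not 0 < progress <= 1:
        return None  # no usable progress information yet
    return elapsed / progress


def should_terminate(elapsed, progress, estimated, tolerance=1.5):
    """Decide whether an invocation has fallen too far behind its estimate.

    elapsed:   seconds since INVOKE
    progress:  fraction in [0, 1] as reported by EXAMINE (assumed meaning)
    estimated: execution time in seconds as reported earlier by ESTIMATE
    """
    projected = projected_total_seconds(elapsed, progress)
    if projected is None:
        # Without progress data, fall back to comparing elapsed time only.
        return elapsed > tolerance * estimated
    return projected > tolerance * estimated
```

A client that receives True from such a rule would issue TERMINATE and fall back to an alternative service or plan.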

The CPAM protocol is intentionally kept simple. Its strength lies in providing explicit support for cost estimation and invocation scheduling without relying on external information sources like static cost models of services. We believe that these issues will become more important as the composition of large and autonomous services of different nature (computation, workflow, monitoring) emerges.

Implementing CPAM

CPAM has been developed for the composition of large, autonomous services, with special attention to optimization issues and to the constraints that the autonomy of the services places on optimization. The services we consider are mainly computational services, with execution times ranging from seconds to hours, or even days. Services can also be ongoing processes, to which a client connects for as long as it is interested in their results, and from which it disconnects when it is no longer interested. CPAM makes no assumption concerning how modules provide services. Though we assume that in most cases computations are carried out by some software modules, results could also be provided by hardware modules, or even by humans. The software interface specified for services does not put any restrictions on the implementation of the service.

So far we have used the CPAM protocol in the CHAIMS project, where we implemented it on top of CORBA and RMI. We also provide wrapper templates for wrapping legacy code into CPAM compatible modules. These wrapper templates take care of handling several concurrent invocations, presetting parameters, transforming parameter values, and dispatching method calls to the legacy code. For the cases where a legacy module does not provide any pre-invocation estimates, future versions of the wrappers will also provide estimates based on the history of previous invocations. The repository, accessible either as a simple text file or via a graphical browser, contains the names of available services and their parameters together with additional information. The CHAIMS environment ([Beringer:98], [Perrochon:97]) furthermore contains a compiler for the language CLAM (Composition Language for Autonomous Megamodules) [Sample:99]. This compiler generates client code that uses CPAM within various distribution systems in order to access distributed, CPAM compliant modules.
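How a wrapper might derive an estimate from the history of previous invocations is not specified here; one simple possibility is a moving average over recent run times. The class HistoryEstimator, its window size, and the choice of a plain mean are assumptions for illustration and do not describe the planned CHAIMS wrappers.

```python
from collections import deque
from statistics import mean


class HistoryEstimator:
    """Toy time estimator for a wrapped legacy module (hypothetical)."""

    def __init__(self, window=20):
        # Only the most recent `window` durations are kept.
        self._durations = deque(maxlen=window)

    def record(self, seconds):
        """Store the measured duration of a finished invocation."""
        self._durations.append(float(seconds))

    def estimate(self):
        """Return the mean of recent durations, or None with no history yet."""
        return mean(self._durations) if self._durations else None
```

Limiting the history to a recent window lets the estimate follow changes in the resources the provider makes available, at the price of more noise.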

Due to the ESTIMATE and EXAMINE primitives, various optimization techniques become feasible. It is possible to hand-code clients that optimize on the basis of these primitives, either directly in C++ or Java, or in a higher-level language like the composition language CLAM. Yet this is a very tedious task. We are therefore investigating automatic optimization and scheduling techniques. Based on given preferences and constraints, these optimization techniques find optimal services and invocation schedules. No pre-existing cost model is used. Optimization can be done at compile time as well as at run-time, based on the ESTIMATE and EXAMINE primitives. Run-time estimates take into account the influence of actual input parameters as well as the availability of resources. Besides avoiding additional dependencies and information flows between the highly independent providers of autonomous services and the clients using these services, this also has the advantage of being accurate even in fast-changing environments.

Our current demonstration example for CPAM comes from the domain of logistics. Several information and reservation services are used to determine the best way of transportation from a city A to a city B. Yet the simple CPAM protocol is not limited to software services, and there are no constraints concerning the maximum execution time of a service. Therefore, we also plan to investigate the applicability of the CPAM protocol to workflow management. We believe that especially the usage of pre-invocation estimates could be of high interest in the domain of workflow management.

Conclusions

While our research is still in progress, we are convinced that the composition of complex software, as seen in workflow systems as well as in logistics, can benefit from high-level language primitives and control that include cost estimation and progress monitoring. Based on these primitives, optimization that exploits the inherent parallelism of distributed computations becomes possible and can even be automated.

References

[Beringer:98] Beringer, Tornabene, Jain, Wiederhold: A Language and System for Composing Autonomous, Heterogeneous and Distributed Megamodules; DEXA International Workshop on Large-Scale Software Composition, Vienna, Austria, August 28, 1998

[Melloul:99] L. Melloul, D. Beringer, N. Sample, G. Wiederhold: CPAM, A Protocol for Software Composition; submitted

[Perrochon:97] Perrochon, Wiederhold, Burback: A Compiler for Composition: CHAIMS; Fifth International Symposium on Assessment of Software Tools and Technologies (SAST'97), Pittsburgh, June 3-5, 1997

[Sample:99] N. Sample, D. Beringer, L. Melloul, G. Wiederhold: CLAM: Composition Language for Autonomous Megamodules; submitted

[Swenson:98] K. Swenson: Simple Workflow Access Protocol (SWAP); Internet Draft, http://www.ietf.org/internet-drafts/draft-swenson-swap-prot-00.txt