Using a Global Event Model in Distributed Resource Management Algorithms

PhD Thesis Research Proposal Summary

Submitted for Microsoft Graduate Fellowship

Also Available in Postscript

Adam Rifkin, adam at xent dot com

California Institute of Technology

Computer Science 256-80, Pasadena, CA 91125

January 30, 1998

MOTIVATION

Local events are a useful programming abstraction for triggering procedure calls in applications; for example, events have been widely applied to control flow in graphical interface toolkits. By clearly separating the event producers and the listeners that deal with generated events, event-oriented programming has several key properties: encapsulation of event generation, notification, and handling; scalability of producer and listener membership; and automatic propagation of events when something interesting happens.

We believe that global events, distributed among multiple virtual machines, can aid in the development of and reasoning about distributed systems of autonomous components interacting through event-based abstractions. Developing a global event model for wide-area Internet applications presents a research opportunity to adapt event-oriented programming to the Internet's unique scalability and availability requirements.

Distributed resource management is an archetypal challenge of wide-area coordination that illuminates many of the tradeoffs involved. We model the problem with resource tokens, providers, consuming requestors, and constraints. Colored tokens network represent resources of specific types, providers of resources can join into worldwide provider pools, and clients can request resources from specific pools (based on scope and custom preferences) with constraints such as:

``Please arrange for a plane (from Burbank), hotel, and car for my trip to Seattle for March 23--28, 1998, and although I prefer to fly United, it is more important to keep the total cost under $1200.''

What makes such resource management problems interesting is that the providers, requestors, token colors, and constraints can all be dynamic, decentralized, and distributed.

Since distributed programming skills are scarce, such problems are usually solved as mission-critical custom applications. We believe global events provide a flexible framework for creating on-the-fly networks. Groups of providers should be able to join together to form instant organizations that can provide collections of resources that perhaps no single provider could. Furthermore, ``online just-in-time middlemen'' which themselves are neither explicitly resource providers nor requestors, can be useful intermediaries for discovering requestors and providers, coordinating requests and their servicing by potentially competing providers, and collating information relevant to end-parties. The key characteristic of algorithms using these grouping and middlemen abstractions is the composability of the participants in a manner that is dynamic, decentralized, and distributed.

In this thesis, we will examine event-oriented algorithms for distributed resource management, and compare our solutions to existing approaches. Since we can demonstrate the isomorphism between events, messages, and remote procedure calls, we expect our results to be widely applicable.

RESEARCH AGENDA

Local event models lack several mechanisms useful to developers when components can be globally distributed and dynamically reconfigurable: facilities for handling component and communication channel faults and for dealing with exceptional behavior such as excessive communication latencies; filtering facilities based on predicates that can include soft real-time constraints (because decentralized clockscannot guarantee global accuracy); and composition facilities for using technologies such as multicast and techniques such as event notifier aggregation. We intend to analyze and to understand the event-oriented paradigm in general, and its composing, filtering, and scaling properties in particular.

There is a tension between scalability and availability of the components participating in event-oriented algorithms. In some cases, resource consumers poll providers with announcements in some order until the request is satisfied. In other cases, resource consumers merely announce a call for service, and collaborating providers take responsibility for listening and assembling a basket of resources; or, middlemen take responsibility for listening and assembling a basket of resources, by collating resource offers from competing providers. In all cases, widening the announcement scope increases the pool of available providers, at the expense of increased wide-area message traffic.

Our overall algorithm design goal is to strike a balance between scalability and availability of the providers, requestors, and resources; to that end, we will employ techniques such as aggregating, hierarchical structuring, and scoping. We will compare and contrast our solutions with traditional methods for data consistency (for example, two-phase commit from the database and the distributed shared memory communities) and for hot-spot reduction (for example, soft state from the multicast community and caching from the Web client and randomized algorithm communities).

Several applications can benefit from our analysis, including transaction services, automatic intermediation, and metacomputing. We will design algorithms that exploit event-oriented development, and reason about them using probabilistic models and temporal logic. Then we will implement and evaluate selected algorithms, comparing them with traditional approaches based on performance metrics such as: average response time to receive resources, resilience to faults, bandwidth utilization, and real-time guarantees.

RESEARCH SCHEDULE

Identify an interesting set of applications representing the problem space
Perform a requirements analysis and define proper performance metrics
Formalize these requirements and propose a hypotheses of expected findings
Propose an experimental design, including how to collect and compare data
Solve these problems using existing tools, and develop event-oriented alternatives that are demonstrate composability (by clean architecture and component reuse) and high-confidence (by correctness proofs)
Implement selected algorithms and evaluate them based on metrics

Adam Rifkin, Caltech Infospheres Project

Last modified: Sun Jan 2 06:20:23 PDT 2000