Web Services

Kenneth M. Anderson <kena@cs.colorado.edu>

Lecture 02: Distributed Information Systems

Copyright Notice

Some material in this lecture is adapted from the teaching materials of the book “Web Services: Concepts, Architectures and Applications” and is thus Copyright © 2003 Gustavo Alonso, ETH Zürich and/or Copyright © 2004 Springer-Verlag Berlin Heidelberg.

All other material is Copyright © 2006 Kenneth M. Anderson

Distributed Information Systems

Web services are a form of distributed information system. Many of the problems that Web services try to solve, as well as the design constraints encountered along the way, can be understood by considering how distributed information systems evolved in the past.
    — Introduction to Chapter 1 of our Textbook

Overview: Information System Design

Layers and Tiers

A client is any user or program that wants to perform an operation on the system. Clients interact with the system through a presentation layer.
The application logic determines what the system actually does. It takes care of enforcing the business rules and establishing the business process. The application logic can take many forms: programs, constraints, workflows, etc.
The resource manager deals with the organization (storage, indexing, and retrieval) of the data necessary to support the application logic. This is typically a database but it can also be a text retrieval system or any other data management system providing querying capabilities and persistence.
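
As a minimal sketch of the three layers described above, the following Python separates presentation, application logic, and resource management into three classes. The banking example, the class names, and the "no overdrafts" rule are illustrative assumptions, not material from the textbook.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Resource management layer: stores and retrieves data (here, an in-memory dict)."""
    balances: dict = field(default_factory=dict)

    def read(self, account: str) -> int:
        return self.balances.get(account, 0)

    def write(self, account: str, amount: int) -> None:
        self.balances[account] = amount


class ApplicationLogic:
    """Application logic layer: enforces the business rules."""

    def __init__(self, resources: ResourceManager) -> None:
        self.resources = resources

    def withdraw(self, account: str, amount: int) -> int:
        balance = self.resources.read(account)
        # Hypothetical business rule: amounts must be positive, no overdrafts.
        if amount <= 0 or amount > balance:
            raise ValueError("withdrawal rejected")
        self.resources.write(account, balance - amount)
        return balance - amount


class Presentation:
    """Presentation layer: the only part of the system the client talks to."""

    def __init__(self, logic: ApplicationLogic) -> None:
        self.logic = logic

    def handle(self, account: str, amount: int) -> str:
        try:
            remaining = self.logic.withdraw(account, amount)
        except ValueError as err:
            return f"error: {err}"
        return f"ok: {remaining} remaining"


ui = Presentation(ApplicationLogic(ResourceManager({"alice": 100})))
print(ui.handle("alice", 30))   # ok: 70 remaining
print(ui.handle("alice", 500))  # error: withdrawal rejected
```

Each layer only knows about the one directly beneath it, so any of the three could in principle be moved to a different machine (a different tier) without changing the others.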

Boxes and Arrows

Each box represents a part of the system. Each arrow represents a connection between two parts of the system.
Adding boxes makes the system modular: this provides opportunities for adding distribution and parallelism. It also supports encapsulation, component-based design, reuse, etc. Adding arrows, on the other hand, adds connections that need to be maintained; more coordination is necessary. The system becomes more complex to monitor and manage.
The more boxes, the greater the number of context switches and intermediate steps to go through before one gets to data. Performance suffers considerably. System designers try to balance the flexibility of modular design with the performance demands of real applications.

There is no problem in system design that cannot be solved by adding a level of indirection. There is no performance problem that cannot be solved by removing a level of indirection.

Top Down Design

The functionality of a system is divided among several modules. Modules are typically not stand-alone components; their functionality depends on modules located in a lower layer.
Hardware is typically homogeneous and the system is designed to be distributed from the beginning.

The Process of Top Down Design

Bottom Up Design, Part 1

Bottom Up Design, Part 2

The Process of Bottom Up Design

Overview: Information System Architecture

One Tier: Monolithic

The presentation layer, application logic and resource manager are built as a monolithic entity.
Users/programs access the system through “dumb” terminals, whose display is controlled by the information system.
This was the typical architecture of mainframes, offering several advantages:
  • no forced context switches in the control flow
  • everything is centralized; managing and controlling resources is easier
  • the design can be highly optimized by blurring the separation between layers

Two Tier: Client/Server

As computers became more powerful, it was possible to move the presentation layer to the client. This has several advantages:
  • Clients are independent of each other: one can have several presentation layers depending on what each client needs to do.
  • One can take advantage of the computing power at the client machine to have more sophisticated presentation layers while also saving computer resources on the server.
  • It introduces the concept of an API (Application Program Interface): an interface for invoking the system from the outside.
  • The resource manager only sees one client: the application logic. This greatly helps with performance since there are no client connections/sessions to maintain.

Two Tier: Server API

  • Client/server systems introduced the notion of service (the client invokes a service implemented by the server)
  • Client/server systems also introduced the notion of service interface (how the client can invoke a given service)
  • Taken together, the interfaces to all the services provided by a server define the server's API
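
One way to picture the relationship between services, service interfaces, and the server's API is the sketch below. The `ServiceHost` class and the two example services are hypothetical names introduced for illustration. The "interface" of each service is approximated here by its Python call signature.

```python
import inspect
from typing import Callable, Dict


class ServiceHost:
    """Hypothetical server that exposes named services to clients."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable] = {}

    def register(self, name: str, func: Callable) -> None:
        # A service is an operation implemented by the server.
        self._services[name] = func

    def invoke(self, name: str, *args):
        # The client invokes a service through its interface (name + arguments).
        return self._services[name](*args)

    def api(self) -> Dict[str, str]:
        # The server's API: the interfaces of all its services taken together.
        return {
            name: str(inspect.signature(func))
            for name, func in sorted(self._services.items())
        }


def get_balance(account: str) -> int:
    return {"alice": 100}.get(account, 0)


def list_accounts() -> list:
    return ["alice"]


server = ServiceHost()
server.register("get_balance", get_balance)
server.register("list_accounts", list_accounts)

print(server.invoke("get_balance", "alice"))
for name, interface in server.api().items():
    print(name, interface)
```

A client written against this API needs to know only the service names and their signatures, not how the server implements them.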

Two Tier: Advantages/Disadvantages

Three Tier: Middleware

  • In a three tier system, the three layers are fully separated; they are also typically distributed
  • Middleware introduces an additional layer of business logic encompassing all underlying systems
  • By doing this, a middleware system:
    • simplifies the design of clients by reducing the number of interfaces it needs to know
    • provides transparent access to the underlying systems
    • acts as a platform for inter-system functionality and high level application logic
    • takes care of locating resources, accessing them, and gathering results
  • Middleware systems also enable the integration of systems built using other architectures
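
The role of middleware listed above can be sketched as a single entry point sitting in front of several underlying systems. The inventory and pricing systems, the `quote` operation, and all data values are assumptions made up for this example.

```python
class InventorySystem:
    """Hypothetical underlying system #1."""

    def stock(self, item: str) -> int:
        return {"widget": 12}.get(item, 0)


class PricingSystem:
    """Hypothetical underlying system #2."""

    def price(self, item: str) -> float:
        return {"widget": 2.5}.get(item, 0.0)


class Middleware:
    """Single entry point that hides the underlying systems from clients."""

    def __init__(self) -> None:
        # Locating resources: a registry maps a logical name to a backend.
        self._systems = {
            "inventory": InventorySystem(),
            "pricing": PricingSystem(),
        }

    def quote(self, item: str, quantity: int) -> dict:
        # Accessing the underlying systems ...
        available = self._systems["inventory"].stock(item)
        unit_price = self._systems["pricing"].price(item)
        # ... and gathering the results into one answer for the client.
        return {
            "item": item,
            "can_fulfil": quantity <= available,
            "total": unit_price * quantity,
        }


middleware = Middleware()
print(middleware.quote("widget", 4))
```

The client sees one interface (`quote`) instead of two. The logic that combines the two systems lives in the middleware rather than in every client.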

N-Tier: Web Integration

  • N-tier architectures result from connecting several 3-tier systems to each other and/or by adding an additional layer to allow clients to access the system via the Web
  • The Web layer was initially external to the information system (a true additional layer); today, it is being incorporated into a presentation layer that resides on the server side (part of the middleware infrastructure in a three tier system, or part of the server directly in a two tier system)
  • The addition of the Web layer led to the notion of “application servers”, a term used to refer to middleware platforms supporting Web access

N-Tier Systems in the “Real World”

Overview: Communication Styles

Blocking Interactions

  • traditionally, information systems use blocking calls (the client waits while the server processes a request)
  • synchronous interaction requires both parties to be “on-line”
  • advantage: simple to understand and implement
  • disadvantages: connection overhead, higher probability of failures, failures hard to manage
  • one solution: transactions
  • another solution: non-blocking interactions
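
A blocking interaction can be illustrated with an ordinary function call. In the sketch below, the `SlowServer` class, its 0.05 second processing delay, and the upper-casing "service" are placeholders chosen for the example. The point is only that the client makes no progress until the call returns or fails.

```python
import time


class SlowServer:
    """Hypothetical server whose requests take a noticeable time to process."""

    def __init__(self, online: bool = True) -> None:
        self.online = online

    def process(self, request: str) -> str:
        if not self.online:
            # Synchronous interaction requires both parties to be on-line.
            raise ConnectionError("server is off-line")
        time.sleep(0.05)  # simulated processing time
        return request.upper()


def blocking_call(server: SlowServer, request: str) -> str:
    # The client does nothing else until process() returns or raises.
    return server.process(request)


start = time.monotonic()
reply = blocking_call(SlowServer(), "hello")
elapsed = time.monotonic() - start
print(reply, f"(client waited {elapsed:.2f}s)")
```

If the server is constructed with `online=False`, the same call raises `ConnectionError`, and the client has to decide on its own how to recover.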

Non-Blocking Interactions

  • with non-blocking interactions, a call to the server returns immediately
  • client can continue to run and occasionally check with server to see if a response is ready
  • typically implemented via message queues
  • disadvantage: adds complexity to client architecture
  • advantages: more modular, more distribution modes (multicast, replication, message coalescing, etc.), more natural way to implement complex interactions between heterogeneous systems
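
A non-blocking interaction built on message queues might look like the following sketch, which uses Python's standard `queue` and `threading` modules to stand in for a real messaging system. The queue names, the `send`/`poll` helpers, and the `None` shutdown sentinel are assumptions made for this example.

```python
import queue
import threading

request_queue = queue.Queue()
reply_queue = queue.Queue()


def server_loop() -> None:
    """Hypothetical server: takes requests off one queue, puts replies on another."""
    while True:
        request = request_queue.get()
        if request is None:  # sentinel value: shut the server down
            break
        reply_queue.put(request.upper())


def send(request: str) -> None:
    # Non-blocking from the client's point of view: enqueue and return at once.
    request_queue.put(request)


def poll():
    # Check whether a response is ready without waiting for one.
    try:
        return reply_queue.get_nowait()
    except queue.Empty:
        return None


worker = threading.Thread(target=server_loop)
worker.start()

send("hello")
other_work_done = 0
reply = poll()
while reply is None:
    other_work_done += 1  # the client keeps running in the meantime
    reply = poll()

request_queue.put(None)  # ask the server to stop
worker.join()
print(reply)
```

The extra moving parts (two queues, a polling loop, a shutdown protocol) are the added client complexity noted above. In exchange, the client and server no longer have to be active at the same instant.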

Next Week