ECE1770S Middleware Systems

Special Topic 8: Caching

prepared by Chung Kai Lee

Papers

  1. M. Franklin and M. Carey. "Client-Server Caching Revisited". (required reading)
  2. Louis Degenaro, Arun Iyengar, Ilya Lipkind and Isabelle Rouvellou. A Middleware System Which Intelligently Caches Query Results. In Proceedings of Middleware '00, IFIP/ACM International Conference on Distributed systems platforms, 2000, Pages 24-44. (additional reading)

1. Client Server Caching revisited

Summary

This paper is a follow-up on a previous paper which investigated the performance implications of inter-transaction data and lock caching. It has 2 main objectives:
  1. Re-examination of heuristics for deciding dynamically (O2PL-Dynamic) between propagating changes or invalidating remote copies of data pages in order to maintain cache consistency
  2. Study of the "Callback Locking" family of caching algorithms.
The paper also reports experiments under different workload conditions, based on its client-server caching model.

Benefits and Drawbacks of Data and Lock Caching

Benefits

Drawbacks

Caching Algorithms under Investigation

  1. Server-based 2PL
  2. Optimistic 2PL (O2PL)
  3. Callback Locking

Server-based 2PL (variant called Caching 2PL, C2PL)

Optimistic 2PL (O2PL)

Callback Locking

Client-Server Caching Model

Figure 1 (from the paper) shows the structure of the simulation model, constructed using the DeNet discrete event simulation language [Livny, 1988]. For details on how the model is simulated, please refer to the paper. The paper also describes a set of parameters that specify the resources and overheads of the system; they are omitted here.

Workloads

The above table (from the paper) summarizes the workloads used. Here is a brief description of each:

New Adaptive Heuristic (O2PL-ND)

O2PL-D uses the following simple heuristic to choose between invalidation and propagation. It propagates a new copy of a page only if both of the following conditions hold:
  1. the page is resident at the site when the consistency operation is attempted
  2. if the page was previously propagated to the site, then the page has been re-accessed since that propagation
Problem with O2PL-D: its willingness to err on the side of propagation made it perform worse than O2PL-I in most cases.
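
The two conditions above can be sketched as a small decision function. This is only an illustration of the rule as summarized here, not code from the paper; the class RemotePageState and its field names are hypothetical bookkeeping invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RemotePageState:
    """Hypothetical per-site bookkeeping for one cached page (illustrative names)."""
    resident: bool                              # page is cached at the site right now
    previously_propagated: bool = False         # an earlier update was propagated here
    reaccessed_since_propagation: bool = False  # site touched the page after that


def o2pl_d_should_propagate(state: RemotePageState) -> bool:
    """Return True to propagate the new copy, False to invalidate the remote copy."""
    # Condition 1: the page must be resident at the site.
    if not state.resident:
        return False
    # Condition 2: if it was propagated before, it must have been re-accessed since.
    if state.previously_propagated and not state.reaccessed_since_propagation:
        return False
    return True
```

Note that a resident page with no propagation history is always propagated to, which is the sense in which the heuristic errs on the propagation side.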

Solution with new heuristic O2PL-ND (for "New Dynamic"): err on the side of invalidation if a mistake is to be made:

Experiment Results for O2PL-ND

The performance of the new heuristic was examined using the HOTCOLD, FEED and UNIFORM workloads. The results are plotted in graphs and can be viewed here (taken from paper).

Observations from HOTCOLD Workload

Note: HOTCOLD workload benefits from invalidation

Observations from FEED Workload

Note: FEED workload benefits from propagation

Observations from UNIFORM Workload

Note: For UNIFORM workload, caching is not expected to provide much of a performance benefit
In summary, O2PL-ND performs as well as the static O2PL-I algorithm when invalidation is advantageous, and also retains the performance advantages of the O2PL-D heuristic in cases where propagation is advantageous.

Experiment Results for Callback Locking

The performance of the 2 Callback Locking algorithms (CB-Read and CB-All) was examined using the HOTCOLD, FEED, UNIFORM and HICON workloads. The results are plotted in graphs and can be viewed here and here for HICON (taken from paper).

Observations from HOTCOLD Workload

Observations from FEED Workload

Observations from UNIFORM Workload

Observations from HICON Workload

In summary, the CB algorithms perform similarly to O2PL-ND, though slightly worse. They perform much better than C2PL under most workloads while retaining a lower abort rate than O2PL-ND, and they are more robust than O2PL-ND in the presence of data contention.


2. A Middleware System Which Intelligently Caches Query Results

Summary

This paper describes how caching was used to improve performance in the Accessible Business Rules framework (ABR) for IBM's WebSphere. Although the work is specific to one application, the paper suggests that the techniques can be applied to caching environments other than ABR. The paper has 2 main objectives:
  1. Apply the General-Purpose Software cache (GPS cache) to improve performance of ABR.
  2. Use data update propagation (DUP) to solve the problem of keeping the cache current after database updates

Overview of the Accessible Business Rules Framework (ABR)

Benefits of externalizing business rules

Problem: significant overhead which largely resulted from querying

Solution: caching the query results

Caching used by ABR

GPS cache

Cache Invalidation using DUP

Data update propagation (DUP) determines how cached data are affected by changes to the underlying data from which their current values are derived. DUP maintains correspondences between objects, defined as entities which may be cached, and underlying data, which periodically change and affect the values of objects. A query result may depend on several attributes; these dependency relationships are represented by an object dependence graph (ODG).

There are 2 key innovations to DUP:

  1. Value-aware update policy: when attributes change, the old and new values of the attributes are considered in order to determine how to update the cache; this is implemented by annotating edges of ODG's with values based on queries
  2. DUP for ABR automatically generates ODG's from the ABR queries, rather than making the application program responsible for generating the ODG
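
As a rough illustration of the first innovation, the sketch below models an ODG whose edges carry a value annotation. It is an assumption-laden toy, not the paper's data structure: the classes Edge and ObjectDependenceGraph, the method names, and the choice to encode an annotation as a Python predicate are all invented for this example. A cached object is treated as affected only when the old or the new attribute value satisfies the annotation on the edge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Edge:
    target: str                        # id of a cached object (e.g. a query result)
    affects: Callable[[object], bool]  # annotation: is this attribute value relevant?


@dataclass
class ObjectDependenceGraph:
    # attribute name -> outgoing edges to the cached objects that depend on it
    edges: Dict[str, List[Edge]] = field(default_factory=dict)

    def add_dependency(self, attribute: str, target: str,
                       affects: Callable[[object], bool] = lambda v: True) -> None:
        """Record that `target` depends on `attribute`; unannotated edges match any value."""
        self.edges.setdefault(attribute, []).append(Edge(target, affects))

    def affected(self, attribute: str, old: object, new: object) -> Set[str]:
        """Cached objects touched by an update of `attribute` from `old` to `new`."""
        return {e.target for e in self.edges.get(attribute, [])
                if e.affects(old) or e.affects(new)}


# Hypothetical cached query "q1" that selects rows with 2 < x < 9 and also reads z.
odg = ObjectDependenceGraph()
odg.add_dependency("A.x", "q1", lambda v: 2 < v < 9)
odg.add_dependency("A.z", "q1")
```

With this graph, an update of A.x from 10 to 20 leaves q1 untouched because neither value falls in the annotated range, whereas an update from 10 to 5 marks q1 as affected.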

Constructing ODG's from Queries

The following query:
select A where A.x > 2 and A.x < 9 and A.z = B.y
would generate the following ODG (taken from paper):

Note the following:

Performance of Query Caching Techniques

The paper also describes some experiments to investigate the performance of query caching techniques. The experiments were conducted for 3 different invalidation policies:
  1. Policy I: invalidates all cached data after any update
  2. Policy II: the basic DUP algorithm without the enhancements described above; it is value-unaware, using only object dependency information without considering the values involved in the update
  3. Policy III: the value-aware policy, which uses the enhanced DUP algorithm with edge annotations on the ODG
The experiments cover several different query types and update sizes. The paper also investigates the cache hit rates for the benchmarks TPC-C (models on-line transaction processing applications) and TPC-D (models data warehousing applications). In summary:

Policy III outperforms policies I and II. Policy I is, as expected, the worst performer and serves as the lower bound on performance. Policy II serves as a baseline for measuring the benefit of the DUP enhancements applied in Policy III.
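
To make the difference between the three policies concrete, the sketch below applies each of them to one update against a tiny cache. Everything in it is hypothetical: the function invalidated, the cache layout, and the query and attribute names are made up for illustration, and it says nothing about the measured results in the paper.

```python
from typing import Callable, Dict, Set

# cached query id -> {attribute it depends on: predicate over that attribute's values}
CachedDeps = Dict[str, Dict[str, Callable[[object], bool]]]


def invalidated(policy: str, cache: CachedDeps,
                attribute: str, old: object, new: object) -> Set[str]:
    """Return the cached query ids dropped when `attribute` changes from old to new."""
    if policy == "I":
        # Policy I: any update flushes the whole cache.
        return set(cache)
    # Queries that depend on the updated attribute at all.
    dependents = {q for q, deps in cache.items() if attribute in deps}
    if policy == "II":
        # Policy II: value-unaware, uses dependency information only.
        return dependents
    if policy == "III":
        # Policy III: value-aware, also checks the old and new values.
        return {q for q in dependents
                if cache[q][attribute](old) or cache[q][attribute](new)}
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical cache with two query results.
cache: CachedDeps = {
    "cheap_items": {"item.price": lambda p: p < 10},
    "big_orders": {"order.qty": lambda q: q > 100},
}
```

For an update of item.price from 50 to 60, Policy I drops both entries, Policy II drops only cheap_items, and Policy III drops nothing, since neither value satisfies the cached predicate.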


prepared by Chung Kai Lee, Apr 2001