ECE1770S Middleware Systems
Special Topic 8: Caching
prepared by Chung Kai Lee
Papers
- M. Franklin and M. Carey. "Client Server Caching revisited". (required reading)
- Louis Degenaro, Arun Iyengar, Ilya Lipkind and Isabelle Rouvellou. A Middleware System Which Intelligently Caches Query Results. In Proceedings of Middleware '00, IFIP/ACM International Conference on Distributed systems platforms, 2000, Pages 24-44. (additional reading)
1. Client Server Caching revisited
Summary
This paper is a follow-up on a previous paper which investigated the performance implications of inter-transaction data and lock caching. It has 2 main objectives:
- Re-examination of heuristics for deciding dynamically (O2PL-Dynamic) between propagating changes or invalidating remote copies of data pages in order to maintain cache consistency
- Study of the "Callback Locking" family of caching algorithms.
The paper also performed experiments under different workload conditions based on its client-server caching model.
Benefits and Drawbacks of Data and Lock Caching
Benefits
- reducing reliance on the server, thus offloading a potential bottleneck and reducing communication
- allowing better utilization of the CPU and memory resources that are available on clients
- increasing the scalability of the system in terms of the impact of adding client workstations
Drawbacks
- increased communication
- increased load on the server CPU
- additional path length
- extra load placed on clients
Caching Algorithms under Investigation
- Server-based 2PL
- Optimistic 2PL (O2PL)
- Callback Locking
Server-based 2PL (variant called Caching 2PL, C2PL)
- data pages are cached at clients across transaction boundaries, but not locks
- consistency is maintained using a "check-on-access" policy
- deadlock detection is performed exclusively at the server, with server's copy of each page being treated as the primary copy of that page
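The "check-on-access" policy above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each page carries a version number that the client sends with its lock request; the class and method names are hypothetical and locking itself is omitted.

```python
class C2PLServer:
    """Holds the primary copy of every page, tagged with a version number."""

    def __init__(self, pages):
        self.pages = {pid: (0, value) for pid, value in pages.items()}

    def write(self, page_id, value):
        version, _ = self.pages[page_id]
        self.pages[page_id] = (version + 1, value)

    def lock_and_validate(self, page_id, cached_version):
        # Lock acquisition is omitted; only the validity check is modelled.
        version, value = self.pages[page_id]
        if cached_version == version:
            return None            # cached copy is still valid
        return version, value      # ship a fresh copy with the lock grant


class C2PLClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}            # page_id -> (version, value), kept across transactions
        self.round_trips = 0
        self.pages_shipped = 0

    def read(self, page_id):
        cached = self.cache.get(page_id)
        cached_version = cached[0] if cached is not None else None
        self.round_trips += 1      # every access contacts the server
        fresh = self.server.lock_and_validate(page_id, cached_version)
        if fresh is not None:
            self.pages_shipped += 1
            self.cache[page_id] = fresh
        return self.cache[page_id][1]
```

In this sketch the cache saves page transfers but not server round trips, since every access is still checked at the server.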
Optimistic 2PL (O2PL)
- optimistic refers to the deferral of detection of conflicts among locks cached at multiple sites until transaction commit time
- each client has its own local lock manager and maintains a local waits-for graph
- locks are obtained locally at clients during transaction execution, deferring global acquisition of locks until the commit phase
- once all the required locks have been obtained, 3 variants can be used to maintain consistency:
- O2PL-Invalidate (O2PL-I): invalidates the remote cached copies of data pages
- O2PL-Propagate (O2PL-P): propagates the new values of data items to remote caching sites
- O2PL-Dynamic (O2PL-D): chooses between invalidation and propagation on a per-copy basis, based on some heuristic
Callback Locking
- allows caching of data pages and non-optimistic caching of locks
- clients must obtain a lock from the server immediately (rather than at commit time) prior to accessing a data page, if they don't have the proper lock cached locally
- when a client requests a lock that conflicts with one or more locks that are currently cached at other clients, the server "calls back" the conflicting locks by sending requests to the sites which have those locks cached
- the lock request is granted only when the server has determined that all conflicting locks have been released.
- invalidation based
- copy sites always respond to a callback request immediately to allow for deadlock detection
- 2 variants: Callback-Read (CB-Read) and Callback-All (CB-All)
CB-Read
- caches only read locks
- when a request for a write lock on a page arrives at the server, the server issues callback requests to all sites (except the requestor) that have a cached copy of the page.
- at a client, the callback request is considered as a request for an exclusive lock on the specified page
- if the request cannot be granted immediately, the client responds to the server that the page is in use.
- when the callback request is granted at the client, the page is removed from the client's buffer and an acknowledgement message is sent to the server
- when all callbacks have been acknowledged to the server, the server grants a write lock on the page to the requesting client
- any subsequent read or write lock requests for the page will be blocked at the server until the write lock is released by the holding transaction
- at the end of transaction, the client sends copies of the updated pages to the server and releases its write locks while retaining copies of the pages in its cache (implicit read locks on the pages)
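The CB-Read steps above can be sketched as a small synchronous model. This is an illustration under simplifying assumptions, not the paper's implementation: all names are hypothetical, messages are plain method calls, and a request that would block at the server simply returns False.

```python
class CBClient:
    """Client side: cached pages act as implicit read locks."""

    def __init__(self, name):
        self.name = name
        self.cache = set()     # pages cached across transactions
        self.in_use = set()    # pages locked by an active local transaction

    def handle_callback(self, page_id):
        # A callback is treated as a request for an exclusive lock on the page.
        if page_id in self.in_use:
            return "in_use"
        self.cache.discard(page_id)   # drop the page, then acknowledge
        return "ack"


class CBReadServer:
    def __init__(self, clients):
        self.clients = {c.name: c for c in clients}
        self.copies = {}        # page_id -> names of sites caching the page
        self.write_locks = {}   # page_id -> name of the write-lock holder

    def read(self, client, page_id):
        holder = self.write_locks.get(page_id)
        if holder is not None and holder != client.name:
            return False        # would block until the write lock is released
        self.copies.setdefault(page_id, set()).add(client.name)
        client.cache.add(page_id)
        return True

    def request_write(self, client, page_id):
        holder = self.write_locks.get(page_id)
        if holder is not None and holder != client.name:
            return False
        sites = self.copies.setdefault(page_id, set())
        for name in sorted(sites - {client.name}):
            if self.clients[name].handle_callback(page_id) == "ack":
                sites.discard(name)
        if sites - {client.name}:
            return False        # some site answered "in use"
        self.write_locks[page_id] = client.name
        sites.add(client.name)
        client.cache.add(page_id)
        return True

    def commit(self, client):
        # Write locks are released; the page stays cached at the client.
        for page_id, holder in list(self.write_locks.items()):
            if holder == client.name:
                del self.write_locks[page_id]
```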
CB-All
- works similarly to CB-Read, except that write locks are kept at the clients rather than at the server and are not released at the end of a transaction
- a downgrade request is sent if a read request for a page arrives at the server and an exclusive copy of the page is currently held at some site
- the client reacts to a downgrade request by noting that it no longer has an exclusive copy of the page; in effect downgrades its cached write lock to a read lock
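The difference between a callback and a downgrade in CB-All can be sketched as follows. The function and class names are hypothetical, and the message choice for each combination of lock modes is an assumption drawn from the description above.

```python
def message_for(request_mode, cached_mode):
    """Message the server would send to a site holding a cached lock."""
    if request_mode == "read" and cached_mode == "write":
        return "downgrade"     # the holder keeps the page but loses exclusivity
    if request_mode == "write":
        return "callback"      # the holder must give up its copy
    return None                # read versus read: no conflict


class CBAllClient:
    def __init__(self):
        # page_id -> "read" or "write"; retained after a transaction ends
        self.locks = {}

    def handle(self, message, page_id):
        if message == "downgrade" and self.locks.get(page_id) == "write":
            self.locks[page_id] = "read"
        elif message == "callback":
            self.locks.pop(page_id, None)
```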
Client-Server Caching Model

Figure 1 above (from the paper) shows the structure of the simulation model, which was constructed using the DeNet discrete event simulation language [Livny, 1988]. For details of how the model is simulated, please refer to the paper. The paper also describes a set of parameters that specify the resources and overheads of the system; these are omitted here.
Workloads

The above table (from the paper) summarizes the workloads used. Here is a brief description of each:
- HOTCOLD: high degree of locality per client and a moderate amount of sharing and data contention among clients
- FEED: simulates a stock quotation system in which one site produces data while the other sites consume it
- UNIFORM: low-locality, moderate write probability workload which is used to examine consistency algorithms when caching is not expected to pay off significantly
- HICON: varying degrees of data contention
New Adaptive Heuristic (O2PL-ND)
O2PL-D uses the following simple heuristic to choose between invalidation and propagation:
- initially propagates updates
- switches to invalidating a copy in a subsequent consistency operation if it detects that the preceding propagation of that page was wasted
Specifically, O2PL-D will propagate a new copy of a page if both of the following conditions are met:
- the page is resident at the site when the consistency operation is attempted
- if the page was previously propagated to the site, then the page has been re-accessed since that propagation
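The two conditions can be written as a small decision function. This is a sketch only: the `CopyState` fields are hypothetical names for the per-copy information the heuristic needs, and anything that fails the conditions is treated as an invalidation.

```python
from dataclasses import dataclass


@dataclass
class CopyState:
    """What is known about one remote copy of a page (hypothetical fields)."""
    resident: bool
    previously_propagated: bool = False
    reaccessed_since_propagation: bool = False


def o2pl_d_action(copy):
    # Condition 1: the page must be resident at the site.
    if not copy.resident:
        return "invalidate"
    # Condition 2: an earlier propagation must not have been wasted.
    if copy.previously_propagated and not copy.reaccessed_since_propagation:
        return "invalidate"
    return "propagate"
```

A copy that has never received a propagation passes condition 2 trivially, which is why the heuristic starts out propagating.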
Problem with O2PL-D: its willingness to err on the propagation side resulted in its performance being lower than that of O2PL-I in most cases.
Solution with the new heuristic O2PL-ND (for "New Dynamic"): err on the side of invalidation if a mistake is to be made:
- an updated copy of a page will be propagated to a site only if both O2PL-D conditions above are met and, in addition, the page was previously invalidated at that site and that invalidation was a mistake
- the new condition ensures that O2PL-ND will invalidate a page at a site at least once before propagating it to that site
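A sketch of the O2PL-ND decision follows. These notes do not say how a mistaken invalidation is detected, so it is passed in as a flag; the function and parameter names are hypothetical.

```python
def o2pl_nd_action(resident, previously_propagated,
                   reaccessed_since_propagation, mistaken_invalidation):
    """Return "propagate" only when all three conditions hold."""
    cond1 = resident
    cond2 = (not previously_propagated) or reaccessed_since_propagation
    cond3 = mistaken_invalidation   # invalidated here before, and wrongly so
    return "propagate" if (cond1 and cond2 and cond3) else "invalidate"
```

With no history for a copy the third condition is false, so the first consistency operation for that copy is always an invalidation, matching the last bullet above.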
Experiment Results for O2PL-ND
The performance of the new heuristic was examined using the HOTCOLD, FEED and UNIFORM workloads. The results are plotted in graphs taken from the paper.
Observations from HOTCOLD Workload
Note: HOTCOLD workload benefits from invalidation
- O2PL-ND improves performance over O2PL-D
- O2PL-ND performs as well as O2PL-I (the better of O2PL-I and O2PL-P), while O2PL-D tracks the lower performance of O2PL-P
- all O2PL algorithms outperform the C2PL algorithm prior to reaching the disk bottleneck
Observations from FEED Workload
Note: FEED workload benefits from propagation
- both dynamic algorithms fall roughly between the 2 static ones, with O2PL-ND having a slight advantage
Observations from UNIFORM Workload
Note: For UNIFORM workload, caching is not expected to provide much of a performance benefit
- O2PL-I and O2PL-ND perform similarly, and both are much better than O2PL-P and O2PL-D
In summary, O2PL-ND performs as well as the static O2PL-I algorithm when invalidation is advantageous, and also retains the performance advantages of the O2PL-D heuristic in cases where propagation is advantageous.
Experiment Results for Callback Locking
The performance of the 2 Callback Locking algorithms (CB-Read and CB-All) was examined using the HOTCOLD, FEED, UNIFORM and HICON workloads. The results are plotted in graphs taken from the paper, with a separate set of graphs for HICON.
Observations from HOTCOLD Workload
- C2PL has lowest throughput
- O2PL-ND has the highest throughput
- CB algorithms perform at a lower level than O2PL-ND
Observations from FEED Workload
- O2PL-ND has the best reader throughput prior to 10 clients
- CB algorithms have the best reader throughput beyond 10 clients
- all algorithms (except C2PL) approach a network bottleneck at 15 clients and beyond
Observations from UNIFORM Workload
- O2PL-ND achieves the highest throughput
- Callback algorithms perform below O2PL-ND but better than C2PL
- CB-Read performs slightly better than CB-All
Observations from HICON Workload
- in the range of 1 to 5 clients, C2PL performs worst
- at 10 clients and beyond, O2PL-ND suffers dramatically and has the lowest utilization of all three major system resources (disk, server CPU and network)
- CB-All sends more messages than CB-Read
- O2PL-ND was found to have a significantly higher abort rate than the other algorithms
In summary, the CB algorithms perform similarly to, but slightly below, O2PL-ND. They performed much better than C2PL under most workloads while retaining a lower abort rate than O2PL-ND, and they are more robust than O2PL-ND in the presence of data contention.
2. A Middleware System Which Intelligently Caches Query Results
Summary
This paper describes how caching was used to improve performance in the Accessible Business Rules framework (ABR) for IBM's WebSphere. Although the work is specific to one application, the paper suggests that the techniques can be applied to caching environments other than ABR. The paper has 2 main objectives:
- Apply the General-Purpose Software cache (GPS cache) to improve performance of ABR.
- Use data update propagation (DUP) to solve the problem of keeping the cache current after database updates
Overview of the Accessible Business Rules Framework (ABR)
- one of the e-business application frameworks available on IBM's WebSphere middleware
- enables application writers to build applications where the time and situation-variable parts of their business logic are externally applied entities called business rules
- structure of the application matches the built in core behaviour with variations specified, managed and applied externally
- an ABR rule is a persistent object encapsulating code implementing variable behaviour as well as a number of attributes defining the business context in which this behaviour applies
- ABR defines structured exit points from the main application logic, referred to as decision points
- code in decision points selects the particular business logic to be executed via a query
Benefits of externalizing business rules
- clarity of the application
- ease of maintenance
Problem: significant overhead which largely resulted from querying
Solution: caching the query results
Caching used by ABR
- caching is an extremely useful technique for improving performance in a variety of software applications
- General-Purpose Software cache (GPS cache) is implemented in order to achieve performance improvements for multiple applications
GPS cache
- POSIX-compliant C++ library
- an application uses the GPS cache application programming interface (API) to manage the cache and is linked with the GPS cache library
- applications add, delete and query the cache via a set of API function calls
- used to improve performance in ABR and in a Web server accelerator
- can be configured to store data in memory, on disk, or both
- cached objects can have expiration times associated with them after which they are no longer valid; GPS cache implements an efficient algorithm (data update propagation, DUP) for invalidating objects based on expiration times
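The add, delete and query operations with expiration times can be illustrated with a small stand-in. This is not the GPS cache API (which is a C++ library); the class, its method names and the use of a heap to find expired entries are all assumptions made for this sketch.

```python
import heapq


class ExpiringCache:
    """In-memory cache whose entries become invalid at a given time."""

    def __init__(self):
        self._data = {}    # key -> (value, expires_at)
        self._heap = []    # (expires_at, key), earliest expiry first

    def add(self, key, value, expires_at):
        self._data[key] = (value, expires_at)
        heapq.heappush(self._heap, (expires_at, key))

    def delete(self, key):
        self._data.pop(key, None)

    def query(self, key, now):
        self._expire(now)
        entry = self._data.get(key)
        return entry[0] if entry is not None else None

    def _expire(self, now):
        # Only entries that are actually due are visited.
        while self._heap and self._heap[0][0] <= now:
            expires_at, key = heapq.heappop(self._heap)
            entry = self._data.get(key)
            # Skip stale heap records left by a re-add or a delete.
            if entry is not None and entry[1] == expires_at:
                del self._data[key]
```

The current time is passed in explicitly so that the behaviour is easy to test; keys are assumed to be mutually comparable (for example, strings) because they act as tie-breakers in the heap.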
Cache Invalidation using DUP
Data update propagation (DUP) determines how cached data are affected by changes to the underlying data from which their current values are derived. DUP maintains correspondences between objects, defined as entities which may be cached, and underlying data, which change periodically and affect the values of objects. A query result may depend on several attributes, and these dependency relationships are represented by an object dependence graph (ODG).
There are 2 key innovations to DUP:
- Value-aware update policy: when attributes change, the old and new values of the attributes are considered in order to determine how to update the cache; this is implemented by annotating edges of ODG's with values based on queries
- Automatic ODG generation: DUP for ABR automatically generates ODG's from the ABR queries, rather than leaving the application program responsible for generating them
Constructing ODG's from Queries
The following query:
select A where A.x > 2 and A.x < 9 and A.z = B.y
would generate the following ODG (taken from paper):

Note the following:
- each class.attribute term in the query has a corresponding vertex in the ODG
- edges are drawn from each class.attribute vertex to the query result objects it affects
- annotation of the edge originating from the A.x vertex indicates that if A.x changes, query result Q1 would only be affected if either:
- A.x was previously between 2 and 9 and is no longer in this range
- A.x was previously not between 2 and 9 but now is in this range
- the absence of annotations on the edges originating from A.z and B.y indicates that value-aware invalidation is not in use for them; any change to A.z or B.y might affect the value of Q1
- ODG is stored in the GPS cache, and GPS cache can traverse the ODG's efficiently to locate query results affected by changes to underlying data
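The ODG for the example query can be sketched as a mapping from attribute vertices to annotated edges. The class below is a hypothetical illustration, not the structure used by the GPS cache; an annotation is modelled as a predicate over the attribute value, and None stands for an unannotated edge.

```python
class ODG:
    """Object dependence graph: attribute vertices point at query results."""

    def __init__(self):
        self.edges = {}   # attribute -> list of (query_id, predicate or None)

    def add_edge(self, attribute, query_id, predicate=None):
        self.edges.setdefault(attribute, []).append((query_id, predicate))

    def affected(self, attribute, old, new):
        """Query results that may change when attribute goes from old to new."""
        hit = set()
        for query_id, predicate in self.edges.get(attribute, []):
            # Unannotated edge: any change counts. Annotated edge: the change
            # matters only if the value moves into or out of the range.
            if predicate is None or predicate(old) != predicate(new):
                hit.add(query_id)
        return hit


# ODG for: select A where A.x > 2 and A.x < 9 and A.z = B.y
odg = ODG()
odg.add_edge("A.x", "Q1", lambda v: 2 < v < 9)
odg.add_edge("A.z", "Q1")
odg.add_edge("B.y", "Q1")
```

For instance, `odg.affected("A.x", 3, 5)` is empty because the value stays inside the range, while `odg.affected("A.x", 3, 10)` contains Q1.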
Performance of Query Caching Techniques
The paper also describes some experiments to investigate the performance of query caching techniques. The experiments were conducted for 3 different invalidation policies:
- Policy I: invalidated all cached data after any update
- Policy II: basic DUP algorithm without the enhancements described above; being value-unaware as it uses only object dependency information without considering the values involved in the update
- Policy III: the value-aware policy which uses the enhanced DUP algorithm with edge annotations on the ODG
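The three policies differ only in which cached query results they discard after an update. The sketch below makes that comparison concrete; the function, its arguments and the dictionary layout of the dependencies are hypothetical.

```python
def queries_to_invalidate(policy, cached, deps, attribute, old, new):
    """cached: set of cached query ids.
    deps: attribute -> {query_id: predicate or None} (None = unannotated)."""
    if policy == "I":
        return set(cached)                      # drop everything
    dependents = deps.get(attribute, {})
    if policy == "II":
        # Value-unaware: every cached query that depends on the attribute.
        return {q for q in dependents if q in cached}
    if policy == "III":
        # Value-aware: also require the change to cross the annotated range.
        return {q for q, pred in dependents.items()
                if q in cached and (pred is None or pred(old) != pred(new))}
    raise ValueError(f"unknown policy: {policy}")


cached = {"Q1", "Q2"}
deps = {"A.x": {"Q1": lambda v: 2 < v < 9}, "C.w": {"Q2": None}}
```

For an update of A.x from 3 to 5, Policy I discards both results, Policy II discards only Q1, and Policy III discards nothing.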
The paper performs the experiment on several different query types and update sizes. It also investigates the cache hit rates for the benchmarks TPC-C (models on-line transaction processing applications) and TPC-D (models data warehousing applications). And in summary:
Policy III outperforms policies I and II. Policy I is clearly the worst performer and serves as a lower bound on performance. Policy II serves as a baseline for measuring the benefit of the DUP enhancements applied in policy III.
prepared by Chung Kai Lee, Apr 2001