Cristiana Amza: What I've been doing
Cristiana Amza - Research Summary
My recent research work broadly focuses on distributed systems, databases,
cluster servers and web technologies.
I have made contributions in the following areas.
-
I have designed and implemented a replication technique using
a novel scheduler design that provides scaling, strong data
consistency and availability for dynamic content servers.
The technique, called conflict-aware scheduling
[USITS '03],
improves scaling by using a lazy replication scheme with
asynchronous updates, and
by avoiding conflicts at each replica.
At the same time, the user sees serializable executions.
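
As a rough illustration of the idea above, and not the implementation from [USITS '03], the sketch below shows a scheduler that tracks which tables still have unapplied writes at each replica and routes a read only to a replica that is up to date for the tables it reads. The class name, method names, and table-level tracking granularity are assumptions made for this example.

```python
from collections import defaultdict


class ConflictAwareScheduler:
    """Toy scheduler (hypothetical): tracks unapplied writes per replica and table."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        # pending[replica][table] = writes sent to the replica but not yet applied
        self.pending = {r: defaultdict(int) for r in self.replicas}

    def submit_write(self, tables):
        """Record a write as outstanding on every replica (asynchronous propagation)."""
        for r in self.replicas:
            for t in tables:
                self.pending[r][t] += 1

    def write_applied(self, replica, tables):
        """A replica reports that it has applied one write to these tables."""
        for t in tables:
            self.pending[replica][t] -= 1

    def route_read(self, tables):
        """Return a replica with no outstanding writes on the tables read, else None."""
        for r in self.replicas:
            if all(self.pending[r][t] == 0 for t in tables):
                return r
        return None


scheduler = ConflictAwareScheduler(["r1", "r2"])
scheduler.submit_write(["items"])          # write is now pending on r1 and r2
scheduler.write_applied("r2", ["items"])   # r2 catches up first
chosen = scheduler.route_read(["items"])   # only r2 is current for "items"
```

In this toy version a read that finds no up-to-date replica simply gets `None`; a real scheduler would have to queue or delay it.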
-
I have implemented a new concurrency control algorithm based on explicit
versions for the database back-ends in a replicated dynamic content server.
This algorithm reduces the overhead of consistency maintenance
by minimizing the duration of conflicts
in applications where conflicts are frequent.
Distributed versioning [MiddleWare '03] is a novel replication technique that uses this explicit
versioning concurrency control in conjunction with a conflict-aware scheduler.
Distributed versioning provides both the simple application
semantics and programming model of eager replication and
performance close to the best achievable with lazy replication for most
workloads studied.
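
To make explicit versioning concrete, here is a minimal sketch under simplifying assumptions. A sequencer hands each transaction a version number for every table it accesses, and a replica runs a transaction only when its tables have reached those versions. Treating every transaction as a writer, along with all class and method names, is an assumption of this example and not a description of the algorithm in [MiddleWare '03].

```python
class VersionSequencer:
    """Hypothetical sequencer: assigns per-table version numbers in arrival order."""

    def __init__(self):
        self.next_version = {}

    def assign(self, tables):
        """Return {table: version} for a transaction and advance each table's counter."""
        assigned = {}
        for t in tables:
            v = self.next_version.get(t, 0)
            assigned[t] = v
            self.next_version[t] = v + 1
        return assigned


class Replica:
    """Hypothetical replica: runs a transaction only at its assigned versions."""

    def __init__(self):
        self.current = {}

    def can_run(self, assigned):
        """True when every table is exactly at the version the transaction expects."""
        return all(self.current.get(t, 0) == v for t, v in assigned.items())

    def finish(self, assigned):
        """Publish the next version of each table the transaction touched."""
        for t in assigned:
            self.current[t] = self.current.get(t, 0) + 1


sequencer = VersionSequencer()
t1 = sequencer.assign(["orders"])            # {"orders": 0}
t2 = sequencer.assign(["orders", "items"])   # {"orders": 1, "items": 0}

replica = Replica()
blocked = replica.can_run(t2)    # False: t2 must wait for t1 on "orders"
replica.finish(t1)
unblocked = replica.can_run(t2)  # True once t1 has produced version 1
```

Because the order is fixed by the assigned versions, every replica applies conflicting transactions in the same order even if they arrive in different orders.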
-
I have performed an analysis of scale and performance in
dynamic content web sites. In particular, in collaboration with other
researchers at Rice, I have implemented three common dynamic content
applications [WWC-5 '02]:
e-commerce (TPC-W), on-line bidding
(modeled after eBay),
and bulletin board (modeled after Slashdot).
Using these applications, I have studied the common
bottlenecks and the effect of admission control and load balancing
[TR '02],
and transparent query
caching [OSDI '02 submission] for a dynamic
content server.
-
Data Replication for Persistent Memory
I have designed and implemented novel replication algorithms
for applications that use in-memory persistent storage.
The algorithms [DSN '00]
provide both reliability and data availability
while still maintaining very high transaction throughput.
I investigated four possible designs in a primary-backup configuration,
using a cluster of commodity servers connected by a
write-through system area network.
I showed that logging approaches outperform mirroring approaches,
because of better locality and in spite of
communicating more data.
I also showed that the logging versions scale well to
small shared-memory multiprocessors.
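
The trade-off between mirroring and logging can be sketched as follows. This is an illustrative model only, not one of the four designs evaluated in [DSN '00]: the record layout, function names, and use of in-process byte arrays in place of a system area network are all assumptions. Mirroring writes each update to its home offset in the backup copy (scattered writes), while logging appends a self-describing record to a sequential log (more bytes, but contiguous writes).

```python
import struct

# Hypothetical log record header: 8-byte offset, 4-byte length (little-endian).
HEADER = struct.Struct("<QI")


def mirror_update(backup: bytearray, offset: int, data: bytes) -> int:
    """Write the update in place in the backup copy; returns bytes communicated."""
    backup[offset:offset + len(data)] = data
    return len(data)


def log_update(log: bytearray, offset: int, data: bytes) -> int:
    """Append a (header, data) record to the sequential log; returns bytes communicated."""
    record = HEADER.pack(offset, len(data)) + data
    log.extend(record)
    return len(record)


def replay_log(log: bytearray, backup: bytearray) -> None:
    """Apply every logged record, in order, to the backup copy."""
    pos = 0
    while pos < len(log):
        offset, length = HEADER.unpack_from(log, pos)
        pos += HEADER.size
        backup[offset:offset + length] = log[pos:pos + length]
        pos += length


updates = [(100, b"abcd"), (5, b"wxyz")]

mirrored = bytearray(128)
mirror_bytes = sum(mirror_update(mirrored, off, d) for off, d in updates)

log = bytearray()
log_bytes = sum(log_update(log, off, d) for off, d in updates)
recovered = bytearray(128)
replay_log(log, recovered)
```

Here the log carries 12 extra header bytes per update, yet all of its writes land at consecutive addresses, which is the locality advantage the text refers to.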
-
Software Distributed Shared Memory Systems
My contributions in this area include a
novel adaptive algorithm that switches between a single-writer and
a multiple-writer protocol [HPCA '97].
This algorithm provides the benefits of
lazy release consistency in software distributed shared memory
systems (SDSM) while reducing its computation and memory overheads
whenever the application pattern allows it.
Later work explores using different consistency units in page-based
SDSM [PPoPP '97]
and extends my earlier dynamic protocol adaptation in SDSM to
also adapt the consistency unit to fit the application
pattern [IEEE '99].