Scaling and availability for web sites serving dynamic content have been an active focus of recent research. However, bursts of dynamic content traffic created by flash crowds can still cause the server to become overloaded, so overloaded, in fact, that the server appears unavailable to its clients. Gross hardware over-provisioning may be possible for very large sites but is infeasible for smaller sites. Moreover, even with careful studies of past peak loads, future demand may exceed expectations and overwhelm the site (also known as the Slashdot effect). Finally, the inherent complexity of the multi-tiered architecture of such sites makes it difficult to predict exactly where the bottleneck will be for a particular workload at each point in time. Over-provisioning is hence currently needed within all tiers of a multi-tiered dynamic content site. To address these problems, I intend to explore adaptive techniques for better utilization of all resources available to the dynamic content server. In particular, I intend to explore the following mechanisms, which could be added to current dynamic content sites transparently, with no client-side modifications:
I am interested in defining the database features, interfaces, and protocols necessary for a closer integration between the scheduler and the database consistency maintenance policies in replicated database clusters.
Up to now, I have studied only scheduler policies that reduce consistency maintenance overheads, while the database has remained unchanged. I believe that further improvements are possible if the database is no longer treated as a black box, even if only minimal changes are made to the database engine. In particular, a promising avenue is to combine the scheduler's conflict avoidance with fine-grained concurrency control, such as multiversioning, at the database.
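The combination can be illustrated with a minimal sketch, assuming a toy multiversion store and a scheduler that tracks which tables each update touches. All names here (MultiversionStore, ConflictAwareScheduler) are illustrative, not taken from any existing system: the point is only that reads proceed against a snapshot version and never block, while the scheduler retains the per-table ordering information it needs to avoid write-write conflicts.

```python
# Toy sketch: multiversioning at the database plus conflict tracking at
# the scheduler. Purely illustrative; a real scheduler would queue a
# write behind earlier conflicting writes rather than just record them.

class MultiversionStore:
    """Each key maps to a list of (commit_version, value) pairs."""
    def __init__(self):
        self.versions = {}       # key -> list of (commit_version, value)
        self.commit_version = 0

    def write(self, key, value):
        self.commit_version += 1
        self.versions.setdefault(key, []).append((self.commit_version, value))
        return self.commit_version

    def read(self, key, snapshot):
        """Latest value committed at or before `snapshot`; never blocks."""
        candidates = [v for ver, v in self.versions.get(key, []) if ver <= snapshot]
        return candidates[-1] if candidates else None


class ConflictAwareScheduler:
    """Tracks the last writer per table so conflicting writes can be ordered."""
    def __init__(self, store):
        self.store = store
        self.last_writer = {}    # table -> commit version of last write

    def execute_write(self, tables, key, value):
        version = self.store.write(key, value)
        for t in tables:
            self.last_writer[t] = version
        return version

    def snapshot(self):
        return self.store.commit_version
```

A reader that took its snapshot before an update simply continues to see the old version, so the scheduler never has to delay read-only transactions to preserve consistency.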
While caching techniques for a standalone dynamic content web server have proven to be beneficial, such techniques increase CPU contention and memory pressure at the Web server. In a cluster with multiple Web servers, it seems natural to try to aggregate their memories and CPUs into what logically appears as a single larger cache.
Two problems need to be solved to accomplish this goal. First, consistency needs to be maintained between the different caches, and between the caches and the back-end databases. Second, an incoming request needs to be directed to the appropriate server, to maximize the chance that its response is already in that server's cache.
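The second problem, locality-aware request direction, can be sketched very simply under the assumption that a dynamic request is identified by its URI plus query string: hashing that identifier over the set of back-end servers sends repeated requests for the same page to the same server, where the generated result is most likely cached. The function name and inputs below are illustrative.

```python
# Sketch of locality-aware request distribution: the same request
# identifier deterministically maps to the same server, so its cached
# result is found there on subsequent hits.
import hashlib

def route(request_id: str, servers: list) -> str:
    digest = hashlib.sha1(request_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

A production front-end would combine this with consistent hashing so that adding or removing a server invalidates only a fraction of the mapping, but the locality principle is the same.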
Currently, state-of-the-art web proxies (i.e., machines at the client edge of the network) cache, with very limited exceptions, only static web content and media streams produced by origin servers. Assume now that the proxies at the client edge are trusted machines, either because they are part of a corporate content delivery network (CDN) operated by the content provider itself, or because they belong to a CDN operator under contract with the content provider. In this context, I intend to investigate moving some of the functionality from the server machines at the data center clusters to proxies at the client edge of the network. My goal is to overcome limitations inherent in a single-site server (cluster), including long network latencies, network disconnection, and limits on the scalability of any single cluster.
In this design, the proxies will be capable of executing application logic and storing persistent data belonging to the application in a local file system or a local relational database. A proxy may simply cache query results in its file system and operate as a dynamic content cache, or it may operate by means of SQL queries on a local view of the database, just like the main server.
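The first mode, a proxy-side query-result cache, can be sketched as follows. The sketch is hypothetical: results are cached keyed on the SQL text, and invalidated whenever an update arrives for a table the query read. The table extraction is deliberately naive (a regular expression over the SQL), which suffices for illustration but not for real SQL parsing.

```python
# Hypothetical proxy-side query-result cache with table-grained
# invalidation. Table extraction via regex is a simplification.
import re

class QueryResultCache:
    def __init__(self):
        self.cache = {}       # sql text -> cached result
        self.by_table = {}    # table name -> set of cached sql texts

    @staticmethod
    def tables_of(sql):
        return set(re.findall(r'(?:from|join|update|into)\s+(\w+)', sql, re.I))

    def get(self, sql, compute):
        """Return cached result for `sql`, computing and caching on a miss."""
        if sql not in self.cache:
            self.cache[sql] = compute()
            for t in self.tables_of(sql):
                self.by_table.setdefault(t, set()).add(sql)
        return self.cache[sql]

    def invalidate(self, update_sql):
        """Drop all cached queries that read a table touched by the update."""
        for t in self.tables_of(update_sql):
            for sql in self.by_table.pop(t, ()):
                self.cache.pop(sql, None)
```

In the full design, the invalidation messages would be propagated from the main server to the proxies as part of the consistency protocol, rather than generated locally.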
While consistent replication and caching can be done transparently in a local cluster, more sophisticated methods may be needed in a geographically distributed system, because of problems of latency, bandwidth, and possible disconnected operation.
In CDNs for static content, a client request is redirected to a nearby proxy, where nearness is usually equated with physical proximity in the network (for instance, the number of hops). Load on the network may also be taken into account, but the proxies themselves are always assumed to be lightly loaded, and proxy load is ignored. For static content proxies this is a reasonable assumption, but it may not hold for a proxy that executes application logic and contains persistent data. I intend to develop a load balancing algorithm that also takes proxy load into account. In such an algorithm, request distribution may still be driven primarily by locality, but load balancing considerations will take over in the case of overload.
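The locality-first, load-aware policy can be sketched as below, assuming normalized distance and load metrics in [0, 1]; the function, the threshold value, and the inputs are assumptions for illustration only. Proximity drives the choice, and proxy load overrides it only when the closest proxies are above an overload threshold.

```python
# Sketch of locality-first redirection with a load-based override.
# distance/load values and the overload threshold are illustrative.
def choose_proxy(proxies, distance, load, overload=0.8):
    """proxies: list of ids; distance, load: dicts mapping id -> [0, 1]."""
    for p in sorted(proxies, key=lambda p: distance[p]):
        if load[p] < overload:
            return p           # nearest proxy with spare capacity
    # Every proxy is overloaded: fall back to the least-loaded one.
    return min(proxies, key=lambda p: load[p])
```

A real redirector would additionally smooth the load signal over time to avoid oscillating between proxies near the threshold.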
In this context, I also intend to explore a somewhat less intuitive idea. Currently, Internet data centers over-provision hardware to cover overload periods created by flash crowds. However, since some of the computation necessary for dynamic content generation is expensive, it may pay off to execute such computation remotely, on a trusted server or proxy, temporarily off-loading the main server during the load spike.
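The off-loading decision itself can be stated as a simple policy sketch, with thresholds and per-request cost estimates that are assumptions for illustration: only when the main server is in a load spike, and only for requests whose generation cost is high enough to justify the remote round trip, is the work shipped to a trusted proxy.

```python
# Toy off-loading policy: ship expensive work to a trusted proxy only
# during a load spike. Threshold values are illustrative assumptions.
def dispatch(request_cost, server_load, load_threshold=0.9, cost_threshold=0.5):
    """request_cost, server_load: normalized estimates in [0, 1]."""
    if server_load > load_threshold and request_cost > cost_threshold:
        return "remote"    # off-load to a trusted server or proxy
    return "local"
```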