What is this?
This wiki was created to keep track of and summarize work on snoop filtering, a technique for reducing the number of messages needed to locate copies of data in multiprocessors and multi-core systems. This is a work in progress, and we welcome additions, suggestions, and revisions. Please feel free to contact us (zebchuk @ eecg.toronto.edu, babak.falsafi @ epfl.ch, or moshovos @ eecg.toronto.edu).
For small-scale multiprocessors, broadcast-based snoop coherence protocols offer the most attractive solution for supporting shared memory programming models. Unfortunately, unoptimized snoop protocols consume large amounts of network bandwidth and significantly increase the energy spent on cache tag lookups and communication, and thus may hurt both performance and energy efficiency.
With the explosive growth of chip-multiprocessors (CMPs), almost all modern computers are built with multiple processing cores. The ubiquity of the shared memory programming paradigm makes it necessary for these multiprocessors to employ cache coherence protocols to maintain coherence between the many different caches in the system. Almost all coherence protocols that have been proposed fall under one of two broad categories:
Snoop protocols face three main challenges:
Scaling snoop protocols to larger systems requires reducing both the network bandwidth and the tag lookup costs. For most commercial and scientific workloads, the majority of tag lookups performed as a result of snoop requests fail to find copies of the requested block. This means that many snoop-induced tag lookups are unnecessary and simply waste lookup bandwidth and energy, while the corresponding snoop messages and replies waste network bandwidth and energy. An ideal protocol would avoid such unnecessary broadcasts and tag lookups to reduce energy consumption and improve scalability.
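To make the cost of unnecessary lookups concrete, here is a minimal sketch of a broadcast snoop model that counts how many snoop-induced tag lookups actually find a copy. The `Cache` class, the `count_snoop_lookups` function, and the example addresses are all hypothetical and exist only for illustration; real caches track coherence state, not just presence.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Toy cache: just the set of block addresses it currently holds."""
    blocks: set = field(default_factory=set)


def count_snoop_lookups(caches, requests):
    """Return (total_lookups, useful_lookups) under plain broadcast snooping.

    `requests` is a list of (requester_index, block_address) pairs. Each
    pair models a miss that is broadcast to every cache except the requester,
    and every receiving cache performs one tag lookup.
    """
    total = 0
    useful = 0
    for requester, block in requests:
        for index, cache in enumerate(caches):
            if index == requester:
                continue  # the requester does not snoop itself
            total += 1
            if block in cache.blocks:
                useful += 1  # this lookup found a copy
    return total, useful


# Four caches and three misses (illustrative values only).
caches = [Cache({0x100, 0x140}), Cache({0x200}), Cache({0x300}), Cache({0x100})]
requests = [(1, 0x100), (2, 0x400), (0, 0x200)]

total, useful = count_snoop_lookups(caches, requests)
print(total, useful)  # prints: 9 3
```

In this toy run, six of the nine tag lookups find nothing. Those are the lookups, and the messages that triggered them, that a snoop filter aims to eliminate.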
A number of optimizations have been proposed to bring snoop protocols closer to the behaviour of an ideal protocol. Most of these proposals act as snoop filters that remove unnecessary actions from a typical snoop protocol. There are many different attributes that can be used to categorize these filters. We chose to use the point of origin as the first-order attribute for classification. These filters can be grouped into three broad classes:
Destination-based snoop filters reduce the number of snoop-induced tag lookups without reducing the number of snoop broadcasts. Snoops are still sent to all nodes, but each node can then filter the snoop request and avoid a local tag lookup. These filters primarily target tag lookup energy and bandwidth.
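As a rough illustration of the destination-based idea, the sketch below places a small, conservative filter in front of a cache's tag array. The filter here counts cached blocks per coarse-grained region and skips the tag lookup when the count is zero. This particular mechanism, along with the names `RegionCountFilter`, `FilteredCache`, and the `region_bits` parameter, is an assumption chosen for simplicity and is not taken from any specific filter listed on this wiki.

```python
class RegionCountFilter:
    """Hypothetical filter: counts locally cached blocks per memory region."""

    def __init__(self, region_bits=10):
        self.region_bits = region_bits  # assumed region size: 2**10 bytes
        self.counts = {}

    def _region(self, addr):
        return addr >> self.region_bits

    def on_fill(self, addr):
        region = self._region(addr)
        self.counts[region] = self.counts.get(region, 0) + 1

    def on_evict(self, addr):
        region = self._region(addr)
        self.counts[region] -= 1
        if self.counts[region] == 0:
            del self.counts[region]

    def may_be_cached(self, addr):
        # False is a guarantee of absence; True only means "possibly present".
        return self._region(addr) in self.counts


class FilteredCache:
    """Toy cache that consults the filter before doing a tag lookup."""

    def __init__(self):
        self.blocks = set()
        self.filter = RegionCountFilter()
        self.tag_lookups = 0

    def fill(self, addr):
        if addr not in self.blocks:
            self.blocks.add(addr)
            self.filter.on_fill(addr)

    def evict(self, addr):
        if addr in self.blocks:
            self.blocks.remove(addr)
            self.filter.on_evict(addr)

    def snoop(self, addr):
        """Handle an incoming snoop; return True if a copy is held."""
        if not self.filter.may_be_cached(addr):
            return False  # filtered: no tag lookup needed
        self.tag_lookups += 1
        return addr in self.blocks


cache = FilteredCache()
cache.fill(0x1000)
cache.snoop(0x8000)  # different region: filtered, no tag lookup
cache.snoop(0x1040)  # same region, block absent: lookup happens and misses
cache.snoop(0x1000)  # block present: lookup happens and hits
print(cache.tag_lookups)  # prints: 2
```

The snoop still arrives at the node in every case, so network traffic is unchanged. Only the local tag lookup is avoided, which matches the description above.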
On the other hand, source-based filters reduce the number of snoop broadcasts instead of just reducing tag lookups. These filters determine when a broadcast will be unnecessary and avoid the overhead of sending a full broadcast for such messages.
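One way a source-based filter could work is sketched below: the requester remembers regions that an earlier broadcast showed to be uncached elsewhere, and later misses to those regions skip the broadcast. This is a simplified sketch under stated assumptions. The `SourceFilter` class, its method names, and the region granularity are hypothetical, and a real design would also need a mechanism for keeping this information correct as other nodes change what they cache, modeled here only by `observe_remote_request`.

```python
class SourceFilter:
    """Hypothetical requester-side filter tracking regions not cached elsewhere."""

    def __init__(self, region_bits=10):
        self.region_bits = region_bits  # assumed region size: 2**10 bytes
        self.unshared_regions = set()

    def _region(self, addr):
        return addr >> self.region_bits

    def needs_broadcast(self, addr):
        """Return True unless the region is known to be uncached elsewhere."""
        return self._region(addr) not in self.unshared_regions

    def record_broadcast_result(self, addr, region_cached_elsewhere):
        """Learn from a completed broadcast whether others hold the region."""
        region = self._region(addr)
        if region_cached_elsewhere:
            self.unshared_regions.discard(region)
        else:
            self.unshared_regions.add(region)

    def observe_remote_request(self, addr):
        """Another node touched the region, so it may now be cached there."""
        self.unshared_regions.discard(self._region(addr))


source_filter = SourceFilter()
broadcasts = 0

# First miss to the region: nothing is known yet, so a broadcast is required.
if source_filter.needs_broadcast(0x2000):
    broadcasts += 1
    # Assume every other node replied that it holds nothing in this region.
    source_filter.record_broadcast_result(0x2000, region_cached_elsewhere=False)

# Second miss to the same region: the broadcast can be skipped.
if source_filter.needs_broadcast(0x2040):
    broadcasts += 1

print(broadcasts)  # prints: 1
```

Unlike the destination-based case, the saving here covers both the network messages and the remote tag lookups, since the snoop is never sent.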
While the two most obvious options are to filter snoops either at their destination or at their source, other filters do not fall under either of these categories. These rely either on specific properties of the interconnect used to connect the various processors or cores, or on the use of virtualized execution environments.
The following pages list specific filters according to these categories.
An earlier overview of snoop filtering work is available as PowerPoint slides prepared for a short summer course offered at the University of Zaragoza.
The first version of this wiki was created using text that Jason Zebchuk wrote. The wiki is maintained by Jason Zebchuk, Babak Falsafi, and Andreas Moshovos.