Advances in Distributed Systems
ECE 1746, Fall 2003
University of Toronto
Course time: Thursday, 4-6pm
Course Room: Galbraith Building, GB-120
Start date: Sep 11, 2003
Course Description
The exponential growth of Internet services demonstrates the importance and potential of large-scale distributed systems. Today, Web services allow online shopping for virtually any product, from cheap second-hand items to expensive art collections. Content delivery networks can potentially speed up these services by cleverly caching Web pages. Peer-to-peer applications allow sharing of content in ways that are making industry nervous about its profit margins. Multimedia services provide streaming delivery of audio and video. The list of new classes of distributed applications that are becoming ubiquitous seems endless: cluster computing, grid computing, game services, pervasive mobile computing, sensor networks, etc. In this scenario, a fundamental challenge is to provide scalable and robust services in the presence of best-effort communication and unreliable nodes.
This graduate-level course focuses on distributed computing from a systems software perspective. Students are expected to read and critique recent research papers that cover some of the distributed applications mentioned above and span areas such as operating systems, networks and multimedia systems. They are also expected to work on a research project and make a presentation.
Textbooks
There are no required textbooks for this course. The optional textbook is Distributed Systems: Concepts and Design (Third Edition), by George Coulouris, Jean Dollimore and Tim Kindberg. Published by Addison-Wesley, 2001. ISBN 0-201-61918-0.
Class Format
Please read carefully
Each week this class will cover a group of papers that focuses on a specific aspect of distributed systems. Students are expected to read all the papers in the group that will be presented (the number of presentations depends on the number of students in class). At the beginning of the term, each paper will be assigned to a student who will be presenting the paper. Presentations will be limited to 15 minutes.
While students are welcome to present papers as they wish, here is an outline of a presentation that should help you get started.
- Start by stating the thesis or the goals of the paper, i.e., what is the paper trying to achieve.
- Next, state the major contributions of the work, i.e., what is new about the work.
- Briefly describe each contribution. Choose one (or two) contribution(s) that you think is most interesting or novel and explain it in some detail.
- If there are experiments in the paper that highlight the benefits of the work, present some of these results. Ideally, the results you show will focus on the contributions that you explained in detail.
- Next, present related work in the area, i.e., how is this work related to other projects or systems.
- Present your conclusions about the work, i.e., does the paper achieve what it set out to achieve.
If students use slides, please use at least a 20-24 point font for text. For a 15-minute presentation, do not use more than 15 slides, or else the presentation will appear rushed. Students are welcome to send slides to the instructor a week before the presentation to get additional help.
After the presentation the student is expected to lead a 20-30 minute in-depth discussion of the paper (the length of the discussion will depend on the number of students in the class). This discussion should aim to answer the following questions:
- What were the main contributions of the work?
- What were the advantages and disadvantages of the approach?
- How does it compare to other related work in the paper group?
- What are potential avenues for further work and improvements?
To aid in this discussion, each student presentation must end with a list of 5 specific questions that the student can ask other students and should be prepared to answer (the student should preferably have the answers at the end of the slides).
The answers to these questions should not be obvious, i.e., they should not be stated clearly in the paper. Instead, the questions should help in critical analysis of the paper. For example, suppose one of the stated contributions of the paper is that it "Enables secure peer-to-peer routing". One question might be: how secure is the routing and what strategy discussed in the paper makes it secure?
Choosing Papers for Presentation
After the first class in the term, each student should send mail to the instructor with a list of three or more papers that the student would like to present in class. The list of papers is available on the class web site and is broken down by subject and by the week in which each paper will be presented.
The paper choice is first-come, first-served, so it is in the student's interest to send mail to the instructor soon so that they get the first paper of their choice.
The reason for sending additional papers is to resolve conflicts: if a student doesn't get their first choice, their second choice can be assigned to them, and so on. It is better to send a long list than a short one, because if all the papers in the list are taken, the instructor will have to ask the student by mail for another set of papers. By the time the student sends the next mail, other students may have chosen many more papers! We are solving a little distributed consistency problem here!
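The first-come, first-served rule above can be sketched in a few lines of Python. This is only an illustration, not the instructor's actual procedure: the function name `assign_papers`, the student names and the data layout are all hypothetical, and the sketch assumes requests are processed strictly in the order the mails arrive.

```python
def assign_papers(requests, taken=()):
    """Give each student the first still-free paper on their ranked list.

    requests: list of (student, ranked_papers) pairs, in mail arrival order
              (hypothetical layout, for illustration only).
    taken:    papers that were already assigned before these requests.
    Returns (assignments, needs_new_list): a dict of student -> paper, and
    the students whose whole list was taken and who must send another list.
    """
    taken = set(taken)
    assignments = {}
    needs_new_list = []
    for student, ranked_papers in requests:
        for paper in ranked_papers:
            if paper not in taken:
                assignments[student] = paper
                taken.add(paper)
                break
        else:
            # Every paper on this student's list was already taken.
            needs_new_list.append(student)
    return assignments, needs_new_list


# Hypothetical students; a short list loses out to earlier requests.
requests = [
    ("student_a", ["Chord", "CFS", "Petal"]),
    ("student_b", ["Chord", "CFS"]),
    ("student_c", ["Chord"]),
]
assignments, needs_new_list = assign_papers(requests)
print(assignments)      # {'student_a': 'Chord', 'student_b': 'CFS'}
print(needs_new_list)   # ['student_c']
```

The third student listed only one paper, so an extra round of mail is needed, which is exactly why a longer list is safer.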
The papers are available under reading list. The first three papers in each group will generally be required reading while the later papers will generally be optional. The number of required readings in each group depends on the number of students in the class. The instructor will inform students which papers they should be choosing. However, it is best to choose the first three papers in a group as the first three choices in your list for your presentation. Also, papers that are already listed on the main class page (under each week) are already taken. Don't choose them. Since those papers are being presented, they are required reading.
Paper Reading List
Available under reading list.
Grading Policy
Grades will be based on the class presentation (including the questions prepared for the discussion), the class project, quizzes, and class participation and discussion. There will be no final exam in this course. There are no assignments for students who attend all classes. The grading breakdown is as follows:
- Class presentation: 30%
- Class project: 40%
- Quizzes: 20%
- Class participation: 10%
Note: If you are unable to attend a class, you will have to submit an assignment to me. Please see the quiz format below for more details.
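As a worked example of the breakdown above, here is a small Python sketch of how the components combine. It is an assumption-laden illustration, not an official formula: it assumes every component is scored out of 100, that there are four quiz slots worth 5% each (so 20% in total), and that an assignment submitted for a missed class simply fills the corresponding quiz slot. The function name `final_grade` is hypothetical.

```python
def final_grade(presentation, project, quiz_scores, participation):
    """Combine component scores (each assumed to be out of 100).

    Weights follow the syllabus: presentation 30%, project 40%,
    four quizzes at 5% each, participation 10%.
    """
    if len(quiz_scores) != 4:
        raise ValueError("expected four quiz (or make-up assignment) scores")
    # Integer percentage weights keep the arithmetic exact until the final division.
    weighted = (30 * presentation
                + 40 * project
                + 5 * sum(quiz_scores)
                + 10 * participation)
    return weighted / 100


# Hypothetical scores: 24 + 36 + 15 + 7 = 82
print(final_grade(80, 90, [100, 50, 100, 50], 70))  # 82.0
```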
Quiz Format
Too often, in a seminar class like this one, students do not read material or skip the presentations on topics other than the one they're scheduled to present. To discourage this attitude, the instructor will conduct four short quizzes during the semester. Each quiz will count 5% towards the final grade.
Here are some of the salient features of these quizzes:
- These quizzes will be in class, and will last less than 10 minutes.
- The format of the quiz will be about ten questions requiring one or two word answers.
- The schedule of these quizzes will not be pre-assigned.
- For each topic, the quizzes will be based on the papers presented in the class (not the optional papers).
- You should be current with the readings up to and including the date of the quiz.
If you are unable to attend a class, you should submit an assignment to me that summarizes the papers that will be presented in class that week. The summary for each paper should be one paragraph long, and it should state the topic of the paper, the contributions or the novel ideas in the paper and the results of the paper. Do not write more than 3-5 sentences per paragraph.
You should submit the assignment to me by email in a text file (not a Word or PDF file) before the end of class, i.e., I should get the mail by 6:00 pm Thursday. If there is no quiz, then I will ignore the assignment. However, if there is a quiz, I will read your assignment. Each such assignment will be worth the same as a quiz, i.e., 5%.
Project Format
A major component of this course is devoted to a term-long project. The topic of the final project is largely up to you, but to help you choose a project, a list of projects is described below. These projects should help students determine whether their own projects are of reasonable size and scope.
The goal of the project is to encourage students to explore some aspect of distributed systems in detail. Some guidelines for choosing a project are: 1) the work should be in an area related to distributed systems (e.g., look at the topics for each week), 2) the work should be completed in less than three months, and 3) talk to the instructor and get a verbal agreement about a project before committing to it.
Students have two project options: 1) design and implementation of a system, or 2) writing a position paper. For the implementation option, 2-4 students should collaborate on the project. Make sure that the project is structured so that you can evaluate the system quantitatively. This option has the deliverables described below. Each of these deliverables is per-project (and not per-student). Note that each future deliverable contains much of the contents of the previous deliverables.
- Project Description: 1 page (Due Oct 2, 2003)
- Purpose of the project
- Expected outcome or result of the project
- Three or more intermediate steps in the project
- Status Report: 3-4 pages (Due Oct 30, 2003)
- Purpose of the project
- Expected outcome or result of the project
- Background research with bibliography of relevant research
- Research methodology or approach taken in the project
- Status of implementation
- Experiments that will be performed
- Final Report: 8-10 pages (Due Dec 4, 2003)
- Purpose of the project
- Expected outcome or result of the project
- Background research with bibliography of relevant research
- Details of the research methodology or approach taken in the project
- Status of implementation
- Evaluation results
- Conclusion: did your results meet expectations
- Future work
- Code
The second option, the position paper option, is for individuals. Students should pick an area of distributed systems such as the topics discussed each week. First, they should conduct detailed background research and cover as much literature as possible. Then they should compare the approaches and discuss the benefits or drawbacks of each. Finally they should come up with their "position". Your position should be a novel statement based on solid background research and sound judgement that you articulate clearly. Your position should not be obvious from the papers or background research. In other words, the position paper option encourages research (and not just a survey of previous work). Since there is no implementation with this option, the grading will be stricter regarding the quality of the final report and the novelty of your idea.
There are three main differences in the deliverables with this option compared to the implementation option: 1) since implementation and evaluation will not exist, you don't have to include them, 2) the background research should be more thorough, and 3) the focus of the paper should be on the details of your approach, which should clearly justify your position, i.e., your novel statement. Think of this option as a proposal for your research. If you are already conducting research in an area that is somewhat related to distributed systems, this option is a great way to force yourself to put your thoughts clearly on paper. If you are not conducting research yet, it will help you get started.
Based on the number of students in the class, the instructor will decide later whether there will be short project presentations.
Project Suggestions
Available under project suggestions.
Projects
Available under projects.
Weekly Reading List
Introduction
Introduction by Instructor
Presentation
Efficient Reading of Papers in Science and Technology
Michael J. Hanson, Dylan J. McNamee
Paper
How (and How Not) to Write a Good Systems Paper
Roy Levin, David D. Redell, Operating Systems Review 17(3), July 1983.
Paper Additional Material
Fault Tolerance
Understanding Fault-Tolerant Distributed Systems
Flavin Cristian, CACM Feb 1991
Paper
Presenter: Jason Yuen
Exploring Failure Transparency and the Limits of Generic Recovery
David E. Lowell, Subhachandra Chandra, Peter M. Chen, OSDI 2000
Paper
Presenter: Ivan Matosevic
Myriad: Cost-effective Disaster Tolerance
Fay Chang, Minwen Ji, Shun-Tak A. Leung, John MacCormick, Sharon E. Perl, Li Zhang, FAST 2002
Paper
Presenter: Charles Zhang
Security and Denial of Service
Practical Network Support for IP Traceback
Stefan Savage, David Wetherall, Anna Karlin, Tom Anderson, SIGCOMM 2000
Paper
Presenter: Aron Brener
Backtracking Intrusions
Samuel T. King, Peter M. Chen, SOSP 2003
Paper
Presenter: Idon Wong
Terra: A Virtual-Machine Based Platform for Trusted Computing
Tal Garfinkel, Ben Pfaff, Jim Chow, Mendel Rosenblum, Dan Boneh, SOSP 2003
Paper
Presenter: Levon Stepanian
Naming Schemes
Active Names: Flexible Location and Transport of Wide-Area Resources
Amin Vahdat, Michael Dahlin, Thomas Anderson, Amit Aggarwal, USITS 1999
Paper
Presenter: Andrés Lagar Cavilla
On the Effectiveness of DNS-based Server Selection
Anees Shaikh, Renu Tewari, Mukesh Agrawal, INFOCOM 2001
Paper
Presenter: Katherine Lam
Instructor's notes:
Naming, Location Services and Binding
Distributed File Systems
The Google File System
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung, SOSP 2003
Paper
Presenter: Borys Bradel
Petal: Distributed Virtual Disks
Edward K. Lee, Chandramohan A. Thekkath, ASPLOS 1996
Paper
Presenter: Kurniadi Asrigo
Routing
Enabling Conferencing Applications on the Internet using an Overlay Multicast Architecture
Yang-Hua Chu, Sanjay G. Rao, Srinivasan Seshan, Hui Zhang, SIGCOMM 2001
Paper
Presenter: Ali Tizghadam
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan, SIGCOMM 2001
Paper
Presenter: Martin Labrecque
P2P Storage
Protecting Free Expression Online with Freenet
Ian Clarke, Theodore W. Hong, Scott G. Miller, Oskar Sandberg, and Brandon Wiley, IEEE Internet Computing 2002
Paper
Presenter: Kamran Farhadi
Wide-Area Cooperative Storage With CFS
Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, SOSP 2001
Paper
Presenter: Catalin Drula
P2P Search and Applications
Querying the Internet with PIER
Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo, Scott Shenker, Ion Stoica, VLDB 2003
Paper
Presenter: Henry Luk
Distributed Query Processing and Catalogs for Peer-to-Peer Systems
Vassilis Papadimos, David Maier, Kristin Tufte, CIDR 2003
Paper
Presenter: Taimur Javed
Web Caching and Content Delivery Networks
Internet Indirection Infrastructure
Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Surana, SIGCOMM 2002
Paper
Presenter: Kai Yi Kenneth Po
FastReplica: Efficient Large File Distribution Within Content Delivery Networks
Ludmila Cherkasova, Jangwon Lee, USITS 2003
Paper
Presenter: Gokul Soundararajan
Cluster-based Computing and Scalable Internet Services
Cluster-Based Scalable Network Services
Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier, SOSP 1997
Paper
Presenter: Matt Medland
Capriccio: Scalable Threads for Internet Services
Rob von Behren, Jeremy Condit, Feng Zhou, George C. Necula, Eric Brewer, SOSP 2003
Paper
Presenter: Kirk Stewart
Replication and Grid Computing
Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
D. B. Terry, M. M. Theimer, Karin Petersen, A. J. Demers, M. J. Spreitzer, C. H. Hauser, SOSP 1995
Paper
Presenter: Stanley Fung
The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration
I. Foster, C. Kesselman, J. Nick, S. Tuecke, GGF 2002 (Global Grid Forum)
Paper
Presenter: HungJu Tze
Instructor's notes:
Recovery in databases with undo and redo logging
Borrowed from computer science course at Duke University (CPS 216)
Sensor Networks
System Architecture Directions for Networked Sensors
Jason Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David Culler, Kristofer Pister, ASPLOS 2000
Paper
Presenter: Ramy Farha
Wireless Sensor Networks for Habitat Monitoring
Alan Mainwaring, Joseph Polastre, Robert Szewczyk, David Culler, and John Anderson, WSNA 2002 (Wireless Sensor Networks and Applications)
Paper
Presenter: Alex Cheung
Games
An Efficient Synchronization Mechanism for Mirrored Game Architectures
Eric Cronin, Burton Filstrup, Anthony R. Kurc, and Sugih Jamin, NetGames 2002
Paper
Presenter: Daniel Lin
The Effect of Latency on User Performance in Warcraft III
Nathan Sheldon, Eric Girard, Seth Borg, Mark Claypool, Emmanuel Agu, NetGames 2003
Paper
Presenter: Peter Yiannacouras