ECE 1747H Parallel Programming

Instructor: Cristiana Amza
Pratt 484E
(416) 946-0299
amza@eecg.toronto.edu
Location: RS 310
Class time: Monday 4:00-6:00 PM
Project List: project_suggestions.txt
Office Hours: TBA

Class Project Description 


Overview

This course is an intermediate graduate course in the area of parallel programming. In the first part of the course we will briefly introduce the architecture of parallel systems and the concept of data dependencies/races. The three most commonly used parallel programming paradigms (shared memory, distributed memory and data parallel) will  then be examined in detail. An overview of automatic parallelization of programs and the use of parallel processing in related domains such as parallel and distributed database transaction processing will also be given. 
In the second part of the course selected research topics will be examined. This part of the course consists of student-lead discussions of relevant research papers.  A research-intensive group project in an area related to program parallelization is a fundamental part of the course.  The projects can be done individually or in small teams of two or three people. The project outcome will be presented in a class session at the end of the semester. A list of suggested research projects has been posted (project_suggestions.txt). Students are also encouraged to propose their own projects and discuss them with me. Please also read:
Class Goals and Advice from Instructor 

Grading

Textbooks and Pre-requisites

There is no required textbook for the class. You should be fine with the lectures and papers posted on this site. However, here are some suggestions for additional reading:

Parallel Programming in C with MPI and OpenMP
by Michael J. Quinn

Threads Primer: A Guide to Multithreaded Programming
by Bil Lewis, Daniel J. Berg

Concurrency Control and Recovery in Database Systems
by Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (free on-line edition you can download from http://research.microsoft.com/pubs/ccontrol/ in .pdf)

It would be good for you if you had basic understanding of operating system principles, basic architecture and some knowledge of network programming. These are not strict pre-requisites though, most of the necessary material will be covered in class.

Programming Assignments

The first programming assignment is out: Assignment 1. Pthread program examples have been posted here: Code Examples. Please also consult this pthread How to use guide and this pthread reference manual

MPI program examples have been posted here: Code Examples. Please also consult this MPI How to use guide  

Schedule

Date Topic Reading Assignment
Due Dates
Handouts
Sep 10 Synchronization & Programming Models: Pthreads Slides-part1 Slides-part2      
Sep 17 Programming Models (contd) Shared Memory Programming & MPI: Pthreads Slides MPI Slides      
Sep 21   List of group members,

Project discussions - come by my office Pratt 484E 11 am - 6 p.m.

Sep 24 Automatic Parallelization and Optimization OpenMP.
 
Oct 1

Lock Synchronization and Optimization

John M. Mellor-Crummey and Michael L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Transactions on Computer Systems, 9 (1):21-65, February 1991. tocs91.pdf Cedomir Segulja 

Ahmed Abdelkhalek and Angelos Bilas. Parallelization and performance of interactive multiplayer game servers. In Proc. of the 18th International Parallel and Distributed Processing Symposium      
(IPDPS2004), April 2004  quake-parallel-ipdps04.pdf Bogdan Simion

Short Introduction to Distributed Shared Memory by Instructor

Informal oral project proposal

Paper summaries

 
Oct 8 Thanksgiving Holiday  No class
 
Oct 15 Distributed and Parallel Applications and Environments

Fundamentals of Grid Computing, IBM    http://www.redbooks.ibm.com/redpapers/pdfs/redp3613.pdf  Robert Ma, Tony Chau

General Purpose Computations and Parallel Programming Paradigms on Graphics Processing Units  graphics.pdf  Christian Lessig 

Kai Li, Paul Hudak Memory Coherence in Shared Virtual Memory Systems, 1991 ivy.pdf  Michael Kipper

Paper summaries

 
Oct 22 Software Distributed Shared Memory

John Carter, John Bennett, and Willy Zwaenepoel Implementation and Performance of Munin munin91.pdf  David Han

TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems, P. Keleher, A.L. Cox, S. Dwarkadas and W. Zwaenepoel, OSDI '94.  treadmarks94.pdf   David Goldman   

OpenMP for Networks of SMPs, Y.C. Hu, H. Lu, A.L. Cox, and W. Zwaenepoel, Journal of Parallel and Distributed Computing, vol. 60 (12), pp. 1512-1530, December 2000  
openmpsmp.pdf  Adam Czajkowski 

Short Introduction to Event-Driven Servers by Instructor

Paper summaries
Oct 29  Multithreading vs Event-Driven model for Server Code

Flash: An Efficient and Portable Web Server, Vivek S. Pai, Peter Druschel, Willy Zwaenepoel, USENIX Annual Technical Conference, 1999. flash.pdf   Alireza Heidarbarghi

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services. Presented at the Eighteenth Symposium on Operating Systems Principles (SOSP'01), Lake Louise, Canada, October 24, 2001. seda-sosp01.pdf   Raymond Lo

Papers summaries  
Nov 5 Advanced Synchronization Mechanisms in Multiprocessor and Distributed Systems

 

Lazy Asynchronous I/O for Event Driven Servers. Khaled Elmeleegy, Anupam Chanda, Alan L. Cox and Willy Zwaenepoel, in Proceedings of the USENIX 2004 Annual Technical Conference. laio04.pdf   Mark Teper   

Adaptive Overload Control for Busy Internet Servers, Matt Welsh and David Culler. In Proceedings of the 4th USENIX Conference on Internet Technologies and Systems (USITS'03), March 2003.  seda-usits03.pdf   Zhen Huang

Locality-Aware Request Distribution in Cluster-based Network Servers. Vivek S.Pai, Mohit Aron, Gaurav Banga, Michael Svendsern, Peter Druschel, Willy Zwaenepoel, Erich Nahum (ASPLOS'08).  lard.pdf   Bryce Leung

Short Introduction to Transactions and Novel Synchronization Paradigms based on them Transactions by Instructor 

Paper summaries

 
Nov 12 Optimizing for Underlying Parallel/Distributed Architecture Code Transformations to Improve Memory Parallelism Vijay S. Pai and Sarita Adve pai_micro99.pdf  Ben Hur

Data Replication Strategies for Fault Tolerance and Availability on Commodity Clusters  C. Amza, A. Cox, W. Zwaenepoel.
Proceedings of the International Conference on Dependable Systems and Networks 
(DSN), June 2000 vistareplication.pdf  Yong Zeng

Distributed Versioning: Consistent Replication for Scaling Back-end Databases of Dynamic Content Web Sites C. Amza, A. Cox, W. Zwaenepoel.
Proceedings of the ACM/IFIP/Usenix Middleware Conference, June 2003 dversioning.pdf  Weihan Wang, Lee Chew    

Paper summaries
Nov 19 Concurrent and Distributed Transaction Processing Systems

A Case for Staged Database Systems. Stavros Harizopoulos and Anastassia Ailamaki  stages.pdf  Ming Tse 

M. M. Michael and Michael L. Scott. Nonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors. JPDC (1998) Journal of Parallel and Distributed Computing 51(1):1-26, May 1998.
jpdc98.pdf   Daniel Lupei

STEPS Towards Cache-Resident Transaction Processing, Stavros Harizopoulos and Anastassia Ailamaki. Proceedings of the 30th International Conference on Very Large Data Bases, Toronto, Canada, August 2004.
steps.pdf   Aron Roth, Peter Michaels

Paper summaries

Informal written (one page) project progress report



Nov 26 Putting it All Together:

Nonblocking Synchronization, Transactional Memory

 

Exploring Thread-Level Speculation in Software: The Effects of Memory
Access Tracking Granularity
Spiros Papadimitriou, Todd Mowry, 2001 swdsm_tls.pdf  Eric LaForest

Read Copy Update: Using Execution History to Solve Concurrency Problems. Mc Kenney P. and Slingwine J. rcu.pdf  Eric Tran

Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III: Software transactional memory for dynamic-sized data structures. PODC 2003 podc-moir.pdf  Jeremy Cutler

Paper summaries
Dec 3

 

Transactional Memory

 

Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza.Exploiting Distributed Version Consistency in a Transactional Memory Cluster. PPoPP 2006 dtm.pdf  Adrian Popescu

Sanjeev Kumar, Michael Chu, Christopher J. Hughes, Partha Kundu, Anthony Nguyen. Hybrid Transactional Memory. PPoPP 2006 hybridtm.pdf  Lyndon Carvalho

Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Chi Cao Minh, Benjamin Hertzberg. McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime. PPoPP 2006 mcrt.pdf  William Lo

Paper summaries
Dec ? Project class presentation

Dec 17 Final project report due