Instructor: |
Cristiana Amza |
TA: |
Arnamoy Bhattacharyya |
Location: |
GB 120 |
Class Time: |
Fridays 3:00-5:00 PM |
Project List: |
|
Office Hours: |
Fridays 1:00-3:00 PM (BA 4142) |
- Sep. 16, 2015: Website Launched.
This course is an intermediate graduate course in the area of
parallel programming. In the first part of the course we will briefly
introduce the architecture of parallel systems and the concept of data
dependencies/races. The three most commonly used parallel programming
paradigms (shared memory, distributed memory and data parallel) will then
be examined in detail. An overview of automatic parallelization of
programs and the use of parallel processing in related domains such as
parallel and distributed database transaction processing will also be
given.
In the second part of the course selected research topics will be
examined. This part of the course consists of student-led discussions
of relevant research papers. A research-intensive group project in
an area related to program parallelization is a fundamental part of the
course. The projects can be done individually or in small teams
of two or three people. The project outcome will be presented in a class
session at the end of the semester. A list of suggested research
projects has been posted (project_suggestions.txt). Students are also encouraged to propose their own projects and discuss them with me. Please also read:
Class Goals and Advice from Instructor
There is no required textbook for the class. You should be fine with the lectures and papers posted on this site. However, here are some suggestions for additional reading:
Parallel Programming in C with MPI and OpenMP
by Michael J. Quinn
Threads Primer: A Guide to Multithreaded Programming
by Bil Lewis, Daniel J. Berg
Concurrency Control and Recovery in Database Systems
by Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (a free online edition is available as a PDF from http://research.microsoft.com/pubs/ccontrol/)
A basic understanding of operating system principles and computer
architecture, along with some knowledge of network programming, will be
helpful. These are not strict prerequisites, though; most of the necessary material will be covered in class.
Sample project report: Link to Sample Project Report.
Pthread program examples have been posted here: Code Examples.
Please also consult this pthread how-to guide and this pthread reference manual.
Date |
Topic |
|
Assignment |
|
Sep 18 |
Intro and project suggestion Slides-part1 Slides-part2 |
|
|
|
Sep 25 |
Parallel Programming and Optimizations Pthreads OpenMP Slides |
|
|
|
Oct 2 |
Parallel Programming and Optimizations Project ideas: TM/Games |
1. Locality Aware Dynamic Load Management for Massively Multiplayer Games, Jin Chen, Baohua Wu, Margaret Delap, Bjorn Knutsson, Honghui Lu and Cristiana Amza, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2005), June 2005. Chen Zhou, Ao Wan |
Paper summaries |
|
Oct 9 |
Lock Synchronization and Optimization
|
Informal oral project proposals.
2. Parallelization and Performance of Interactive Multiplayer Game Servers, Ahmed Abdelkhalek and Angelos Bilas. In Proc. of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), April 2004. Timothy Sham, Shahriar Ninad
3. Donnybrook: Enabling Large-Scale, High-Speed, Peer-to-peer Games, Ashwin Bharambe, John R. Douceur, Jacob R. Lorch, Thomas Moscibroda, Jeffrey Pang, Srinivasan Seshan, and Xinyu Zhuang, SIGCOMM, 2008. Ding Zhu, Kirk Rodrigues |
Proposal & Paper summaries |
|
Oct 16 |
Distributed Applications and Environments |
Informal oral project proposals (cont'd).
4. Algorithms for scalable synchronization on shared-memory multiprocessors, John M. Mellor-Crummey and Michael L. Scott. ACM Transactions on Computer Systems, 9 (1):21-65, February 1991. Xin Zhuang, Haipei Song Yingjian Liu, Zhehui Zhou |
Proposal & Paper summaries |
|
Oct 23 |
Software Distributed Shared Memory |
6. Memory Coherence in Shared Virtual Memory Systems, Kai Li, Paul Hudak, 1991 (ivy91.pdf). Ramy Shahin
7. Implementation and Performance of Munin, John Carter, John Bennett, and Willy Zwaenepoel. Xinyi Lin, Yi Li
8. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems, P. Keleher, A.L. Cox, S. Dwarkadas and W. Zwaenepoel, OSDI '94. Yuqing Du
Short Introduction to Event-Driven Servers by Instructor |
Paper summaries |
|
Oct 30 |
Programming Paradigms for new Environments: OpenMP on Clusters, CUDA, and OpenCL for GPUs |
9. OpenMP for Networks of SMPs, Y.C. Hu, H. Lu, A.L. Cox, and W. Zwaenepoel, Journal of Parallel and Distributed Computing, vol. 60 (12), pp. 1512-1530, December 2000. Peiwen Zhong, Jing Wang
10. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming, Du Peng, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory Peterson, and Jack Dongarra, Parallel Computing 38, no. 8 (2012): 391-407. Di Wu, Dade Sheng |
Paper summaries |
|
Nov 6 |
Multithreading vs Event-Driven model for Server Code
|
11. Fractal video compression in OpenCL: An evaluation of CPUs, GPUs, and FPGAs as acceleration platforms, Doris Chen and Deshanand Singh, IEEE Xplore, 2013. Shane O'Connell
12. Productivity of GPUs Under Different Programming Paradigms, Maria Malik, Teng Li, Umar Sharif, Rabia Shahid, Tarek El-Ghazawi, Greg Newby, Concurrency and Computation: Practice and Experience, 2012. Xi Chen, Wei Feng
13b. Flash: An Efficient and Portable Web Server, Vivek S. Pai, Peter Druschel, Willy Zwaenepoel, USENIX Annual Technical Conference, 1999. Xiongbin Zhao, Aiping Xiao
|
Paper summaries |
|
Nov 13 |
Advanced Synchronization Mechanisms in Multiprocessor and Distributed Systems |
14. SEDA: An Architecture for Well-Conditioned, Scalable Internet Services, presented at the Eighteenth Symposium on Operating Systems Principles (SOSP'01), Lake Louise, Canada, October 24, 2001. Yi Xin Wang
15. Adaptive Overload Control for Busy Internet Servers, Matt Welsh and David Culler. In Proceedings of the 4th USENIX Conference on Internet Technologies and Systems (USITS'03), March 2003. Victoria Odeyemi, Ishtiaque Latif
16. Lazy Asynchronous I/O for Event Driven Servers, Elmeleegy, Anupam Chanda, Alan L. Cox and Willy Zwaenepoel, in Proceedings of the USENIX 2004 Annual Technical Conference. Chong Zhu, Fangzai Hong |
Paper summaries |
|
Nov 20 |
Nonblocking Synchronization and TM |
17. Code Transformations to Improve Memory Parallelism, Vijay S. Pai and Sarita Adve. Dai Tian, He Dai
18. Read Copy Update: Using Execution History to Solve Concurrency Problems, McKenney P. and Slingwine J. Serguei Makarov, Teresa Lo
19. Software Transactional Memory for Dynamic-Sized Data Structures, Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III. PODC 2003. Peter Yi Ping Sun, Jin Xu |
Paper summaries |
|
Nov 27 |
Nonblocking Synchronization and TM |
20. McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime, Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Chi Cao Minh, Benjamin Hertzberg. PPoPP 2006. Shehbaz Jaffer, Qiannan Zhao
21. Exploiting Distributed Version Consistency in a Transactional Memory Cluster, Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza. PPoPP 2006. Xiang Ren |
Paper summaries |
|
Dec 4 |
Transactional Memory |
22. A Case for Staged Database Systems, Stavros Harizopoulos and Anastassia Ailamaki. Jingya Wang
23. Transactional Memory Support for Scalable and Transparent Parallelization of Multiplayer Games, Daniel Lupei, Bogdan Simion, Don Pinto, Matthew Misler, Mihai Burcea, William Krick and Cristiana Amza. Yi Ding, Le Deng
24. Scheduling Support for Transactional Memory Contention Management, Maldonado et al. PPoPP 2010. Arnamoy B |
Paper summaries | |
Dec 17
|
Project class presentation |
|
|
|
Dec 23 |
Final project report due (by e-mail to me, just the report please) |