A tentative list of the papers we will cover appears below, though
it may change based on the interests of the class.
All students are required to read each paper before it is presented.
Grades will be based on demonstrated understanding of the material
and on contributions to class discussions. During those discussions,
students will be asked to explain various aspects of the papers.
September 10 - Course Overview and Background
- Roy Levin and David D. Redell, "An Evaluation of the Ninth SOSP Submissions", Operating Systems Review, 17(3), July 1983, pp. 35-40.
- Alan Jay Smith, "The Task of the Referee", IEEE Computer, 23(4), April 1990, pp. 65-71.
September 17 - Virtualization
- Christoffer Dall and Jason Nieh, "KVM/ARM: The Design and Implementation of the Linux ARM Hypervisor", Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Salt Lake City, UT, March 2014.
- Steven Osman, Dinesh Subhraveti, Gong Su, and Jason Nieh, "The Design and Implementation of Zap: A System for Migrating Computing Environments", Proceedings of the 5th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002.
September 24 - Orchestration
- Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, Eric Tune, and John Wilkes, "Large-scale Cluster Management at Google with Borg", Proceedings of the 10th European Conference on Computer Systems (EuroSys), Bordeaux, France, April 2015.
- Chunqiang Tang, Kenny Yu, Kaushik Veeraraghavan, Jonathan Kaldor, Scott Michelson, Thawan Kooburat, Aravind Anbudurai, Matthew Clark, Kabir Gogia, Long Cheng, Ben Christensen, Alex Gartrell, Maxim Khutornenko, Sachin Kulkarni, Marcin Pawlowski, Tuomas Pelkonen, Andre Rodrigues, Rounak Tibrewal, Vaishnavi Venkatesan, and Peter Zhang, "Twine: A Unified Cluster Management System for Shared Infrastructure", Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Virtual, November 2020.
October 1 - No class
October 8 - File Systems and Storage
- Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System", Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP), Bolton Landing, NY USA, October 2003.
- Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, "Bigtable: A Distributed Storage System for Structured Data", Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Seattle, WA USA, November 2006.
October 15 - Databases
- Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels, "Dynamo: Amazon's Highly Available Key-value Store", Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP), Stevenson, WA USA, October 2007.
- James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford, "Spanner: Google's Globally-Distributed Database", Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Hollywood, CA USA, October 2012.
October 22 - Caching
- Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani, "Scaling Memcache at Facebook", Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Lombard, IL USA, April 2013.
- Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani, "TAO: Facebook's Distributed Data Store for the Social Graph", Proceedings of the 2013 USENIX Annual Technical Conference (USENIX ATC), San Jose, CA USA, June 2013.
October 29 - Midterm Project Presentations
November 5 - TPUs and GPUs
- Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeff Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Daniel Hurt, Julian Ibarz, Arjun Jaffey, Alek Jaworski, Aaron Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, David Le, Carole Leary, Zhuyao Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kshitij Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Amir Salek, Emery Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Steinberg, Ambuj Sukhwani, Matt Swett, Alice Thorson, Bojian Tian, Greg Toma, Erik Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, Dave H. Yoon, "In-Datacenter Performance Analysis of a Tensor Processing Unit", Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON Canada, June 2017.
- Zhe Jia, Marco Maggioni, Benjamin Staiger, Daniele Paolo Scarpazza, "Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking", Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Belfast, Northern Ireland UK, April 2018.
November 12 - Scaling Models
- Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Mostofa Patwary, Vijay Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei Zaharia, "Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), St. Louis, MO USA, November 2021.
- Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He, "ZeRO: Memory Optimizations Toward Training Trillion Parameter Models", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA USA, November 2020.
November 19 - AI Inference Serving
- Gyeong-In Yu, Joo Seong Jeong, Geon-Woo Kim, Soojeong Kim, Byung-Gon Chun, "Orca: A Distributed Serving System for Transformer-Based Generative Models", Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Carlsbad, CA USA, July 2022.
- Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica, "Efficient Memory Management for Large Language Model Serving with PagedAttention", Proceedings of the 29th ACM Symposium on Operating Systems Principles (SOSP), Koblenz, Germany, October 2023.
November 26 - No class
December 3 - Optimizing AI Inference
- Yaniv Leviathan, Matan Kalman, Yossi Matias, "Fast Inference from Transformers via Speculative Decoding", Proceedings of the 40th International Conference on Machine Learning (ICML), Honolulu, HI USA, July 2023.
- Jiaming Huang, Yiming Zhang, Zhaodong Zhu, Yibo Zhu, Chuan Wu, "DiffKV: Differentiated Memory Management for Large Language Models with Parallel KV Cache Compaction", Proceedings of the 31st ACM Symposium on Operating Systems Principles (SOSP), Seoul, South Korea, October 2025.
December 10 - Final Project Presentations