Instructor: Jiarong Xing, Office hours: Mon after class, DCH 2099
TA: Yuning Xia, Office hours: Wed 6-7 pm, DCH 3070, Email: yx87@rice.edu
Lectures: 4:00-5:15 pm, Monday & Wednesday
Location: DCH 1042
Piazza: https://piazza.com/rice/fall2025/comp436536 (contact the instructor for the access code)
What is cloud computing, and how can we build cloud-scale services that stay performant and secure? This course takes a holistic tour of modern cloud infrastructure, beginning with foundational principles and progressing to cutting-edge topics that appear in today’s datacenters. The topics we will cover are listed in the schedule below.
Prerequisite: COMP 321 (Introduction to Computer Systems) or equivalent.
The class meets twice per week. There will be regular homework assignments and a course project that requires significant hands-on implementation, experimental validation, and a written report.
Homework will be posted on Piazza. Submit your solutions on Canvas before the deadline.
Students with a documented disability needing academic adjustments or accommodations are encouraged to contact the instructor and Disability Support Services (Allen Center, Room 111).
Date | Topic | Details | Reading | Remarks |
---|---|---|---|---|
8/25/2025, Mon | Introduction | Principles of building systems; Course overview | Lampson: Hints for computer systems design | |
8/27/2025, Wed | The Cloud | Cloud applications; Datacenters; Web vs. cloud vs. cluster | Armbrust et al.: A view of cloud computing | |
LABOR DAY | | | | |
9/3/2025, Wed | Virtualization & containerization | VMs & containers | Barham et al.: Xen and the Art of Virtualization | HW1 online |
9/8/2025, Mon | Networking basics | Datacenter networks | Pluribus Networks: Trends in Data Center Networking: Past to Future | |
9/10/2025, Wed | Software-defined networks | SDNs | Feamster et al.: The Road to SDN: An Intellectual History of Programmable Networks | |
9/15/2025, Mon | Programmable switches & P4 | Hardware & language | Bosshart et al.: P4: Programming Protocol-Independent Packet Processors | HW1 due |
9/17/2025, Wed | Hands-on P4 lab | | | HW2 online |
9/22/2025, Mon | End-host networking | RDMA & eBPF | Mitchell et al.: Using One-Sided RDMA Reads to Build a Fast, CPU-Efficient Key-Value Store | |
9/24/2025, Wed | GPUs & AI datacenters | | Emmanuel Ohiri: A beginner's guide to NVIDIA GPUs | |
9/29/2025, Mon | LLM inference 1 | Basics & Batching | Yu et al.: Orca: A Distributed Serving System for Transformer-Based Generative Models | |
10/1/2025, Wed | LLM inference 2 | Memory management | Kwon et al.: Efficient Memory Management for Large Language Model Serving with PagedAttention | |
10/6/2025, Mon | LLM inference 3 | Scheduling | Agrawal et al.: Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve | |
10/8/2025, Wed | Project announcement | | | Quiz 1 |
MIDTERM RECESS | | | | |
10/15/2025, Wed | LLM inference 4 | Parallelisms & Other optimizations | Alex McKinney: A Brief Overview of Parallelism Strategies in Deep Learning | HW2 due |
10/20/2025, Mon | LLM training | Communication & checkpointing | Wang et al.: GEMINI: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints | |
10/22/2025, Wed | Serverless | Serverless computing/LLMs | Jonas et al.: A Berkeley View on Serverless Computing | |
10/27/2025, Mon | Cloud Storage | Key value stores; Concurrency control & DynamoDB | | |
10/29/2025, Wed | Concurrency | Consistency models; Synchronization & Deadlocks | Werner Vogels: Eventually Consistent | |
11/3/2025, Mon | Faults and Failures | Internet basics & Byzantine faults; Handling failures & Correlated failures | A. Yigit Ogun: Byzantine Generals Problem | |
11/5/2025, Wed | MapReduce | Programming model & Hadoop | Dean et al.: MapReduce: Simplified Data Processing on Large Clusters | |
11/10/2025, Mon | Security basics | Crypto basics & Attacks | | |
11/12/2025, Wed | Denial of service | Smurf attacks & DDoS & Botnets | Antonakakis et al.: Understanding the Mirai Botnet | |
11/17/2025, Mon | Routing security | BGP & Prefix hijacking | Nordstrom et al.: Beware of BGP Attacks | |
11/19/2025, Wed | Side-channel attacks | | Harnik et al.: Side Channels in Cloud Services | |
11/24/2025, Mon | The future of cloud | Edge cloud & cloud-edge | Azure: What is edge computing? | Quiz 2 |
THANKSGIVING RECESS | | | | |
12/1/2025, Mon | Project presentations | | | |
12/3/2025, Wed | Project presentations | | | |