Below are some projects that our group is actively working on. If you are a Ph.D. student, email Sanidhya.

Our group also has some specific semester and optional projects for students at EPFL.

Transient Operating System Design

The main goal of this big project is to dynamically modify various OS subsystems to cater to heterogeneous hardware and varying application requirements. Most prior work focuses on IO, while our focus is mostly on the concurrency aspect. In particular, we are exploring how applications can fine-tune concurrency control mechanisms and the underlying stack to improve their performance. Some of the projects are as follows:

  1. A concurrency control runtime to efficiently switch between locks at various granularities.
  2. A new low-level language to support lock design while ensuring lock properties, such as mutual exclusion, starvation avoidance, and fairness.
  3. A lightweight hypervisor that caters to various forms of virtualization, from bare-metal to serverless.
  4. Re-architecting the OS for microsecond-scale IO.

We will further extend this project to reason about the concurrency and consistency of data structures.

Scalable Storage Stack

IO devices are now so blazing fast that saturating them is becoming a difficult task. Unfortunately, the current OS stack, still designed around assumptions from the early 2000s, is the major bottleneck. As a part of this big project, we are looking at ways to redesign the OS stack to support fast storage devices, including new ways to improve the design of file systems. Some of the projects are as follows:

  1. Designing new techniques to saturate and scale operations for various storage media.
  2. Understanding the implications of storage-class memory compared with traditional storage media, such as SSDs.
  3. Designing new storage engines for upcoming storage media, such as ZNS SSDs.
  4. Offloading file system stack to computational SSDs.

Concurrency Primitives and Frameworks

Given our particular interest in designing new synchronization primitives and concurrency frameworks, we are designing primitives that further squeeze performance out of hardware in two scenarios: heterogeneous hardware (such as big.LITTLE architectures and high-bandwidth memory) and rack-scale systems. We are also revisiting some existing primitives and reasoning about their practicality. Some of the ongoing projects are as follows:

  1. Revisiting the design of locking primitives for very large multicore machines.
  2. Redesigning concurrency primitives for microsecond-scale applications in a rack-scale environment.
  3. Reasoning about various bugs in a concurrent environment.

Projects for Bachelor's and Master's students

[Scalable OS] Continuous lock switching across the stack

Applications are becoming increasingly complex and are being deployed on heterogeneous hardware (NUMA, AMP, etc.). However, in current systems the locking mechanism remains static. SynCord is the first framework to support dynamic lock switching inside the kernel. This project extends the framework to userspace applications; the end goal is a holistic framework to switch any lock across userspace and kernel space.
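
The switching idea can be sketched in userspace Python. Everything here (class and method names, the quiescent-point scheme) is illustrative, not SynCord's actual API:

```python
import threading

class SwitchableLock:
    """Sketch of a lock whose implementation can be swapped at runtime
    (hypothetical, loosely inspired by SynCord-style switching)."""

    def __init__(self, impl):
        self._impl = impl                      # current lock implementation
        self._held = threading.local()         # which impl this thread holds
        self._switch_guard = threading.Lock()

    def acquire(self):
        # Remember which implementation we acquired so release() matches it
        # even if a switch happens while we hold the lock.
        impl = self._impl
        impl.acquire()
        self._held.impl = impl

    def release(self):
        self._held.impl.release()

    def switch(self, new_impl):
        # Quiescent-point switch: hold the old lock so no critical section
        # is active, then install the new implementation. A production
        # runtime would need an epoch/RCU scheme to handle racing acquirers.
        with self._switch_guard:
            old = self._impl
            old.acquire()
            self._impl = new_impl
            old.release()

# Usage: start with a plain mutex, later swap in another implementation
# (e.g., a NUMA-aware lock) without changing call sites.
lock = SwitchableLock(threading.Lock())
lock.acquire(); counter = 1; lock.release()
lock.switch(threading.Lock())
lock.acquire(); counter += 1; lock.release()
```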

In this project, you will:

  • Implement dynamic lock switching for userspace applications.
  • Implement a mechanism to enforce lock policies across the stack.

Requirements:

  • Comfortable exploring large codebases like the Linux kernel.
  • A basic understanding of how operating systems work.

You will learn about:

  • Synchronization primitives.
  • Benchmarking using software and hardware performance counters.

[Scalable OS] Admission control for system calls

Multi-threaded applications increasingly use system calls to access shared resources (network, IO, CPU, memory). In the current design, system calls issued in parallel are all admitted and contend for these resources; increasing system-call parallelism, however, yields diminishing returns. This project aims to find this threshold and implement an admission control mechanism for system calls.
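
A userspace sketch of the idea: a semaphore caps how many threads may be inside a (simulated) system call at once. The threshold value is a hypothetical placeholder; a real mechanism would derive it from measurements and sit at the kernel boundary:

```python
import threading

SYSCALL_ADMISSION_THRESHOLD = 4          # hypothetical measured threshold
_admit = threading.BoundedSemaphore(SYSCALL_ADMISSION_THRESHOLD)

in_flight = 0
peak = 0
_stats = threading.Lock()

def admitted_syscall(work):
    """Run `work` (standing in for a real syscall) under admission control."""
    global in_flight, peak
    with _admit:                         # block until an admission slot frees up
        with _stats:
            in_flight += 1
            peak = max(peak, in_flight)
        try:
            return work()
        finally:
            with _stats:
                in_flight -= 1

# 32 threads compete, but at most 4 are ever "in the kernel" at once.
threads = [threading.Thread(target=admitted_syscall, args=(lambda: None,))
           for _ in range(32)]
for t in threads: t.start()
for t in threads: t.join()
print("peak in-flight:", peak)
```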

In this project, you will:

  • Determine, for a set of system calls, the parallelism threshold beyond which returns diminish.
  • Implement an admission control mechanism that enforces a policy based on these thresholds.

Requirements:

  • Comfortable exploring large codebases like the Linux kernel.
  • A basic understanding of how operating systems work.

[Scalable OS] Modifying Linux kernel with eBPF based policies

Applications often suffer performance regressions due to the general-purpose interfaces Linux provides. To address this, Linux offers eBPF to dynamically modify kernel behavior and improve application performance. As a part of this project, you will explore how to modify various subsystems of the Linux kernel, such as file systems, memory management, the network stack, and userspace probes, to improve the performance of applications.

In this project, you will:

  • Understand application requirements from the perspective of Linux.
  • Implement policies in various subsystems to improve application performance.

Requirements:

  • Comfortable exploring large codebases like the Linux kernel.
  • A basic understanding of how operating systems work.

[Concurrency] Verified Concurrent Algorithms

Concurrent algorithms are the basic building blocks of today's infrastructure. However, verifying their correctness is quite challenging; for example, the spinlock code in the Linux kernel had a bug for several years. In this project, we aim to verify the correctness of a set of concurrent algorithms using existing program verifiers, such as Dafny or Coq. This enables designing code that is not only scalable but also correct-by-construction from day one.
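
To make "safety properties" concrete, here is a toy, hand-rolled interleaving check (not a real verifier) showing that a naive test-then-set lock violates mutual exclusion; a verifier like Dafny automates and generalizes exactly this kind of exhaustive reasoning:

```python
from itertools import permutations

def violates_mutual_exclusion(schedule):
    """Replay one interleaving of two threads running a broken
    'test flag, then set flag' lock; True if both enter the CS."""
    flag = 0
    pc = [0, 0]              # per-thread program counter
    in_cs = [False, False]
    for tid in schedule:
        if pc[tid] == 0:                 # step 0: test the flag
            if flag == 0:
                pc[tid] = 1              # test passed, move to set step
        elif pc[tid] == 1:               # step 1: set the flag, enter CS
            flag = 1
            in_cs[tid] = True
            pc[tid] = 2
    return in_cs[0] and in_cs[1]

# Each thread takes two steps; enumerate every interleaving of them.
schedules = set(permutations([0, 0, 1, 1]))
bad = [s for s in schedules if violates_mutual_exclusion(s)]
print(len(bad))   # non-zero: the naive lock is not safe
```

The violating schedules are exactly those where both threads pass the test before either sets the flag, which is the window a correct lock (e.g., one built on an atomic test-and-set) must close.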

In this project, you will:

  • Take an existing concurrent algorithm, specifically a synchronization primitive.
  • Understand it and formulate its safety properties.
  • Implement and verify the code in a verification language of your choice.

You will learn about:

  • Concurrent algorithms.
  • Concurrent program verification.
  • Memory models.
  • Hardware behavior.

[Systems for ML] Improving the performance of ML workloads

ML workloads are at the center stage of the 21st-century computing evolution. However, the software that runs them is not entirely efficient. Thus, to efficiently utilize current hardware, whether for inference or training, within a single machine or across machines, we need to understand and redesign the current software stack.
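
As a starting point, the per-phase cost breakdown can be sketched with a simple timing harness. The phase names and the toy workload below are placeholders; a real study would wrap framework calls (data loading, forward, backward, optimizer, communication):

```python
import time
from collections import defaultdict

timings = defaultdict(float)

def timed(phase):
    """Decorator that accumulates wall-clock time per pipeline phase."""
    def wrap(fn):
        def inner(*args, **kwargs):
            t0 = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                timings[phase] += time.perf_counter() - t0
        return inner
    return wrap

@timed("forward")
def forward(xs):
    return sum(v * v for v in xs)        # toy compute standing in for a model

@timed("backward")
def backward(loss):
    return [2 * loss] * 10               # toy gradient computation

for step in range(100):                  # simulate 100 training steps
    loss = forward(range(1000))
    backward(loss)

total = sum(timings.values())
for phase, t in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{phase:>10}: {100 * t / total:5.1f}% of step time")
```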

In this project, you will:

  • Analyze and understand the overhead of the current software stack.
  • Do a complete breakdown of the costs within a single machine and across machines, for both inference and training.
  • Propose a set of optimizations that improve such systems' performance.

Requirements:

  • Experience using existing ML software.
  • A basic understanding of ML algorithms.

You will learn:

  • How to systematically understand the performance of software.

[Storage] Understanding new storage devices

Storage devices come in various forms and form factors. For instance, today's machines are already equipped with flash-based storage devices, such as SSDs. However, there are several performance issues with the stack supporting current SSDs, mostly because of how SSDs are designed. To avoid some of these problems, new devices are coming to market, such as Zoned Namespace (ZNS) SSDs, SmartSSDs, and persistent memory (PM). Hence, this project aims to understand the performance of storage stacks on these new storage media.
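
A minimal sketch of such a benchmark, contrasting sequential and random writes. It goes through the page cache on an ordinary file, so a real device study would add O_DIRECT, much larger working sets, and queue-depth control; the numbers here only illustrate the harness structure:

```python
import os
import random
import tempfile
import time

BLOCK = 4096            # one page per IO
NBLOCKS = 1024          # 4 MiB working set (tiny; illustration only)
buf = os.urandom(BLOCK)

def time_writes(fd, offsets):
    """Write one block at each offset, then fsync; return elapsed seconds."""
    t0 = time.perf_counter()
    for off in offsets:
        os.pwrite(fd, buf, off)
    os.fsync(fd)
    return time.perf_counter() - t0

with tempfile.NamedTemporaryFile() as f:
    fd = f.fileno()
    os.ftruncate(fd, BLOCK * NBLOCKS)
    seq = [i * BLOCK for i in range(NBLOCKS)]   # sequential offsets
    rnd = seq[:]
    random.shuffle(rnd)                          # same blocks, random order
    t_seq = time_writes(fd, seq)
    t_rnd = time_writes(fd, rnd)

print(f"sequential: {t_seq:.4f}s  random: {t_rnd:.4f}s")
```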

In this project, you will:

  • Evaluate the performance characteristics of any of these devices.
  • Design a set of benchmarks that specifically target such devices for various scenarios.
  • Contrast their performance with that of conventional SSDs.

You will learn:

  • File system design.
  • Scalability of storage stacks.
  • Behavior of existing storage hardware.

In case you have projects that are not mentioned above but fall under the purview of our group's interest, feel free to contact us.