I am an ETH Postdoctoral Fellow at ETH Zurich, working with Professor Torsten Hoefler in the Scalable Parallel Computing Laboratory. I previously completed my PhD in computer science at the University of Illinois at Urbana-Champaign, advised by Professor Marc Snir. I collaborate closely with members of the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory and with many other collaborators.

My research focuses on the intersection of high-performance computing and machine learning. I am particularly interested in scalable training of deep neural networks and applying neural networks to scientific and computational simulation datasets. I also work on parallel algorithms and runtimes, graph analytics, and communication and performance optimization.

Current students

  • Andreas Zingg (master's)
  • Neville Walo (bachelor's)
  • Peter Tatkowski (master's)
  • Ali Nasser (master's at KAUST)
  • Tobia Claglüna (bachelor's)
  • Anton Schäfer (bachelor's)
  • Lukas Ernst (master's)
  • Simon Jacob (bachelor's)

Former students

  • Christoph Amevor (bachelor's)
  • Roman Böhringer (bachelor's)

Selected projects


Aluminum is a generic communication framework enabling high-performance asynchronous point-to-point and collective operations, especially on GPUs. It provides more GPU-friendly semantics than MPI and a suite of latency- and bandwidth-optimized algorithms, drawn from both existing libraries and custom implementations. Aluminum has been integrated into both the LBANN deep learning toolkit and the Hydrogen distributed linear algebra library.


LBANN (Livermore Big Artificial Neural Network Toolkit) is a research toolkit for scaling the training of deep neural networks on HPC systems. My work includes optimized communication algorithms and patterns, communication sparsification and quantization, more general distributed-memory convolution algorithms, and more scalable data-parallel training algorithms.


PPL is an experimental C++11 runtime system for exploring different implementation tradeoffs, especially in the context of future exascale systems. See the paper on it below. If you’re interested in further details, contact me.

The name (probably) inventively stands for “Parallel Programming Library”, and is certainly not meant to be confused with the nice people right next door at the other PPL (Parallel Programming Laboratory).


PGDB is a parallel debugger for large-scale MPI applications, written primarily in Python with some C/C++. I haven’t found time to work on it in quite a while, but I continually find situations where it would be useful.


Xenos was a web-based RPG for which I did back-end PHP (the horror!) and database work between 2005 and 2008, primarily in collaboration with Alistair Lynn, Nick Farley, Taylor Vaughan, and Alec Ingulsrud. At its height, it had several hundred players.