Publications
*: Equal contribution.
- Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, and Alexandra Peste. “Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks.” arXiv 2021.
- Roman Böhringer, Nikoli Dryden, Tal Ben-Nun, and Torsten Hoefler. “Clairvoyant Prefetching for Distributed Machine Learning I/O.” arXiv 2021.
- Andrei Ivanov*, Nikoli Dryden*, Tal Ben-Nun, Shigang Li, and Torsten Hoefler. “Data Movement Is All You Need: A Case Study on Optimizing Transformers.” MLSys 2021.
- Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy, Peter Harrington, Jan Balewski, Satoshi Matsuoka, Peter Nugent, and Brian Van Essen. “The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism.” IEEE Transactions on Parallel and Distributed Systems, 2021.
- Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, and Torsten Hoefler. “Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging.” IEEE Transactions on Parallel and Distributed Systems, 2021.
- Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, and Torsten Hoefler. “Deep Learning for Post-Processing Ensemble Weather Forecasts.” Philosophical Transactions of the Royal Society A, 2021.
- Bryan Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, and Kate Saenko. “Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning.” arXiv 2020.
- Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, and Torsten Hoefler. “Predicting Weather Uncertainty with Deep Convnets.” ML4PS Workshop at NeurIPS 2019.
- Nikoli Dryden, Naoya Maruyama, Tim Moon, Tom Benson, Marc Snir, and Brian Van Essen. “Channel and Filter Parallelism for Large-Scale CNN Training.” Supercomputing 2019.
- Nikoli Dryden*, Naoya Maruyama*, Tom Benson, Tim Moon, Marc Snir, and Brian Van Essen. “Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism.” IPDPS 2019.
- Nikoli Dryden, Naoya Maruyama, Tim Moon, Tom Benson, Andy Yoo, Marc Snir, and Brian Van Essen. “Aluminum: An Asynchronous, GPU-Aware Communication Library Optimized for Large-Scale Training of Deep Neural Networks on HPC Systems.” MLHPC 2018.
- Chen Wang, Nikoli Dryden, Franck Cappello, and Marc Snir. “Neural network based silent error detector.” Cluster 2018. (Best Paper)
- Roshan Dathathri, Gurbinder Gill, Loc Hoang, Hoang-Vu Dang, Alex Brooks, Nikoli Dryden, Andrew Lenharth, Marc Snir, and Keshav Pingali. “Gluon: A Communication Optimizing Framework for Distributed Heterogeneous Graph Analytics.” PLDI 2018.
- Hoang-Vu Dang, Roshan Dathathri, Gurbinder Gill, Alex Brooks, Nikoli Dryden, Andrew Lenharth, Loc Hoang, Keshav Pingali, and Marc Snir. “A Lightweight Message Passing Runtime for Distributed Graph Analytics.” IPDPS 2018.
- Sam Adé Jacobs, Nikoli Dryden, Roger Pearce, and Brian Van Essen. “Towards Scalable Parallel Training of Deep Neural Networks.” MLHPC 2017.
- Nikoli Dryden, Tim Moon, Sam Adé Jacobs, and Brian Van Essen. “Communication Quantization for Data-parallel Training of Deep Neural Networks.” MLHPC 2016.
- Alex Brooks, Hoang-Vu Dang, Nikoli Dryden, and Marc Snir. “PPL: An abstract runtime system for hybrid parallel programming.” ESPM2 2015.
- Nikoli Dryden. “PGDB: A Debugger for MPI Applications.” XSEDE 2014.