I work on efficient training and inference of video diffusion models at Genmo.

Previously, I did research on machine learning performance optimizations, including speculative decoding and sparsity, at Mosaic Research/Databricks. I used to be a software engineer at Tesla Autopilot, where I explored efficient quantization such as FP8 for large vision models and improved the training systems on Dojo. Before that, I worked on Large Language Models at Cohere as an early engineer, specializing in model inference and machine learning systems.

I studied Computer Science at the University of Waterloo. My academic interests include Machine Learning (with a focus on performance optimizations and training dynamics) and problems in Distributed Systems and PL/Compilers.

Other things that take up my time include reading, learning about the histories and philosophies of both science and art, crosswords, and fitness (nowadays lifting and the occasional run).