About Me

  • I am a PhD student at Mila.
  • My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. I also dabble in applied natural language processing and AI research.
  • I am grateful to have been supported by NSERC, FRQNT, Hydro-Québec and 日本学術振興会 during my studies, allowing me to pursue research on topics I find personally interesting and fulfilling.
  • I am always on the hunt for research internships or visiting opportunities. Please reach out by e-mail if our research interests overlap or if you simply feel I would be a good fit for your team. For an understanding of my background and skills, refer to my (current) representative papers (below) and GitHub (which may be slightly out-of-date). Any manuscripts or code not publicly available can be provided on request.
  • Additionally, I anticipate graduating in the 2026-2027 academic year and am looking for post-doctoral or research scientist positions; if you see a fit within your group, please do not hesitate to contact me. References are available upon request.

Relevant Works

A list of papers representing my more current research interests is below. For a more complete list, refer to DBLP.
  • Mamba Modulation: On the Length Generalization of Mamba Models [Link]
  • Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
  • Calibrated Language Models and How to Find Them with Label Smoothing [Link]
  • ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
  • How Well Can a Long Sequence Model Model Long Sequences? [Link]
  • Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]

Working With Me

  • I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out! I am always open to discussing topics you might want to work on and/or providing the support to start something new.
  • I generally prioritize intermediate-to-advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning, in that order. How much coding experience and more advanced math background (differential equations, measure/function theory, optimization, etc.) is needed varies with the topics we might work on together.
  • Nevertheless, everyone has different experiences, and I encourage those from a non-conventional background (specifically, one not based in engineering or math) to contact me if interested.

Contacting Me

  • I am always reachable through e-mail, with the addresses structured as "[first_name].[last_name]@[domain_name]" for Mila (mila.quebec) and Université de Montréal (umontreal.ca). Though I try to be as responsive as possible, please anticipate delays of up to 48 hours when I am busy.