About Me

  • I am a PhD student at Mila.
  • My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. I also dabble in applied natural language processing and AI research.
  • I am grateful to have been supported by NSERC, FRQNT, Hydro-Québec and 日本学術振興会 (Japan Society for the Promotion of Science) during my studies, which has allowed me to conduct research on topics I find personally fulfilling and interesting.
  • I am always on the hunt for research internships or visiting opportunities. Please reach out by e-mail if our research interests overlap or if you simply feel I would be a good addition to your team. For an understanding of my background and skills, refer to my (current) representative papers (below) and GitHub (which may be slightly out-of-date). Manuscripts or code that are not publicly available can be provided on request.
  • Additionally, I anticipate graduating in the 2026-2027 academic year and am looking for post-doctoral or research scientist positions; if you see a fit within your group, please do not hesitate to contact me. References are available upon request.

Relevant Works

  • Mamba Modulation: On the Length Generalization of Mamba Models [Link]
  • Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
  • Calibrated Language Models and How to Find Them with Label Smoothing [Link]
  • ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
  • How Well Can a Long Sequence Model Model Long Sequences? [Link]
  • Do Large Language Models Know How Much They Know? [Link]
  • Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]

Working With Me

  • I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out!
  • As a rough guide (neither necessary nor sufficient) to what I personally look for in collaborators, I ask for, in order of priority: advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning.
  • Nevertheless, everyone has different experiences, and I encourage those from a non-conventional background (specifically, one outside engineering or math) to contact me if interested.

Contacting Me

  • Please contact me by e-mail. My addresses are structured as "[first_name].[last_name]@[domain_name]" for both Mila (mila.quebec) and Université de Montréal (umontreal.ca). I try to be as responsive as possible, but please anticipate delays of up to 48 hours during busy times of the year.

Other Information

  • Outside of research, I read books (mainly short fiction and non-fiction) and cook. I abide by the principle of 腹八分目 (hara hachi bun me - eat until you are eight parts full). I also play badminton, swim, fish and snowboard (in the winter).