Jerry Huang
Starving Grad Student
About Me
-
I am a PhD student at Mila.
-
My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. I also dabble in applied natural language processing and AI research.
-
I am grateful to have been supported by NSERC, FRQNT, Hydro-Québec and 日本学術振興会 (Japan Society for the Promotion of Science) during my studies, allowing me to conduct research on topics I find personally fulfilling and interesting.
-
I am always on the hunt for research internships or visiting opportunities. Please reach out by e-mail if there is overlap in research interests or if you simply feel I would be a good fit for, or addition to, your team. For an understanding of my background and skills, refer to my (current) representative papers (below) and GitHub (which may be slightly out-of-date). Any manuscripts or code not publicly available can be provided on request.
-
Additionally, I anticipate graduating in the 2026-2027 academic year and am looking for post-doctoral or research scientist positions; if you see a fit within your group, please do not hesitate to contact me. References are available upon request.
Relevant Works
-
Mamba Modulation: On the Length Generalization of Mamba Models [Link]
-
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
-
Calibrated Language Models and How to Find Them with Label Smoothing [Link]
-
ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
-
How Well Can a Long Sequence Model Model Long Sequences? [Link]
-
Do Large Language Models Know How Much They Know? [Link]
-
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]
Working With Me
-
I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out!
-
As a general guide to what I look for in collaborators (these criteria are neither necessary nor sufficient), I ask for the following, in order of priority: advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning.
-
Nevertheless, everyone has different experiences, and I encourage those from non-conventional backgrounds (specifically, outside engineering or mathematics) to contact me if interested.
Contacting Me
-
Please contact me by e-mail. My addresses are structured as "[first_name].[last_name]@[domain_name]" for both Mila (mila.quebec) and Université de Montréal (umontreal.ca). I try to be as responsive as possible, but please anticipate potential delays of up to 48 hours during busy times of the year.
Other Information
-
Outside of research, I read books (mainly short fiction and non-fiction) and cook. I abide by the principle of 腹八分目 (hara hachi bun me - eat until you are eight parts full). I also play badminton, swim, fish and snowboard (in the winter).