Jerry Huang
Starving Grad Student
About Me
-
I am a PhD student at Mila advised by Prof. Sarath Chandar.
-
My research broadly focuses on understanding the principles behind how and why deep learning models work, in particular from a data-driven perspective with a current focus on sequence models. On the side, I also dabble in more applied natural language processing research and traditional AI.
-
Support for my research has come from NSERC, FRQNT, Hydro-Québec and 日本学術振興会 (Japan Society for the Promotion of Science), among other sources. I thank them for their broad support and for allowing me to conduct research on topics of personal fulfillment and interest.
-
I'm on the hunt for research internships or visiting opportunities in 2026. Please reach out by e-mail if our research interests overlap or if you simply feel I would be a good fit/addition to your team. For an understanding of my background and skills, refer to my papers (below) and GitHub (which may be slightly out of date).
-
In the meantime, here is where I will likely be, including planned conference travel, for the rest of 2025:
- June to August: Tokyo
- September to December: TBD
If you happen to be in close range, do not hesitate to reach out to meet up and chat!
Relevant Works
-
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval [Link]
-
Investigating the Effects of Architectural Inductive Biases on Hallucination [Link]
-
Calibrated Language Models and How to Find Them with Label Smoothing [Link]
-
ZETA: Leveraging Z-order Curves for Efficient Top-K Attention [Link]
-
How Well Can a Long Sequence Model Model Long Sequences? [Link]
-
Do Large Language Models Know How Much They Know? [Link]
-
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models [Link]
-
Predicting the Impact of Model Expansion through the Minima Manifold [Link]
-
EpiK-Eval: Evaluation for Language Models as Epistemic Models [Link]
Working With Me
-
I actively seek motivated mentees! If you are passionate about research and eager to work on innovative projects in deep learning, consider reaching out! If you satisfy one of the following criteria, there may be additional ways to increase the chances of collaboration:
-
You satisfy the eligibility requirements for an NSERC USRA.
-
You study at one of Mila's partner universities and must conduct research to satisfy degree requirements.
-
You have external funding to support an internship/research stay and are not a resident of Canada.
-
As a rough (but neither necessary nor sufficient) overview of what I personally look for in collaborators, I ask for advanced knowledge of math (linear algebra, statistics and calculus), research experience, and knowledge of machine learning, in order of priority.
-
Nevertheless, everyone has different experiences, and I encourage those from a non-conventional background (specifically, one outside engineering or math) to contact me if interested.
Contacting Me
-
Please contact me by e-mail. My addresses are structured as "[first_name].[last_name]@[domain_name]" for both Mila and Université de Montréal. I check them daily and will respond in a timely fashion (within 1 to 2 working days, depending on the urgency).
Other Information
-
Outside of research, I read books and cook (particularly seafood). I faithfully abide by the principle of 腹八分目 (hara hachi bun me - eat until you are eight parts full). I also play badminton, swim, fish and snowboard (in the winter).