
Undergraduate Research Assistant Opportunities at the MURGE Lab on Mixture of Experts, Model Merging, Efficient Models, and Continual Learning.

Post Date
04/10/2024
Description

Advisor: Prof. Mohit Bansal (https://www.cs.unc.edu/~mbansal/)
Mentor: Prateek Yadav (https://prateeky2806.github.io/)
Group: Murge Lab UNC-CH (https://murgelab.cs.unc.edu/)
Duration (Flexible): Apr 15 – Sep 15, 2024 (5 months), with a commitment of at least 20 hours per week.
Role: Research Assistant (with RAship stipend)
Contact: Prateek Yadav (praty@cs.unc.edu) with some basic information about yourself, your transcripts, your CV, and a brief description of any prior Machine Learning / Programming experience you have.

Requirements from Candidates (Good to Have):
– Undergraduate or Master's students with a Computer Science, Mathematics, or Statistics background.
– Strong foundation in machine learning and deep learning techniques, and familiarity with model architectures such as Transformers.
– Familiarity with deep learning frameworks such as PyTorch and the Hugging Face libraries.
– Strong analytical ability: asking the right questions to form a hypothesis and then designing experiments to test it.
– Interest in a future research career, graduate school (Master's/PhD), or Machine Learning roles in industry.

Project Description:
Machine Learning has been evolving very rapidly, and practitioners often specialize models such as LLaMA and LLaVA for their specific applications, creating domain-specialized models. The projects revolve around recycling these existing models to build better modular models that can solve unseen tasks and generalize to new datasets/domains in a zero-/few-shot manner. Moreover, these models need to be continually adapted to new domains. The techniques involved center on parameter-efficient fine-tuning, Mixture-of-Experts models, model merging, and composition.
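To make the model-merging idea concrete for applicants who have not seen it before, here is a minimal sketch (illustrative only, not the lab's actual method or code): the simplest merging strategy just averages the weights of several models fine-tuned from the same base checkpoint, and methods such as TIES-Merging (paper 1 below) refine this by resolving sign and interference conflicts between models. Function and file names here are hypothetical.

import torch

def average_merge(state_dicts):
    # Uniformly average the parameters of several checkpoints that share
    # an architecture (i.e., identical state_dict keys and shapes).
    merged = {}
    for name in state_dicts[0]:
        merged[name] = torch.mean(
            torch.stack([sd[name].float() for sd in state_dicts]), dim=0
        )
    return merged

# Hypothetical usage: merge two experts fine-tuned from the same base model.
# expert_a = torch.load("expert_a.pt")   # illustrative file names
# expert_b = torch.load("expert_b.pt")
# base_model.load_state_dict(average_merge([expert_a, expert_b]))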
Research Areas:
1. Enabling decentralized, collaborative development of models, including modular architectures, cheaply communicable updates, and merging methods.
2. Developing generalist models by building Mixture-of-Experts systems (see the toy sketch after this list).
3. Continual model adaptation and learning.
4. Parameter and compute efficiency.
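For area 2, the following is a toy sketch of what a Mixture-of-Experts layer looks like (illustrative only, with assumed dimensions; real sparse MoE systems such as the one studied in paper 2 below add top-k routing, load balancing, and capacity limits):

import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    # A single MoE layer: a learned router sends each input to one expert.
    def __init__(self, dim=64, num_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])

    def forward(self, x):                          # x: (batch, dim)
        gates = self.router(x).softmax(dim=-1)     # routing probabilities
        top1 = gates.argmax(dim=-1)                # hard top-1 expert choice
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top1 == i
            if mask.any():
                # Scale each expert's output by its gate probability.
                out[mask] = gates[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

# Hypothetical usage: ToyMoE()(torch.randn(8, 64)) routes 8 inputs among 4 experts.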

Some recent and representative papers in this direction:

1. TIES-Merging: Resolving Interference When Merging Models, NeurIPS’23 (https://arxiv.org/abs/2306.01708)
2. Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy, ICLR’24 (https://arxiv.org/abs/2310.01334)
3. ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization (https://arxiv.org/abs/2311.13171)
4. LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment (https://arxiv.org/abs/2312.09979)
5. Modular Deep Learning (https://arxiv.org/abs/2302.11529)
6. Learning to Route Among Specialized Experts for Zero-Shot Generalization (https://arxiv.org/abs/2402.05859)
7. Model Stock: All we need is just a few fine-tuned models (https://arxiv.org/abs/2403.19522)

For inquiries or to express your interest, please send an email to praty@cs.unc.edu.

Faculty Advisor
Mohit Bansal
Research Supervisor
Prateek Yadav
Application Deadline
05/15/2024