Research Paper Reading List
Just a collection of potentially interesting papers on my reading list.
General Papers
- Charlie Snell, Dan Klein, Ruiqi Zhong, Learning by Distilling Context
- Alycia Lee, Brando Miranda, Sudharsan Sundar, Allison Casasola, Sanmi Koyejo, Beyond Scale: The Diversity Coefficient as a Data Quality Metric for Variability in Natural Language Data
- Valeriia Cherepanova, James Zou, Talking Nonsense: Probing Large Language Models’ Understanding of Adversarial Gibberish Inputs
- Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, Zhifang Sui, Large Language Models are not Fair Evaluators
- Junmo Kang, Hongyin Luo, Yada Zhu, Jacob Hansen, James Glass, David Cox, Alan Ritter, Rogerio Feris, Leonid Karlinsky, Self-Specialization: Uncovering Latent Expertise within Large Language Models
- Shuqian Sheng, Yi Xu, Luoyi Fu, Jiaxin Ding, Lei Zhou, Xinbing Wang, Chenghu Zhou, Is Reference Necessary in the Evaluation of NLG Systems? When and Where?
- Daniel Deutsch, Rotem Dror, Dan Roth, On the Limitations of Reference-Free Evaluations of Generated Text
- Dominic Petrak, Nafise Moosavi, Ye Tian, Nikolai Rozanov, Iryna Gurevych, Learning From Free-Text Human Feedback – Collect New Datasets Or Extend Existing Ones?
EMNLP’24 Papers
- Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng, Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
- Yunxuan Li, Yibing Du, Jiageng Zhang, Le Hou, Peter Grabowski, Yeqing Li, Eugene Le, Improving Multi-Agent Debate with Sparse Communication Topology
- Makesh Narsimhan Sreedhar, Traian Rebedea, Shaona Ghosh, Jiaqi Zeng, Christopher Parisien, CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
- Abhilasha Sancheti, Haozhe An, Rachel Rudinger, On the Influence of Gender and Race in Romantic Relationship Prediction from Large Language Models
- James Liyuan Wang, Ran Li, Junfeng Yang, Chengzhi Mao, RAFT: Realistic Attacks to Fool Text Detectors
- Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen, Noah A. Smith, Luke Zettlemoyer, Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
- Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Pang Wei Koh, Jesse Dodge, Pradeep Dasigi, Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging
- Manya Wadhwa, Xinyu Zhao, Jessy Li, Greg Durrett, Learning to Refine with Fine-Grained Natural Language Feedback
- Junehyung Kim, Sungjae Hwang, All You Need is Attention: Lightweight Attention-based Data Augmentation for Text Classification
- Yang Ba, Michelle V. Mancenido, Rong Pan, Fill In The Gaps: Model Calibration and Generalization with Synthetic Data
- Rajiv Movva, Pang Wei Koh, Emma Pierson, Annotation alignment: Comparing LLM and human annotations of conversational safety
- Beiduo Chen, Xinpeng Wang, Siyao Peng, Robert Litschko, Anna Korhonen, Barbara Plank, “Seeing the Big through the Small”: Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?
- Bingbing Wen, Bill Howe, Lucy Lu Wang, Characterizing LLM Abstention Behavior in Science QA with Context Perturbations
- Yuqing Zhou, Ruixiang Tang, Ziyu Yao, Ziwei Zhu, Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models
- Shangbin Feng, Taylor Sorensen, Yuhan Liu, Jillian Fisher, Chan Young Park, Yejin Choi, Yulia Tsvetkov, Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration
- Kyusik Kim, Hyeonseok Jeon, Jeongwoo Ryu, Bongwon Suh, Will LLMs Sink or Swim? Exploring Decision-Making Under Pressure
- Adrian Cosma, Stefan Ruseti, Mihai Dascalu, Cornelia Caragea, How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics
- Lindia Tjuatja, Valerie Chen, Tongshuang Wu, Ameet Talwalkar, Graham Neubig, Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design
- Jie Chen, Yupeng Zhang, Bingning Wang, Xin Zhao, Ji-Rong Wen, Weipeng Chen, Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
- Shramay Palta, Nishant Balepur, Peter A. Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger, Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning
- Abhishek Divekar, Greg Durrett, SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation
- Jing Huang, Diyi Yang, Christopher Potts, Demystifying Verbatim Memorization in Large Language Models
- Thao Nguyen, Jeffrey Li, Sewoong Oh, Ludwig Schmidt, Jason E Weston, Luke Zettlemoyer, Xian Li, Better Alignment with Instruction Back-and-Forth Translation
- Xinyi Xu, Zhaoxuan Wu, Rui Qiao, Arun Verma, Yao Shu, Jingtan Wang, Xinyuan Niu, Zhenfeng He, Jiangwei Chen, Zijian Zhou, Gregory Kang Ruey Lau, Hieu Dao, Lucas Agussurja, Rachael Hwee Ling Sim, Xiaoqiang Lin, Wenyang Hu, Zhongxiang Dai, Pang Wei Koh, Bryan Kian Hsiang Low, Position Paper: Data-Centric AI in the Age of Large Language Models
- Johnathan Xie, Annie S Chen, Yoonho Lee, Eric Mitchell, Chelsea Finn, Calibrating Language Models with Adaptive Temperature Scaling
- Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan, Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
- Jared Moore, Tanvi Deshpande, Diyi Yang, Are Large Language Models Consistent over Value-laden Questions?
- Isadora White, Sashrika Pandey, Michelle Pan, Communicate to Play: Pragmatic Reasoning for Efficient Cross-Cultural Communication