Homepage of Dr Pasquale Minervini
Researcher/Faculty at the University of Edinburgh, School of Informatics
Co-Founder and CTO at Miniml.AI
ELLIS Scholar, Edinburgh Unit


March 2025 in Research

We have been working on language model evaluation, knowledge utilization, efficiency, and multimodal reasoning. We had papers accepted at ICLR 2025, NAACL 2025 (x3), AAAI 2025, and other venues, along with several works in progress.

NAACL 2025 – Controlling Knowledge & Reasoning

  • Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering, by Yu Zhao et al. – We introduce SpARE, a training‑free method for controlling whether an LLM relies on its internal parametric knowledge or on the provided context when the two conflict. By analyzing mid‑layer activations with sparse autoencoders (SAEs), SpARE identifies features that signal knowledge conflicts and edits them at inference time, significantly improving open‑domain QA performance over prior methods. (Oral presentation; see the first sketch after this list.)

  • Are We Done with MMLU?, by Aryo Gema and many others – We analyze the Massive Multitask Language Understanding (MMLU) benchmark and uncover a fairly high error rate; in the Virology subset, for example, 57% of the sampled questions had issues. We introduce MMLU‑Redux, a manually curated subset of 5,700 expert‑verified questions, and show that corrected evaluations can substantially alter model rankings. MMLU‑Redux is open‑source and has already been adopted by DeepSeek and Qwen, among others! (A loading snippet follows after this list.)

  • Self-Training Large Language Models for Tool-Use Without Demonstrations, based on Ne Luo’s MSc project – We explore whether LLMs can learn to use tools (e.g., search engines, calculators) without hand‑crafted examples. Starting from zero‑shot prompts, we generate synthetic tool‑use traces and then fine‑tune the model on them (a condensed sketch follows after this list). On PopQA, the self‑trained model gains +3.7% accuracy, though results vary on other datasets, highlighting both the promise and the challenges of autonomous tool‑use learning. Ne Luo is looking for a PhD position; contact her if you are interested in working with her!
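
To make the SpARE idea concrete, here is a minimal sketch of SAE-based activation steering at inference time. The SparseAutoencoder stand-in, the layer index, and the feature indices are all illustrative assumptions, not the actual SpARE implementation.

```python
import torch

class SparseAutoencoder(torch.nn.Module):
    """Tiny stand-in for a trained SAE; assumes a linear encoder/decoder."""
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_features)
        self.dec = torch.nn.Linear(d_features, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.enc(x))  # sparse, non-negative feature activations

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.dec(z)

def make_steering_hook(sae: SparseAutoencoder, feature_ids: list, strength: float = 4.0):
    """Forward hook that boosts selected SAE features in the residual stream."""
    def hook(module, inputs, output):
        resid = output[0] if isinstance(output, tuple) else output
        z = sae.encode(resid)                    # map to SAE feature space
        z[..., feature_ids] += strength          # push conflict-related features
        steered = sae.decode(z)                  # map back to the residual stream
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return hook

# Hypothetical usage on a Hugging Face-style model (layer and feature indices
# are placeholders):
# handle = model.transformer.h[16].register_forward_hook(
#     make_steering_hook(sae, feature_ids=[123, 456]))
# out = model.generate(**inputs)   # generation now favors one knowledge source
# handle.remove()
```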
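For readers who want to evaluate on the corrected benchmark, a short loading sketch follows; the Hub dataset ID, config name, and error annotation field reflect my understanding of the release and should be checked against the dataset card.

```python
from datasets import load_dataset

# Assumed Hub ID and schema for MMLU-Redux; verify on the dataset card.
virology = load_dataset("edinburgh-dawg/mmlu-redux", "virology", split="test")

# Each example carries an error annotation; keep only the verified-clean ones.
clean = virology.filter(lambda ex: ex["error_type"] == "ok")
print(f"{len(clean)}/{len(virology)} questions retained")
```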
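And here is a condensed sketch of the self-training recipe from the tool-use paper, under the (assumed) standard self-training choice of keeping only traces whose final answer matches the gold answer; all interfaces are illustrative placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Trace:
    prompt: str
    tool_calls: List[str]   # e.g. ["search(...)", "calculator(...)"]
    final_answer: str

def collect_self_training_data(
    generate: Callable[[str], Trace],   # zero-shot prompted LLM with tool access
    questions: List[Tuple[str, str]],   # (prompt, gold answer) pairs
    n_samples: int = 4,
) -> List[Trace]:
    """Sample tool-use traces and keep the self-verified ones; the kept
    traces then serve as supervised fine-tuning data for the same model."""
    kept = []
    for prompt, gold in questions:
        for _ in range(n_samples):
            trace = generate(prompt)
            if trace.final_answer.strip() == gold.strip():   # assumed filter
                kept.append(trace)
    return kept
```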

ICLR 2025 – Learning & Evaluation

  • An Auditing Test to Detect Behavioral Shift in Language Models, by the amazing Leo Richter – We propose a method for continual Behavioral Shift Auditing (BSA) of LLMs: a statistical test that monitors an LLM’s outputs for significant deviations from a reference model’s behavior, with theoretical guarantees on detecting genuine shifts while avoiding false alarms. In our experiments, the test catches subtle changes in a model’s toxicity and translation performance after fine-tuning using only a few hundred examples, offering a practical tool to ensure that an LLM remains aligned throughout its deployment. A simplified sketch of such a test follows below.
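
As a rough illustration of what an auditing test of this kind can look like, here is a generic anytime-valid sequential test on a bounded behavior score (e.g., a toxicity probability in [0, 1]); this is a simplified sketch, not the exact statistic from the paper.

```python
import math
from typing import Iterable, Optional

def audit(scores_ref: Iterable[float], scores_new: Iterable[float],
          tolerance: float = 0.05, delta: float = 0.01) -> Optional[int]:
    """Stream paired behavior scores in [0, 1] from a reference and a deployed
    model; return the step at which the mean gap provably exceeds `tolerance`.
    A union bound over steps keeps the overall false-alarm rate below `delta`."""
    gap_sum, t = 0.0, 0
    for r, n in zip(scores_ref, scores_new):
        t += 1
        gap_sum += n - r  # paired per-prompt difference, lies in [-1, 1]
        # Hoeffding radius with per-step error budget delta / (t * (t + 1)),
        # which sums to delta over all steps (anytime validity).
        eps = math.sqrt(2.0 * math.log(2.0 * t * (t + 1) / delta) / t)
        if abs(gap_sum / t) > tolerance + eps:
            return t      # shift detected at step t
    return None           # no detectable shift on the observed stream
```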

Reasoning and Planning for Large Language Models

AAAI 2025 – Efficient Inference

COLING 2025 – Multilingual Resources

Frontiers in AI 2025 – Human-AI Collaboration

  • Fostering Effective Hybrid Human-LLM Reasoning and Decision Making – We examine frameworks combining LLMs and human judgment for complex tasks, offering design principles for AI‑assisted decision systems. Through case studies, we show that integrating LLM‑generated insights with human oversight yields more reliable and interpretable outcomes than either alone, providing guidelines for principled human‑in‑the‑loop systems.

What’s Brewing
