
Presentation Master's thesis - Ole Jürgensen - Brain & Cognition
- Start date
- 26-06-2026 10:30
- End date
- 26-06-2026 11:30
- Location
Large Language Models (LLMs) demonstrate impressive linguistic abilities, yet they require vastly more data than humans to learn. Theories of human learning suggest that the limited nature of working memory may paradoxically help in the acquisition of language. Could introducing a similar constraint improve learning efficiency in transformer models? Thamma and Heilbron (2025) addressed this question and reported improved performance in models with a human-inspired fleeting memory limitation.
However, it remains unclear whether the improvement is due to a simple recency bias or to the memory limitation encouraging higher-level abstraction. This project arbitrated between the two hypotheses. We addressed this question both at the level of model outputs (probabilities assigned to words) and model internals (representations of syntax across model layers). Through targeted manipulation of syntactic dependency distances, we find that the fleeting memory limitation improves abstraction of syntactic structure beyond a simple effect of recency.
A complementary analysis of model internals yielded results that contradict our output analysis, revealing the limitations of mechanistic-interpretability tools for comparing different models. Together, these findings support the functional role of memory limitations in language learning and reveal an interesting tension between the analysis of model outputs and internal states.