Colloquium credits

Presentation Master's thesis - Ole Jürgensen - Brain & Cognition

Last modified on 16-06-2026 13:46

Explaining the linguistic abilities of the fleeting memory transformer

Start date: 26-06-2026 10:30
End date: 26-06-2026 11:30
Location: REC GS.34

Large Language Models (LLMs) demonstrate impressive linguistic abilities, yet they require vastly more data than humans to learn. Theories of human learning suggest that the limited nature of working memory may paradoxically help in the acquisition of language. Could introducing a similar constraint improve learning efficiency in transformer models? Thamma and Heilbron (2025) addressed this question and reported improved performance in models with a human-inspired fleeting memory limitation.

However, it remains unclear whether the improvement is due to a simple recency bias or to the memory limitation encouraging higher-level abstraction. This project arbitrated between the two hypotheses, both at the level of model outputs (probabilities assigned to words) and at the level of model internals (representations of syntax across model layers). Through targeted manipulation of syntactic dependency distances, we found that the fleeting memory limitation improves abstraction of syntactic structure beyond a simple effect of recency.

A complementary analysis of model internals yielded results that contradicted our output analysis, revealing the limitations of mechanistic-interpretability tools for comparing different models. Together, these findings support a functional role of memory limitations in language learning and reveal an interesting tension between the analysis of model outputs and internal states.
