Information

The course registration period is open. Register for semester 1 courses before Monday, 16 June at 13:00.


Colloquium credits

Presentation Master's thesis - Karolina Drożdż - Brain & Cognition

Last modified on 11-06-2025 14:34
Entity Tracking as a Microcosm of Semantic Abilities in LLMs and Humans
Start date: 19-06-2025 15:00
End date: 19-06-2025 16:00
Location:

Roeterseilandcampus - Building C, Street: Nieuwe Achtergracht 129-B, Room: GS.0. Due to limited room capacity, admission is on a first-come, first-served basis. This also applies to teaching staff.

Large Language Models (LLMs) demonstrate remarkable linguistic abilities, yet their capacity to construct coherent internal representations of discourse remains an open question. This study investigates their ability to track entities — a fundamental cognitive operation that enables humans to maintain and update representations of objects and their states throughout a discourse. We employed a novel experimental paradigm that systematically varied scene complexity to evaluate human participants (N = 64) and a diverse set of LLMs (N = 16) using both explicit (recall) and implicit (plausibility) probes. Results show that top-performing LLMs, especially larger, instruction-tuned models, exceed average human performance. A key divergence emerged under cognitive load: human accuracy declined with increasing complexity, reflecting representational cost, while most models proved remarkably resilient. Performance was not uniform, however: both humans and models shared a vulnerability to specific narrative structures that introduced representational interference. These findings suggest that while LLMs have acquired an important semantic competence, their underlying operational mechanisms differ fundamentally from those of human cognition. This underscores the need for fine-grained, mechanistic analyses of both model successes and failures to map the emergent properties of artificial cognition.