Information

The course registration period is open. Register for semester 1 courses before Monday, 16 June at 13:00.

Colloquium credits

Presentation Master's thesis - Lucca Pfründer - Clinical Psychology

Last modified on 11-06-2025 14:37
Misleading Deception Classifiers With Model-Based and Human Paraphrasing Attacks
Start date
17-06-2025 15:30
End date
17-06-2025 16:30
Location

Roeterseilandcampus - Building A, Street: Nieuwe Achtergracht 129-B, Room: A2.08. Due to limited room capacity, admission is on a first-come, first-served basis. Teachers must adhere to this as well.

Automated models often outperform humans at detecting deception but remain vulnerable to adversarial attacks: subtle alterations of statements (i.e., changes to words or phrases) that preserve meaning but change the model's classification. After a DistilBERT classifier was trained on 80% of the statements from a dataset of autobiographical truths and lies (Hippocorpus), humans and GPT-4o each rewrote 153 statements from the remaining 20% (the test set) up to 10 times, attempting to flip the model's prediction. This can be understood as a paraphrasing attack: the statement is rewritten so that its meaning stays the same, but in a way intended to fool the classifier. Nearly 70% of paraphrased statements succeeded in changing the model's prediction (i.e., from lie to truth or from truth to lie). While humans and the LLM were similarly effective and efficient overall, humans induced a greater change in model confidence for truthful statements and did so in fewer iterations. This highlights a key vulnerability: models can be tricked by benign rewordings that leave the underlying content unchanged.
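
For illustration, the attack loop described in the abstract can be sketched as follows. This is a minimal sketch, not the author's actual pipeline: the checkpoint name "distilbert-lie-detector", the assumed label order (0 = lie, 1 = truth), and the paraphrase callable (standing in for the human or GPT-4o rewriting step) are all assumptions introduced here.

```python
# Minimal sketch of the paraphrasing-attack loop described in the abstract.
# Assumptions (not from the original text): a fine-tuned DistilBERT
# checkpoint saved locally as "distilbert-lie-detector", label order
# 0 = lie / 1 = truth, and a `paraphrase` callable that stands in for the
# human or GPT-4o rewriting step.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-lie-detector")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-lie-detector")
model.eval()


def classify(statement: str) -> torch.Tensor:
    """Return the class probabilities for a single statement."""
    inputs = tokenizer(statement, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1).squeeze(0)


def paraphrasing_attack(statement: str, paraphrase, max_iterations: int = 10) -> dict:
    """Rewrite a statement up to `max_iterations` times, stopping as soon as
    the classifier's predicted label flips. Also tracks how much the
    probability assigned to the original label drops (the change in model
    confidence)."""
    original_probs = classify(statement)
    original_label = int(original_probs.argmax())

    current = statement
    for i in range(1, max_iterations + 1):
        current = paraphrase(current)  # meaning-preserving rewrite
        probs = classify(current)
        if int(probs.argmax()) != original_label:
            return {
                "flipped": True,
                "iterations": i,
                "statement": current,
                "confidence_drop": float(
                    original_probs[original_label] - probs[original_label]
                ),
            }
    return {"flipped": False, "iterations": max_iterations, "statement": current}
```

In the study, the rewriting step corresponded to either a human or a GPT-4o prompt, and an attack counts as successful if the prediction flips within 10 rewrites; the sketch above mirrors that setup under the stated assumptions.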