
Representation Learning and Causality: Theory, Practice, and Implications for Mechanistic Interpretability

Florent Draye — Hector Fellow Bernhard Schölkopf

My research interests lie at the intersection of representation learning, causality, and generative models. I aim to contribute to the development of methods that extract informative and interpretable features from high-dimensional datasets, with a focus on uncovering high-level causally related factors that describe meaningful semantics of the data. This, in turn, can help us gain deeper insights into the representations found within advanced generative models, particularly foundation models and LLMs, with the goal of improving their efficiency and safety.

Deep neural networks have demonstrated significant success across various tasks due to their ability to learn meaningful features from complex, high-dimensional data. However, their heavy reliance on large amounts of labelled data restricts their applicability in unsupervised learning scenarios. Representation learning aims to transform these high-dimensional datasets into lower-dimensional representations without supervision. By identifying the right space in which to perform reasoning and computation, it also enables the discovery of interpretable patterns and features within the data. In my PhD, I am working at the intersection of representation learning and causality, researching how to learn data representations that are both interpretable and controllable, with a focus on identifying the appropriate high-level abstractions of the data on which to model causal structures.
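
To make the idea of unsupervised representation learning concrete, here is a minimal sketch in PyTorch. It is purely illustrative, not the project's actual method: the Autoencoder class, the latent_dim of 16, and the random stand-in data are assumptions chosen only to show how a high-dimensional input can be compressed into a low-dimensional representation without any labels.

```python
# Minimal, illustrative sketch (assumed setup, not the project's method):
# an autoencoder that learns a low-dimensional representation of
# high-dimensional data using only a reconstruction loss, i.e. no labels.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim: int = 784, latent_dim: int = 16):
        super().__init__()
        # Encoder: high-dimensional input -> low-dimensional representation
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: representation -> reconstruction of the input
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)           # learned representation
        return self.decoder(z), z     # reconstruction and code

model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Unsupervised objective: reconstruct the input; no labels are involved.
x = torch.randn(64, 784)              # stand-in batch of high-dimensional data
optimizer.zero_grad()
x_hat, z = model(x)
loss = nn.functional.mse_loss(x_hat, x)
loss.backward()
optimizer.step()
```

The learned code z is the kind of lower-dimensional representation referred to above; causal representation learning additionally asks that its coordinates correspond to high-level, causally meaningful factors rather than arbitrary features.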

In parallel, foundation models such as large language models (LLMs) have recently gained significant attention. These neural networks, with billions of parameters and trained on massive datasets, are often considered "black boxes" because we lack an understanding of their underlying computational principles. Gaining a deeper understanding of the algorithms implemented by these networks is essential, not only for advancing scientific knowledge but also for improving their efficiency and safety. To address this challenge, I am integrating concepts from causality and representation learning to build a mechanistic understanding of the high-dimensional data representations found within these advanced generative models.
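
As a minimal sketch of what examining these internal representations can look like in practice (assuming the Hugging Face transformers library and the small public gpt2 checkpoint purely for illustration; this is not the project's actual methodology), one can read out a language model's per-layer hidden activations, which interpretability methods then attempt to decompose into meaningful features:

```python
# Illustrative sketch (assumed setup): extracting the hidden representations
# of a pretrained language model, a common starting point for
# mechanistic-interpretability analyses such as probing or dictionary learning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small public model, used here only as an example
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tokenizer("Causality helps explain learned representations.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# One activation tensor per layer: (batch, sequence_length, hidden_size).
hidden_states = outputs.hidden_states
print(len(hidden_states), hidden_states[-1].shape)

# These per-layer activations are the raw material that interpretability
# methods try to decompose into interpretable and, ideally, causally
# manipulable features.
```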


Part 1: Representation learning / causal representation learning. Part 2: Building a mechanistic understanding of the high-dimensional data representations found in foundation models.

Florent Draye


Max Planck Institute for Intelligent Systems

Supervised by Prof. Dr. Bernhard Schölkopf

Hector Fellow since 2018
Disciplines: Informatics, Physics & Mathematics