Grounding and reasoning over space and time in Vision-Language Models (VLM)


Thesis topic details

General information

Organisation

The French Alternative Energies and Atomic Energy Commission (CEA) is a key player in research, development and innovation in four main areas:
• defence and security,
• nuclear energy (fission and fusion),
• technological research for industry,
• fundamental research in the physical sciences and life sciences.

Drawing on its widely acknowledged expertise and its 16,000 technicians, engineers, researchers and staff, the CEA actively participates in collaborative projects with a large number of academic and industrial partners.

The CEA operates ten centers spread throughout France.

Reference

SL-DRT-25-0901  

Direction

DRT

Thesis topic details

Category

Technological challenges

Thesis topics

Grounding and reasoning over space and time in Vision-Language Models (VLM)

Contract

Thesis

Job description

Recent Vision-Language Models (VLMs) such as BLIP, LLaVA, and Qwen-VL have achieved impressive results on multimodal tasks, but they remain limited in genuine spatial and temporal reasoning. Many current benchmarks conflate visual reasoning with general knowledge and test only shallow reasoning. These models also struggle to understand complex spatial relations and dynamic scenes because they make suboptimal use of visual features. To address this, recent approaches such as SpatialRGPT, SpaceVLLM, VPD, and ST-VLM have introduced techniques like 3D scene graph integration, spatio-temporal queries, and kinematic instruction tuning to improve reasoning over space and time. This thesis proposes to build on these advances by developing new instruction-tuned models with improved data representations and architectural innovations. The goal is to enable robust spatio-temporal reasoning for applications in robotics, video analysis, and dynamic environment understanding.

University / doctoral school

Sciences et Technologies de l’Information et de la Communication (STIC)
Paris-Saclay

Thesis topic location

Site

Saclay

Requester

Position start date

01/10/2025

Person to be contacted by the applicant

TUO Aboubacar aboubacar.tuo@cea.fr
CEA
DRT/DIASI//LVA
CEA-Saclay, BP 28, GIF-SUR-YVETTE CEDEX, ESSONNE 91191, France
0656802188

Tutor / Responsible thesis director

LOESCH Angélique angelique.loesch@cea.fr
CEA
DRT/DIASI//LVA
CEA-Saclay, BP 28, GIF-SUR-YVETTE CEDEX, ESSONNE 91191, France

More information


https://kalisteo.cea.fr/
https://scholar.google.com/citations?user=5fE1oWwAAAAJ&hl=en