Thomas Serre

Associate Professor
(401) 863-1148
Office Location: Metcalf 343
Research Focus: Computational models of biological and machine vision


Dr. Serre received a PhD in computational neuroscience from the Massachusetts Institute of Technology (MIT) in 2006 and a master's degree in EECS from the École Nationale Supérieure des Télécommunications de Bretagne (Brest, France) in 2000. His research focuses on understanding the brain mechanisms underlying the recognition of objects and complex visual scenes, using a combination of behavioral, imaging, and physiological techniques. These experiments fuel the development of quantitative computational models that try not only to mimic the processing of visual information in the cortex but also to match human performance in complex visual tasks.

Together with Tomaso Poggio and colleagues at MIT, he developed a large-scale computational model of visual recognition in cortex. This research was featured in the BBC series "Visions from the Future" and appeared in several news articles (The Economist, New Scientist, Scientific American, IEEE Computing in Science and Technology, Technology Review, and EyeNet) and a post on Slashdot.

Research Summary:

Automated monitoring and analysis of rodent behavior

Neurobehavioral analysis of mouse phenotypes requires monitoring mouse behavior over long periods of time. We are currently developing trainable computer vision systems that enable the automated analysis of complex mouse behaviors.

HMDB: A large-scale Human Motion DataBase

We collected the largest action video database to date, with 51 action categories and around 7,000 manually annotated clips extracted from a variety of sources, ranging from digitized movies to YouTube videos.

Research Interests:

Most of the work in visual neuroscience has focused on the brain mechanisms underlying the rapid recognition of simple visual scenes using artificial, static, and isolated stimuli. However, our visual world is both highly dynamic and complex, with typical visual scenes consisting of many objects embedded in background clutter. The result: our visual cortex must process noisy and ambiguous perceptual measurements. The success of everyday vision implies powerful neural mechanisms, yet to be understood, for combining bottom-up, sensory-driven information with top-down, attention- and memory-driven processes to help resolve visual ambiguities and discount irrelevant clutter.

To help realize this goal, I played a leading role, as a graduate student and postdoc at MIT, in the development of a large-scale, neurophysiologically accurate computational model of visual processing in the primate cortex. The computer model emulates the main information processing steps across the entire cortical visual pathway and bridges the gap between multiple levels of understanding: this system-level model is consistent with physiological data from several cortical areas of the ventral visual pathway in primates, as well as with behavioral data from rapid categorization tasks with natural images. These findings suggest that bottom-up, sensory-driven processes may provide a satisfactory description of the very first pass of information through the visual cortex. Recent extensions of the model to the processing of motion information and attentional mechanisms are already impacting brain science, with researchers at Brown and MIT using the model as a guide for designing new monkey electrophysiology experiments.
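The core idea of this class of model is a feedforward hierarchy that alternates template-matching ("simple-cell-like") layers with max-pooling ("complex-cell-like") layers, trading selectivity for invariance at each stage. The following is a minimal, hypothetical sketch of that alternation in NumPy; the Gaussian tuning function, pooling size, and layer shapes are illustrative assumptions, not the published model's parameters.

```python
import numpy as np

def s_layer(image, templates):
    """Template-matching stage (simple-cell-like): each unit responds to how
    well a local image patch matches a stored template (Gaussian tuning).
    Illustrative only; the actual model uses banks of Gabor filters at S1."""
    k, h, w = templates.shape
    H, W = image.shape
    out = np.zeros((k, H - h + 1, W - w + 1))
    for t in range(k):
        for i in range(H - h + 1):
            for j in range(W - w + 1):
                patch = image[i:i + h, j:j + w]
                # Response peaks when the patch equals the template.
                out[t, i, j] = np.exp(-np.sum((patch - templates[t]) ** 2))
    return out

def c_layer(maps, pool=2):
    """Pooling stage (complex-cell-like): take the max over a local spatial
    neighborhood, building tolerance to small shifts of the stimulus."""
    k, H, W = maps.shape
    Hp, Wp = H // pool, W // pool
    out = np.zeros((k, Hp, Wp))
    for i in range(Hp):
        for j in range(Wp):
            out[:, i, j] = maps[:, i * pool:(i + 1) * pool,
                                j * pool:(j + 1) * pool].max(axis=(1, 2))
    return out

# Toy forward pass: S1 -> C1 -> global max gives a shift-tolerant feature vector.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
templates = rng.random((4, 3, 3))
s1 = s_layer(image, templates)      # shape (4, 14, 14)
c1 = c_layer(s1, pool=2)            # shape (4, 7, 7)
features = c1.max(axis=(1, 2))      # one invariant response per template
```

Stacking further S/C pairs on top of `c1` (with templates learned from natural images) yields the progressively more selective and more invariant units the paragraph above describes.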

While the model is still a fairly incomplete account of vision, we found that it performs on par with, or better than, state-of-the-art computer vision systems for the recognition of objects in street scene images as well as human actions in videos. These findings have generated strong interest from the scientific community, with over 2,000 downloads of the model's source code and extensive coverage in the popular press, including the BBC and Scientific American. I believe that this bio-inspired approach to computer vision will soon have a significant influence in other areas of the physical and life sciences. For instance, we very recently developed an initial high-throughput system for the automated monitoring and analysis of rodent behavior, which performs on par with human annotators in scoring videos of typical mouse behaviors. Rodent labs at Brown, Harvard, MIT, and the Broad Institute are already using the system for large-scale behavioral phenotyping studies.