PROVIDENCE, R.I. [Brown University] — With the goal of continuing his studies of computer vision all the way through to a Ph.D., Brown undergraduate Gary Chien is already a veteran of research. But for all the time he’s spent during the school year writing code to help computers interpret images, summer still provides a unique opportunity.
“It’s different during the summer because there are no classes,” said Chien, a rising senior and computer science concentrator from Florence, S.C. “I can dedicate as much time as I want. If I run into a problem, I can just sit down and work on it throughout the day without worrying about studying for an exam.”
Chien has been putting that time to exceptional use. This summer, with an Undergraduate Teaching and Research Award, he’s learned TensorFlow, a Google software library for building “deep-learning” artificial neural networks. He’s applying this new capability to enhance an already innovative lab at Brown for the study of developmental psychology: the Smart Playroom.
The playroom is where Dima Amso, associate professor of cognitive, linguistic and psychological sciences (CLPS), studies attention and memory in young children, both those developing typically and atypically. Amso’s research team makes observations about how children behave in the fun and friendly experimental setting, with toys and activities arrayed around the room.
The National Institutes of Health-supported lab is a joint venture with computer vision expert and CLPS Associate Professor Thomas Serre. The lab’s “smartness” comes from the unimposing but ubiquitous technology that makes rigorous, quantifiable observation easier. Cameras and sensors such as the Xbox Kinect can record and track the position — and even the facial expressions — of children as they play.
The lab also records all the activity from the child’s point of view with a head-mounted camera. That’s where Chien’s summer work comes in. His assignment, working in Serre’s research group, is to make computers capable of automatically annotating all that child’s point-of-view footage.
“Annotating the videos by hand takes a really long time,” Chien said. “The purpose is to train the network to do most of the annotation for us.”
Using TensorFlow, deep-learning networks and his ingenuity, Chien is building software that can essentially be taught by example to accurately produce annotations such as “subject looks at toy for 3 seconds” or “subject reaches for stuffed animal.”
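The approach described above — training a network on hand-labeled frames so it can annotate new footage — can be sketched in TensorFlow’s Keras API. This is a minimal illustration, not the lab’s actual code: the frame size, the annotation categories and the tiny network architecture are all assumptions for demonstration, and random arrays stand in for real annotated video frames.

```python
# Minimal sketch of supervised frame annotation (illustrative only).
import numpy as np
import tensorflow as tf

# Hypothetical annotation categories a labeler might assign to each frame.
LABELS = ["looks_at_toy", "reaches_for_toy", "other"]

def build_annotator(height=64, width=64):
    """A small convolutional network mapping one video frame to a label."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(height, width, 3)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(len(LABELS), activation="softmax"),
    ])

model = build_annotator()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Hand-annotated frames would supply the real training examples;
# random stand-ins here just show the shapes involved.
frames = np.random.rand(8, 64, 64, 3).astype("float32")
labels = np.random.randint(0, len(LABELS), size=8)
model.fit(frames, labels, epochs=1, verbose=0)

# Predict an annotation for a new frame: the network outputs a
# probability for each category, and the most likely one is reported.
probs = model.predict(frames[:1], verbose=0)
print(LABELS[int(np.argmax(probs))])
```

The point of the sketch is the workflow, not the architecture: frames annotated by hand become training pairs, and once trained, the model can label the remaining footage automatically — which is exactly why Chien still annotates some videos manually.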
The technical difficulties are many. For example, no two children approach the same toys at exactly the same angle as they go about their treasure hunts in the lab. They are different heights, move at different speeds, and take different routes. A computer has to be able to “understand” where the toy is in the child’s field of view and when the child is looking at it, even when no two views are exactly alike.
It’s the ideal challenge for Chien, whose prior projects with Serre have included training a computer to track the entrances and exits of a bird in a birdhouse and aiding in the Serre lab’s efforts to automate observations of research mice in their cages.
Chien works about seven hours a day on the project, starting at 10 a.m. Sometimes he’s in the playroom, helping to record data. Sometimes he’s downstairs in the lab working on algorithms. To teach his software, Chien has to annotate videos by hand.
“It’s basically a full-time job,” he said.
That kind of commitment is no problem, however, for a student who wants to dedicate years more research to making technology better.
“It’s kind of cliché, but my goal is to contribute something to the world,” Chien said. “I just find vision really fascinating. I really love working with it.”
So this summer at Brown, Chien is doing what he loves full time.