View-invariant object category learning, recognition, and search: how spatial and object attention are coordinated using surface-based attentional shrouds

Arash Fazl; Stephen Grossberg; Ennio Mingolla

doi:10.1016/j.cogpsych.2008.05.001

Cogn Psychol. 2009 Feb;58(1):1-48. doi: 10.1016/j.cogpsych.2008.05.001. Epub 2008 Jul 23.

Authors

Arash Fazl¹, Stephen Grossberg, Ennio Mingolla

Affiliation

¹ Department of Cognitive and Neural Systems, Center for Adaptive Systems and Center of Excellence for Learning in Education, Science, and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA. steve@cns.bu.edu

PMID: 18653176

Abstract

How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified mechanistic explanation of how spatial and object attention work together to search a scene and learn what is in it. The ARTSCAN model predicts how an object's surface representation generates a form-fitting distribution of spatial attention, or "attentional shroud". All surface representations dynamically compete for spatial attention to form a shroud. The winning shroud persists during active scanning of the object. The shroud maintains sustained activity of an emerging view-invariant category representation while multiple view-specific category representations are learned and are linked through associative learning to the view-invariant object category. The shroud also helps to restrict scanning eye movements to salient features on the attended object. Object attention plays a role in controlling and stabilizing the learning of view-specific object categories. Spatial attention hereby coordinates the deployment of object attention during object category learning. Shroud collapse releases a reset signal that inhibits the active view-invariant category in the What cortical processing stream. Then a new shroud, corresponding to a different object, forms in the Where cortical processing stream, and search using attention shifts and eye movements continues to learn new objects throughout a scene. The model mechanistically clarifies basic properties of attention shifts (engage, move, disengage) and inhibition of return. It simulates human reaction time data about object-based spatial attention shifts, and learns with 98.1% accuracy and a compression of 430 on a letter database whose letters vary in size, position, and orientation. The model provides a powerful framework for unifying many data about spatial and object attention, and their interactions during perception, cognition, and action.
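
The attend, learn, and reset cycle described in the abstract can be illustrated with a deliberately simplified sketch. This is not the ARTSCAN model, which is defined by continuous neural dynamics; it is a discrete toy loop under stated assumptions. The names `Surface`, `scan_scene`, `salience`, and `ior_penalty` are hypothetical, each surface's activity is reduced to one scalar in [0, 1], competition is replaced by a simple maximum, and the shroud is assumed to collapse once all of a surface's listed views have been fixated.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    name: str
    salience: float  # hypothetical scalar standing in for surface activity, assumed in [0, 1]
    views: list      # hypothetical view labels encountered at successive fixations


def scan_scene(surfaces, ior_penalty=1.0):
    """Toy attend-learn-reset loop.

    Returns (invariant_categories, order), where invariant_categories maps
    each object name to the view categories linked to it, and order is the
    sequence in which objects won spatial attention.
    """
    inhibition = {s.name: 0.0 for s in surfaces}  # inhibition of return per surface
    invariant_categories = {}
    order = []
    for _ in surfaces:
        # Competition for spatial attention: the most active surface,
        # after subtracting inhibition of return, wins the shroud.
        winner = max(surfaces, key=lambda s: s.salience - inhibition[s.name])
        order.append(winner.name)
        # While the shroud persists, every newly seen view is linked to the
        # same view-invariant category; repeated views add nothing new.
        linked = invariant_categories.setdefault(winner.name, [])
        for view in winner.views:
            if view not in linked:
                linked.append(view)
        # Shroud collapse (assumed here to follow the last listed view):
        # the reset ends learning for this object, and the surface is
        # inhibited so that attention moves on to a different object.
        inhibition[winner.name] += ior_penalty
    return invariant_categories, order


scene = [
    Surface("cup", 0.6, ["front", "side"]),
    Surface("book", 0.9, ["top", "top", "tilted"]),
    Surface("pen", 0.3, ["side"]),
]
categories, order = scan_scene(scene)
print(order)
print(categories)
```

With saliences in [0, 1], the default `ior_penalty` of 1.0 is large enough that a visited surface cannot win again within one pass, which is how this toy stands in for inhibition of return. Views are linked only while their own surface holds the shroud, so views from different objects never end up in the same category.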

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Review

MeSH terms

Association Learning / physiology*
Attention / physiology*
Humans
Models, Psychological
Neural Networks, Computer*
Pattern Recognition, Visual / physiology*
Saccades*
Space Perception / physiology*
Visual Cortex / physiology