The human superior temporal sulcus (STS) is considered a hub for social perception and cognition, including the perception of faces and human motion, as well as understanding others’ actions, mental states, and language. However, the functional organization of the STS remains debated: is this broad region composed of multiple functionally distinct modules, each specialized for a different process, or are STS subregions multifunctional, contributing to multiple processes? Is the STS spatially organized, and if so, what are the dominant features of this organization?
The human superior temporal sulcus (STS) has been implicated in a broad range of social perceptual and cognitive processes, including the perception of faces, biological motion, and vocal sounds, and the understanding of language and mental states. However, little is known about the overall functional organization of these responses. Does the STS contain distinct, specialized regions for processing different types of social information? Or is cortex in the STS largely multifunctional, with each region engaged in multiple different computations? Because prior work has largely studied these processes independently, this question remains unanswered. Here, we first identify distinct functional subregions of the STS, and then examine their response to a broad range of social stimuli.
We scanned twenty human participants using fMRI, comparing responses across tasks that tapped different social perceptual and cognitive processes: mental state understanding (termed Theory of Mind or ToM; reading stories about false beliefs vs false photos), face perception (viewing dynamic faces vs objects), biological motion perception (viewing point-light displays depicting moving humans vs objects), and voice perception (vocal sounds vs nonvocal environmental sounds). We additionally included an auditory story task with four conditions that allowed for ToM, language, and voice contrasts: stories about mental states, stories about physical events, nonword lists, and music. For each participant and task, we identified the maximally responsive STS subregion and then assessed that region's response profile across all conditions, using independent data.
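The region-definition step described above (select the most responsive voxels for a contrast, then measure responses in data not used for selection) can be illustrated with a minimal sketch. This is not the study's actual pipeline: the leave-one-run-out scheme, the `top_frac` threshold, the array layout, and the function name `froi_profile` are all assumptions made for illustration.

```python
import numpy as np

def froi_profile(contrast, responses, mask, top_frac=0.1):
    """Leave-one-run-out functional ROI analysis for one participant.

    Hypothetical inputs (assumed layout, not taken from the study):
      contrast:  (n_runs, n_voxels) localizer contrast value per run
      responses: (n_runs, n_conditions, n_voxels) response estimate per run
      mask:      (n_voxels,) boolean search space, e.g. an STS mask
    Returns a (n_conditions,) response profile measured on held-out runs.
    """
    n_runs = contrast.shape[0]
    mask_idx = np.flatnonzero(mask)
    # Assumed selection rule: keep the top fraction of voxels in the mask.
    n_select = max(1, int(round(top_frac * mask_idx.size)))
    profiles = []
    for held_out in range(n_runs):
        train = np.delete(np.arange(n_runs), held_out)
        # Define the region from the training runs only.
        train_map = contrast[train].mean(axis=0)[mask_idx]
        top = mask_idx[np.argsort(train_map)[-n_select:]]
        # Measure every condition in the held-out run.
        profiles.append(responses[held_out][:, top].mean(axis=1))
    # Average the held-out profiles across folds.
    return np.mean(profiles, axis=0)
```

Keeping selection and measurement in separate runs means that a strong response to the defining contrast in the held-out data cannot be produced by the selection step itself.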
Results point to several functionally specific STS subregions: a region in the right temporo-parietal junction that responded specifically to abstract mental state content, a region of posterior STS that responded specifically to dynamic human motion, and a region of middle STS that responded specifically to vocal sounds. The face-responsive region of posterior STS also responded strongly to vocal sounds, indicating that this region is not face-specific, as it has previously been described, and should instead be reconceptualized as an audiovisual region that processes dynamic auditory and visual information from others’ faces. These results provide a foundational understanding of the functional organization of the superior temporal sulcus, and pave the way for subsequent studies that will probe the functional role of STS subregions in further detail.