Human-Centered Behavioral Signal Processing (BSP)

Research Introduction
My research background and interests center on modeling and quantifying human interaction and behavioral dynamics, an essential aspect of multiple human-centered research domains. Figure 1 (left) shows a schematic of the standard approach to studying widely encountered dyadic interactions in behavioral science, and Figure 1 (right) depicts the collaborative approach of studying dyadic (i.e., two-person) interaction using behavioral signal processing (BSP). My specific research contributions have focused on devising computational methods for modeling turn-taking dynamics, affective states, and behavioral coupling dynamics.


My research at USC addresses three major aspects of interpersonal interaction: conversational processes, affective dynamics, and mental health applications. Notable contributions include reliable human emotion recognition systems inspired by domain knowledge (some of which formed part of the winning entry in the first Interspeech Emotion Challenge in 2009) and behavioral informatics for both fundamental and applied research on the interactions of distressed couples undergoing therapy.

Sample Research Works (updated November 2012)
Research in Conversation Analysis
Turn-taking is a collaborative process in human conversation. Both the analysis of human conversations and the design of natural dialog interfaces can benefit from modeling this process quantitatively. Interruption is often considered a perturbation of the smooth turn-taking structure and is a frequent region of interest for behavioral analysis; for example, the nature of an interruption can signal dominance or an absence of social engagement. My research has offered a model of expressed multimodal behaviors (hand gestures and vocal cues) comparing disruptive and cooperative overlaps in conversations. Another novel contribution is a predictive model of interruptions in dialog that directly models both interlocutors' behaviors; results from this model offer insight into the cognitive planning expressed in human behavior just prior to a turn-taking event. A rough sketch of this framing appears below.
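As a minimal illustration of the general framing (not the published model), interruption prediction can be cast as binary classification over features drawn from both interlocutors in the window just before a potential turn boundary. The feature names and data below are hypothetical placeholders.

```python
# Hypothetical sketch: predict whether an upcoming turn exchange will be an
# interruption, using features from BOTH interlocutors in the window just
# before the turn boundary. Feature layout and data are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Each row: [speaker_pitch_slope, speaker_energy, listener_pitch_slope,
#            listener_energy, listener_gesture_rate] over a pre-boundary window.
X = rng.normal(size=(200, 5))        # stand-in for real multimodal features
y = rng.integers(0, 2, size=200)     # 1 = interruption, 0 = smooth exchange

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```

The key design point is that the feature vector concatenates cues from both parties, so the classifier can exploit pre-boundary behavior of the listener as well as the current speaker.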

Research in Affective Computing
Affective computing is a rapidly advancing domain of human-centered engineering that aims to enable systems to sense and predict human affective states. My contributions include a novel computational framework with a hierarchical structure inspired by a perceptual theory of human affective states; the approach won first place in the Interspeech 2009 Emotion Challenge (Classifier Sub-Challenge). Another contribution is the direct modeling of coupling in the interlocutors' emotional states using a Dynamic Bayesian Network, improving overall emotion recognition as the system decodes an entire dialog. In addition to contributing to a number of publications on robust and cross-corpus emotion recognition, my work has become a key foundational block for several studies of human behavior.
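To make the hierarchical idea concrete, here is a minimal two-stage sketch: a coarse classifier first separates broad affective classes, and second-stage classifiers refine within each branch. This illustrates only the general structure, not the published challenge system; the labels, features, and class grouping are synthetic assumptions.

```python
# Minimal sketch of hierarchical emotion classification: a coarse stage
# separates broad affective classes, then per-branch classifiers refine.
# Synthetic data; structure is illustrative, not the published system.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))          # stand-in acoustic features
fine = rng.integers(0, 4, size=300)     # 0=anger, 1=sadness, 2=happiness, 3=neutral
coarse = (fine <= 1).astype(int)        # 1 = negative, 0 = non-negative

stage1 = SVC().fit(X, coarse)                               # negative vs. non-negative
stage2_neg = SVC().fit(X[coarse == 1], fine[coarse == 1])   # anger vs. sadness
stage2_pos = SVC().fit(X[coarse == 0], fine[coarse == 0])   # happiness vs. neutral

def predict(x):
    """Route a sample down the hierarchy: coarse decision, then refinement."""
    x = x.reshape(1, -1)
    branch = stage1.predict(x)[0]
    return (stage2_neg if branch == 1 else stage2_pos).predict(x)[0]

print(predict(X[0]))
```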

I have also contributed to novel resource development in support of affective computing R&D, addressing a significant community need. In addition to helping create the publicly available IEMOCAP corpus, I contributed to the collection of the USC CreativeIT corpus, a collaborative project between engineers and the School of Theatre to systematically study multimodal expressive behaviors and the quality of actors' improvisation techniques. I held a primary role in designing and collecting this corpus of rich affective interactions from actors' improvisations, recorded with a full-body motion capture system along with other smart-room sensing technologies. These freely available corpora have enabled new directions in multimodal affective computing.

Research in Mental Health Applications: Couple Therapy
One of the perennial challenges in the behavioral sciences is quantifying the complex, often subtle, interplay between interlocutors. My research has focused on modeling an essential phenomenon, vocal entrainment: the naturally occurring synchrony and matching of vocal behaviors between an interacting dyad. Entrainment has long been established as an important attribute with numerous theoretical implications, yet mechanisms for modeling it have been lacking, which has likely hindered quantitative study of this interaction phenomenon.

I have focused on quantifying vocal entrainment by adopting and expanding synchrony measures for human conversation, drawing inspiration from domains such as economics, physics, and bio-signal analysis. More importantly, I have proposed a computational similarity framework based on an abstract subspace representation of vocal characteristics, addressing challenges inherent in asynchronous turn-taking structures and the multivariate nature of acoustic cues. The approach also introduces a novel way of capturing the directionality of entrainment. This framework not only quantifies vocal entrainment and supplies features for affect recognition tasks, but has also advanced knowledge in mental health research and practice, where behavioral interplay is a key component in analyzing and designing intervention strategies. A sketch of the subspace idea follows.
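As a hedged sketch of the subspace idea (the dimensions, features, and similarity score below are illustrative assumptions, not the published measure): each speaker's turn is summarized by a low-dimensional PCA subspace of its frame-level vocal features, and similarity is derived from the principal angles between the two subspaces.

```python
# Sketch of a subspace-based similarity measure for vocal entrainment:
# summarize each turn by a PCA subspace of frame-level vocal features,
# then compare subspaces via principal angles. Illustrative parameters.
import numpy as np
from scipy.linalg import subspace_angles
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
turn_a = rng.normal(size=(120, 6))   # frames x vocal features (pitch, energy, ...)
turn_b = rng.normal(size=(150, 6))   # the other interlocutor's adjacent turn

def turn_subspace(frames, n_components=3):
    """Orthonormal basis spanning the top principal directions of a turn."""
    pca = PCA(n_components=n_components).fit(frames)
    return pca.components_.T          # shape: (n_features, n_components)

angles = subspace_angles(turn_subspace(turn_a), turn_subspace(turn_b))
similarity = np.mean(np.cos(angles) ** 2)   # 1 = aligned subspaces, 0 = orthogonal
print(f"entrainment-style similarity: {similarity:.3f}")
```

Directionality could, under the same assumptions, be probed by projecting one speaker's frames onto the other's subspace and comparing reconstruction quality in each direction.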

In collaboration with psychologists in family studies (marital therapy), we have used this method to provide quantitative evidence that directional vocal matching, a form of behavioral influence in dialogs, is strongly associated with a dysfunctional cyclical behavior, demand-withdraw polarization, validated on a large sample of severely distressed couples. This highlights the utility of new computational methods in providing insights back to domain experts.
