While there have been great advances in quantification of the genotype of organisms, including full genomes for many species, the quantification of phenotype is at a comparatively primitive stage. Part of the reason is technical difficulty: the phenotype covers a wide range of characteristics, ranging from static morphological features, to dynamic behavior. The latter poses challenges that are in the area of multimedia signal processing. Automated analysis of video and audio recordings of animal and human behavior is a growing area of research, ranging from the behavioral phenotyping of genetically modified mice or drosophila to the study of song learning in birds and speech acquisition in human infants. This paper reviews recent advances and identifies key problems for a range of behavior experiments that use audio and video recording. This research area offers both research challenges and an application domain for advanced multimedia signal processing. There are a number of MMSP tools that now exist which are directly relevant for behavioral quantification, such as speech recognition, video analysis and more recently, wired and wireless sensor networks for surveillance. The research challenge is to adapt these tools and to develop new ones required for studying human and animal behavior in a high throughput manner while minimizing human intervention. In contrast with consumer applications, in the research arena there is less of a penalty for computational complexity, so that algorithmic quality can be maximized through the utilization of larger computational resources that are available to the biomedical researcher.