Jeffrey Scott - Ph.D. Research Proposal
Title: Automated Multi-Track Mixing and Analysis of Instrumental Mixtures
Advisor: Dr. Youngmoo Kim
Access to hardware and software tools for producing music has become commonplace in the digital landscape. While the means to produce music are widely available, attaining professional results demands a significant investment of time. Mixing multi-channel audio requires techniques and training far beyond the knowledge of the average music software user, and achieving balance and clarity in a mixture comprising many instrument layers requires experience in evaluating and modifying both the individual elements and their sum. Creating a mix involves many technical concerns (level balancing, dynamic range control, stereo panning, spectral balance) as well as artistic decisions (modulation effects, distortion effects, side-chaining, etc.).

This work proposes methods to model the relationships among a set of multi-channel audio tracks based on short-time spectro-temporal characteristics and long-term dynamics. The goal is to create a parameterized space based on high-level perceptual cues to drive processing decisions in a multi-track audio setting. To this end, I propose a means of grouping tracks with similar spectro-temporal characteristics using a learned basis approach and a model of the human auditory system. Critical band filtering and basis decomposition via Probabilistic Latent Component Analysis (PLCA) will provide the acoustic feature front end. Mid-level representations will be developed using state space models and mixtures of state space models, and unsupervised clustering with Gaussian Mixture Models (GMMs) will inform groupings based on these mid-level representations. The perceptual salience of the resulting groupings will be verified through structured listening tests, and the results of those tests will in turn inform refinements to the modeling procedure.
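As a rough illustration of the critical band filtering stage, the sketch below builds a triangular filterbank spaced on the Bark scale and applies it to an FFT magnitude spectrum. This is only one common formulation; the function names, the particular Bark approximation, and the band count are illustrative choices, not the proposal's actual front end.

```python
import numpy as np

def bark_filterbank(n_fft, sr, n_bands=24):
    """Triangular filterbank on the Bark (critical band) scale that maps an
    FFT magnitude spectrum of length n_fft//2 + 1 to n_bands band energies."""
    def hz_to_bark(f):
        # One common analytic approximation of the Bark scale.
        return 6.0 * np.arcsinh(np.asarray(f, dtype=float) / 600.0)

    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bark = hz_to_bark(freqs)
    # Band edges spaced uniformly in Bark, so bandwidth grows with frequency.
    edges = np.linspace(0.0, hz_to_bark(sr / 2.0), n_bands + 2)
    fb = np.zeros((n_bands, len(freqs)))
    for i in range(n_bands):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bark - lo) / (mid - lo)
        falling = (hi - bark) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb  # shape: (n_bands, n_fft // 2 + 1)
```

Multiplying this matrix with a short-time magnitude spectrogram yields a critical-band representation whose resolution roughly follows the frequency selectivity of the auditory periphery.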
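To make the basis decomposition step concrete, the following is a minimal EM sketch of PLCA on a nonnegative (e.g., critical-band) spectrogram, factoring the normalized time-frequency distribution as P(f, t) = sum_z P(z) P(f|z) P(t|z). It is a bare-bones illustration of the technique, not the proposal's implementation; the function name and iteration count are arbitrary.

```python
import numpy as np

def plca(V, n_components, n_iter=100, seed=0):
    """Decompose a nonnegative spectrogram V (freq x time) into component
    priors P(z), spectral bases P(f|z), and temporal activations P(t|z)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    V = V / V.sum()  # treat the spectrogram as a joint distribution P(f, t)
    Pz = np.full(n_components, 1.0 / n_components)
    Pf_z = rng.random((F, n_components)); Pf_z /= Pf_z.sum(axis=0)
    Pt_z = rng.random((T, n_components)); Pt_z /= Pt_z.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior P(z|f,t) proportional to P(z) P(f|z) P(t|z).
        joint = np.einsum('z,fz,tz->ftz', Pz, Pf_z, Pt_z)
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        # M-step: reweight the posterior by the observed distribution.
        W = V[:, :, None] * joint           # estimate of P(f, t, z)
        Pz = W.sum(axis=(0, 1))
        Pf_z = W.sum(axis=1) / (Pz + 1e-12)
        Pt_z = W.sum(axis=0) / (Pz + 1e-12)
    return Pz, Pf_z, Pt_z
```

The learned spectral bases P(f|z) play the role of the "learned basis" mentioned above, while the per-component activations P(t|z) summarize how each basis is used over time.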
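The unsupervised grouping step can be sketched as GMM clustering over per-track feature vectors, here using scikit-learn's `GaussianMixture`. How each track is summarized into a fixed-length vector is an open design question in the proposal; the helper below simply assumes such vectors exist, and its name and parameters are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def group_tracks(track_features, n_groups, seed=0):
    """Cluster per-track feature vectors (one row per track, e.g. summarized
    mid-level representations) into groups of similar-sounding tracks.
    Returns hard group labels and soft posterior memberships."""
    X = np.asarray(track_features, dtype=float)
    gmm = GaussianMixture(n_components=n_groups,
                          covariance_type='full',
                          random_state=seed)
    labels = gmm.fit_predict(X)
    return labels, gmm.predict_proba(X)
```

The soft memberships are worth keeping alongside the hard labels: a track that sits between two groups (e.g., a vocal doubling a lead instrument) can then be treated ambiguously rather than forced into one cluster, which is the kind of distinction the structured listening tests are meant to probe.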