Michael McAuliffe

Computational phonetician

I am currently a postdoctoral research fellow at McGill University working primarily with Morgan Sonderegger and Michael Wagner. I received my PhD from the University of British Columbia. My thesis was on lexically-guided perceptual learning and attention (available here, code available here). Much of my computational research on GitHub deals with speech sounds in spontaneous and laboratory speech, and with the ways that signal processing techniques can represent those sounds.

Primary projects

Speech Corpus Tools

Speech Corpus Tools is a graphical application for interacting with, querying, and visualizing large speech corpora. It parses a wide range of formats into a database, which allows for fast and consistent queries across corpora from different sources. PolyglotDB is the package responsible for the storage and database aspects.

Montreal Forced Aligner

Montreal Forced Aligner is a command-line utility for performing forced alignment on audio datasets using orthographic transcriptions and a pronunciation dictionary. It can be trained on larger datasets and can align smaller datasets using pretrained models. It is built using Kaldi.

Python-acoustic-similarity

Python-acoustic-similarity represents most of my work in signal processing for creating MFCC, amplitude envelope, and gammatone representations of speech. Future versions will also include algorithms to calculate linguistically relevant measurements such as pitch and formants.

Phonological CorpusTools

Phonological CorpusTools provides Python implementations of algorithms reported in the linguistic literature and can run these algorithms on a wide variety of corpora. The primary contributors to this project are Kathleen Currie Hall (@kchall), Blake Allen (@bhallen), Michael Fry (@mdfry), Scott Mackie (@jsmackie), and myself.