
Journal Articles

Ying, J., Shaw, J., Proctor, M., Carignan, C., Derrick, D., & Best, C. (under revision). "Para-sagittal dynamics in lateral /l/ production: Three-dimensional electromagnetic articulography data from Australian English."

Frej, M. Y., Carignan, C., & Best, C. T. (under review). "The acoustics and articulation of initial gemination in Moroccan and Lebanese Arabic."

Carignan, C., Coretta, S., Frahm, J., Harrington, J., Hoole, P., Joseph, A., Kunay, E., & Voit, D. (forthcoming). "Planting the seed for sound change: evidence from real-time MRI of velum kinematics in German." Language.

Ying, J., Shaw, J., Proctor, M., Carignan, C., Derrick, D., & Best, C. (forthcoming). "Evidence for active control of tongue lateralization in Australian English /l/." Journal of Phonetics.

Carignan, C. (2021). "A practical method of estimating the time-varying degree of vowel nasalization from acoustic features." Journal of the Acoustical Society of America, 149(2), 911–922, DOI: 10.1121/10.0002925.
This paper presents a simple and easy-to-use method of creating a time-varying signal of the degree of nasalization in vowels, generated from acoustic features measured in oral and nasalized vowel contexts. The method is presented for separate models constructed using two sets of acoustic features: (1) an uninformed set of 13 Mel-frequency cepstral coefficients (MFCCs) and (2) a combination of the 13 MFCCs and a phonetically informed set of 20 acoustic features of vowel nasality derived from previous research. Both models are compared against two traditional approaches to estimating vowel nasalization from acoustics: A1-P0 and A1-P1, as well as their formant-compensated counterparts. Data include productions from six speakers of different language backgrounds, producing 11 different qualities within the vowel quadrilateral. The results generated from each of the methods are compared against nasometric measurements, representing an objective “ground truth” of the degree of nasalization. The results suggest that the proposed method is more robust than conventional acoustic approaches, generating signals which correlate strongly with nasometric measures across all vowel qualities and all speakers and accurately approximate the time-varying change in the degree of nasalization. Finally, an experimental example is provided to help researchers implement the method in their own study designs.

Carignan, C., & Egurtzegi, A. (2021). "Principal components variable importance reconstruction (PC-VIR): Exploring predictive importance in multicollinear acoustic speech data." arXiv:2102.04740 [stat.ME]
This paper presents a method of exploring the relative predictive importance of individual variables in multicollinear data sets at three levels of significance: strong importance, moderate importance, and no importance. Implementation of Bonferroni adjustment to control for Type I error in the method is described, and results with and without the correction are compared. An example of the method in binary logistic modeling is demonstrated by using a set of 20 acoustic features to discriminate vocalic nasality in the speech of six speakers of the Mixean variety of Low Navarrese Basque. Validation of the method is presented by comparing the direction of significant effects to those observed in separate logistic mixed effects models, as well as goodness of fit and prediction accuracy compared to partial least squares logistic regression. The results show that the proposed method yields: (1) similar, but more conservative estimates in comparison to separate logistic regression models, (2) models that fit data as well as partial least squares methods, and (3) predictions for new data that are as accurate as partial least squares methods.

Shaw, J., Carignan, C., Agostini, T., Mailhammer, R., Harvey, M., & Derrick, D. (2020). "Phonological contrast and phonetic variation: the case of velars in Iwaidja." Language, 96(3), 578–617.
A field-based ultrasound and acoustic study of Iwaidja, an endangered Australian Aboriginal language, investigated the phonetic identity of non-nasal velar consonants in intervocalic position, where past work had proposed a [+continuant] vs. [-continuant] phonemic contrast. We analyzed the putative contrast within a continuous phonetic space, defined by both acoustic and articulatory parameters, and found gradient variation from more consonantal realizations, e.g. [ɰ], to more vocalic realizations, e.g. [a]. The distribution of realizations across lexical items and speakers did not support the proposed phonemic contrast. This case illustrates how lenition that is both phonetically gradient and variable across speakers and words can give the illusion of a contextually restricted phonemic contrast.

Carignan, C., Hoole, P., Kunay, E., Pouplier, M., Joseph, A., Voit, D., Frahm, J., & Harrington, J. (2020). "Analyzing speech in both time and space: Generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI." Laboratory Phonology: Journal of the Association for Laboratory Phonology, 11(1): 2, 1–26, DOI: 10.5334/labphon.214.
We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech.

Egurtzegi, A., & Carignan, C. (2020). "An acoustic description of Mixean Basque." Journal of the Acoustical Society of America, 147(4), 2791–2802, DOI: 10.1121/10.0000996.
This paper presents an acoustic analysis of Mixean Low Navarrese, an endangered variety of Basque. The manuscript includes an overview of previous acoustic studies performed on different Basque varieties in order to synthesize the sparse acoustic descriptions of the language that are available. This synthesis serves as a basis for the acoustic analysis performed in the current study, in which the various acoustic analyses given in previous studies are replicated in a single, cohesive general acoustic description of Mixean Basque. The analyses include formant and duration measurements for the six-vowel system, voice onset time measurements for the three-way stop system, spectral center of gravity for the sibilants, and number of lingual contacts in the alveolar rhotic tap and trill. Important findings include: a centralized realization ([ʉ]) of the high-front rounded vowel usually described as /y/; a data-driven confirmation of the three-way laryngeal opposition in the stop system; evidence in support of an alveolo-palatal to apical sibilant merger; and the discovery of a possible incipient merger of rhotics. These results show how using experimental acoustic methods to study under-represented linguistic varieties can result in revelations of sound patterns otherwise undescribed in more commonly studied varieties of the same language.

Carignan, C. (2019). "A network-modeling approach to investigating individual differences in articulatory-to-acoustic relationship strategies." Speech Communication, 108, 1–14, DOI: 10.1016/j.specom.2019.01.007.
This study represents an exploratory analysis of a novel method of investigating variation among individual speakers with respect to the articulatory strategies used to modify acoustic characteristics of their speech. Articulatory data (nasalization, tongue height, breathiness) and acoustic data (F1 frequency) related to the distinction of three nasal-oral vowel contrasts in French were co-registered. Data were collected first from four Southern French (FR) speakers and, subsequently, from nine naïve Australian English listeners who imitated the FR productions. Articulatory measurements were mapped to F1 measurements using relative importance analysis (RIA), and the RIA coefficients were used to create similarity scores among all of the speakers. The similarity scores were then used to build network models for each nasal-oral vowel pair, using the spinglass algorithm to identify communities of shared articulatory-to-acoustic strategies within each network. The results show that network grouping is rarely based on language-dependent articulatory-to-acoustic strategies, but evidence of inter- and intra-speaker consistency is observed: Individual speakers tend to group together in their articulatory-to-acoustic strategies across vowel pairs, and most speakers have consistent articulatory-to-acoustic mappings across vowel pairs. Evidence is also observed which highlights the multi-dimensional nature of vowel nasality, rather than the uni-dimensional assumption of "nasal" vowels as merely oral vowels produced with a lowered velum.

Carignan, C. (2018a). "Using naïve listener imitations of native speaker productions to investigate mechanisms of listener-based sound change." Laboratory Phonology: Journal of the Association for Laboratory Phonology, 9(1): 18, 1–31, DOI: 10.5334/labphon.136.
This study was designed to test whether listener-based sound change—listener misperception (Ohala, 1981, 1993) and perceptual cue re-weighting (Beddor, 2009, 2012)—can be observed synchronically in a laboratory setting. Co-registered articulatory data (degree of nasalization, tongue height, breathiness) and acoustic data (F1 frequency) related to the productions of phonemic oral and nasal vowels of Southern French were first collected from four native speakers, and the acoustic recordings were subsequently presented to nine Australian English naïve listeners, who were instructed to imitate the native productions. During these imitations, similar articulatory and acoustic data were collected in order to compare the articulatory strategies used by the two groups. The results suggest that the imitators successfully reproduced the acoustic distinctions made by the native speakers, but that they did so using different articulatory strategies. The articulatory strategies for the vowel pair /ɑ̃/-/a/ suggest that listeners (at least partially) misperceived F1-lowering due to nasalization and breathiness as being due to tongue height. Additional evidence supports perceptual cue re-weighting, in that the naïve imitators used nasalance less, and tongue height more, in order to obtain the same F1 nasal-oral distinctions that the native speakers had originally produced.

Carignan, C. (2018b). "Using ultrasound and nasalance to separate oral and nasal contributions to formant frequencies of nasalized vowels." Journal of the Acoustical Society of America, 143(5), 2588–2601, DOI: 10.1121/1.5034760.
The experimental method described in this manuscript offers a possible means to address a well known issue in research on the independent effects of nasalization on vowel acoustics: given that the separate transfer functions associated with the oral and nasal cavities are merged in the acoustic signal, the task of teasing apart the respective effects of the two cavities seems to be an intractable problem. The proposed method uses ultrasound and nasalance to predict the effect of lingual configuration on formant frequencies of nasalized vowels, thus accounting for acoustic variation due to changing lingual posture and excluding its contribution to the acoustic signal. The results reveal that the independent effect of nasalization on the acoustic vowel quadrilateral resembles a counter-clockwise chain shift of nasal compared to non-nasal vowels. The results from the productions of 11 vowels by six speakers of different language backgrounds are compared to predictions presented in previous modeling studies, as well as discussed in the light of sound change of nasal vowel systems.

Derrick, D., Carignan, C., Chen, W.-R., Shujau, M., & Best, C. (2018). "Three-dimensional printable ultrasound transducer stabilization system." Journal of the Acoustical Society of America, 144(5), EL392–EL398, DOI: 10.1121/1.5066350.
When using ultrasound imaging of the tongue for speech recording/research, submental transducer stabilization is required to prevent the ultrasound transducer from translating or rotating in relation to the tongue. An iterative prototype of a lightweight three-dimensional-printable wearable ultrasound transducer stabilization system that allows flexible jaw motion and free head movement is presented. The system is completely non-metallic, eliminating interference with co-recorded signals, thus permitting co-collection and co-registration with articulometry systems. A motion study of the final version demonstrates that transducer rotation is limited to 1.25° and translation to 2.5 mm, well within accepted tolerances.

Blackwood Ximenes, A., Shaw, J., & Carignan, C. (2017). "A comparison of acoustic and articulatory methods for analyzing vowel variation across American and Australian dialects of English." Journal of the Acoustical Society of America, 142(1), 363–377, DOI: 10.1121/1.4991346.
In studies of dialect variation, the articulatory nature of vowels is sometimes inferred from formant values using the following heuristic: F1 is inversely correlated with tongue height and F2 is inversely correlated with tongue backness. This study compared vowel formants and corresponding lingual articulation in two dialects of English, standard North American English, and Australian English. Five speakers of North American English and four speakers of Australian English were recorded producing multiple repetitions of ten monophthongs embedded in the /sVd/ context. Simultaneous articulatory data were collected using electromagnetic articulography. Results show that there are significant correlations between tongue position and formants in the direction predicted by the heuristic but also that the relations implied by the heuristic break down under specific conditions. Articulatory vowel spaces, based on tongue dorsum position, and acoustic vowel spaces, based on formants, show systematic misalignment due in part to the influence of other articulatory factors, including lip rounding and tongue curvature on formant values. Incorporating these dimensions into dialect comparison yields a richer description and a more robust understanding of how vowel formant patterns are reproduced within and across dialects.

Carignan, C. (2017). "Covariation of nasalization, tongue height, and breathiness in the realization of F1 of Southern French nasal vowels." Journal of Phonetics, 63, 87–105, DOI: 10.1016/j.wocn.2017.04.005.
In a variety of languages, changes in tongue height and breathiness have been observed to covary with nasalization in both phonetic and phonemic vowel nasality. It has been argued that this covariation stems from speakers using multiple articulations to enhance F1 modulation and/or from listeners misperceiving the articulatory basis for F1 modification. This study includes results from synchronous nasalance, ultrasound, EGG, and F1 data related to the realizations of the oral–nasal vowel pairs /ɛ/-/ɛ̃/, /a/-/ɑ̃/, and /o/-/ɔ̃/ of Southern French (SF) as produced by four male speakers in a laboratory setting. The aim of the study is to determine to what extent tongue height and breathiness covary with nasalization, as well as how these articulations affect the realization of F1. The following evidence is observed: (1) that nasalization, breathiness, and tongue height are used in idiosyncratic ways to distinguish F1 for each vowel pair; (2) that increased nasalization and breathiness significantly predict F1-lowering for all three nasal vowels; (3) that nasalization increases throughout the duration of the nasal vowels, supporting previous claims about the temporal nature of nasality in SF nasal vowels, but contradicting claims that SF nasal vowels comprise distinct oral and nasal elements; (4) that breathiness increases in a gradient manner as nasalization increases; and (5) that the acoustic and articulatory data provide limited support for claims of the existence of an excrescent nasal coda in SF nasal vowels. These results are discussed in the light of claims that the multiple articulatory components observed in the production of vowel nasalization may have arisen due to misperception-based sound change and/or to phonetic enhancement.

Kalashnikova, M., Carignan, C., & Burnham, D. (2017). "The origins of babytalk: Smiling, teaching, or social convergence?" Royal Society Open Science, 4(8), DOI: 10.1098/rsos.170306.
When addressing their young infants, parents systematically modify their speech. Such infant-directed speech (IDS) contains exaggerated vowel formants, which have been proposed to foster language development via articulation of more distinct speech sounds. Here, this assumption is rigorously tested using both acoustic and, for the first time, fine-grained articulatory measures. Mothers were recorded speaking to their infant and to another adult, and measures were taken of their acoustic vowel space, their tongue and lip movements and the length of their vocal tract. Results showed that infant- but not adult-directed speech contains acoustically exaggerated vowels, and these are not the product of adjustments to tongue or to lip movements. Rather, they are the product of a shortened vocal tract due to a raised larynx, which can be ascribed to speakers' unconscious effort to appear smaller and more non-threatening to the young infant. This adjustment in IDS may be a vestige of early mother–infant interactions, which had as its primary purpose the transmission of non-aggressiveness and/or a primitive manifestation of pre-linguistic vocal social convergence of the mother to her infant. With the advent of human language, this vestige then acquired a secondary purpose—facilitating language acquisition via the serendipitously exaggerated vowels.

Mielke, J., Carignan, C., & Thomas, E. R. (2017). "The articulatory dynamics of pre-velar and pre-nasal /æ/-raising in English: an ultrasound study." Journal of the Acoustical Society of America, 142(1), 332–349, DOI: 10.1121/1.4991348.
Most dialects of North American English exhibit /æ/-raising in some phonological contexts. Both the conditioning environments and the temporal dynamics of the raising vary from region to region. To explore the articulatory basis of /æ/-raising across North American English dialects, acoustic and articulatory data were collected from a regionally diverse group of 24 English speakers from the United States, Canada, and the United Kingdom. A method for examining the temporal dynamics of speech directly from ultrasound video using EigenTongues decomposition [Hueber, Aversano, Chollet, Denby, Dreyfus, Oussar, Roussel, and Stone (2007). in IEEE International Conference on Acoustics, Speech and Signal Processing (Cascadilla, Honolulu, HI)] was applied to extract principal components of filtered images and linear regression to relate articulatory variation to its acoustic consequences. This technique was used to investigate the tongue movements involved in /æ/ production, in order to compare the tongue gestures involved in the various /æ/-raising patterns, and to relate them to their apparent phonetic motivations (nasalization, voicing, and tongue position).

Carignan, C., Shosted, R., Fu, M., Liang, Z.-P., & Sutton, B. (2015). "A real-time MRI investigation of the role of lingual and pharyngeal articulation in the production of the nasal vowel system of French." Journal of Phonetics, 50, 34–51, DOI: 10.1016/j.wocn.2015.01.001.
It is well known that, for nasal vowels, traditional estimation of the shape of the vocal tract via inference from acoustic characteristics is complicated by the acoustic effects of velopharyngeal coupling (i.e. nasalization). Given this complexity, measuring the shape of the vocal tract directly is, perhaps, a more desirable method of assessing oro-pharyngeal configuration. Real-time MRI (rt-MRI) allows us to explore the shape of the entire vocal tract during the production of nasal vowels. This permits us to better assess the contribution of the oro-pharyngeal acoustic transfer function to the acoustic signal, which is otherwise obscured by the conflation of the independent oro-pharyngeal and nasal acoustic transfer functions. The oro-pharyngeal shape associated with nasal vowels has implications for both synchronic and diachronic phonology, particularly in French, where descriptions of nasal vowels have long suggested that differences in oral articulation, in addition to velopharyngeal coupling, serve to distinguish oral and nasal vowels. In this study, we use single-slice rt-MRI (midsagittal slice) and multi-slice rt-MRI (oral, velopharyngeal, mediopharyngeal, and lower pharyngeal slices) to examine three nasal vowels /ɛ̃, ɑ̃, ɔ̃/ and their traditional oral counterparts /ɛ, a, o/ as produced by three female speakers of Northern Metropolitan French (NMF). We find evidence of lingual and pharyngeal articulatory configurations which may, in some cases, enhance formant-frequency-related acoustic effects associated with nasalization, viz., modulation of F1 and F2. Given these findings, we speculate that the synchronic oral articulation of NMF nasal vowels may have arisen—at least in part—due to misperception of the articulatory source of changes in F1 and F2, rather than to mere chance, as has been argued.

Fu, M., Zhao, B., Carignan, C., Shosted, R., Perry, J., Kuehn, D., Liang, Z.-P., & Sutton, B. (2015). "High-resolution dynamic speech imaging with joint low-rank and sparsity constraints." Magnetic Resonance in Medicine, 74(5), 1820–1832, DOI: 10.1002/mrm.25302.
Purpose: To enable dynamic speech imaging with high spatiotemporal resolution and full-vocal-tract spatial coverage, leveraging recent advances in sparse sampling. Methods: An imaging method is developed to enable high-speed dynamic speech imaging exploiting low-rank and sparsity of the dynamic images of articulatory motion during speech. The proposed method includes: (a) a novel data acquisition strategy that collects spiral navigators with high temporal frame rate and (b) an image reconstruction method that derives temporal subspaces from navigators and reconstructs high-resolution images from sparsely sampled data with joint low-rank and sparsity constraints. Results: The proposed method has been systematically evaluated and validated through several dynamic speech experiments. A nominal imaging speed of 102 frames per second (fps) was achieved for a single-slice imaging protocol with a spatial resolution of 2.2 × 2.2 × 6.5 mm³. An eight-slice imaging protocol covering the entire vocal tract achieved a nominal imaging speed of 12.8 fps with the identical spatial resolution. The effectiveness of the proposed method and its practical utility were also demonstrated in a phonetic investigation. Conclusion: High spatiotemporal resolution with full-vocal-tract spatial coverage can be achieved for dynamic speech imaging experiments with low-rank and sparsity constraints.

Carignan, C. (2014). "An acoustic and articulatory examination of the 'oral' in 'nasal': The oral articulations of French nasal vowels are not arbitrary." Journal of Phonetics, 46, 23–33, DOI: 10.1016/j.wocn.2014.05.001.
This study includes results of an articulatory (electromagnetic articulography, i.e. EMA) and acoustic study of the realizations of three oral–nasal vowel pairs, /ɛ/-/ɛ̃/, /a/-/ɑ̃/, and /o/-/ɔ̃/, recorded from 12 Northern Metropolitan French (NMF) female speakers in laboratory settings. By studying the position of the tongue and the lips during the production of target oral and nasal vowels and simultaneously recording the acoustic signal, the predicted effects of velo-pharyngeal (VP) coupling on the acoustic output of the vocal tract can be separated from those due to oral articulatory configuration in a qualitative manner. Based on previous research, all nasal vowels were expected to be produced with at least some change in lingual and labial articulatory configurations compared to their oral vowel counterparts. Evidence is observed which suggests that many of the oral articulatory configurations of NMF nasal vowels enhance the acoustic effect of VP coupling on F1 and F2 frequencies. Moreover, evidence is observed that the oral articulatory strategies used to produce the oral/nasal vowel distinction are idiosyncratic, but that, nevertheless, speakers produce a similar acoustic output. These results are discussed in the light of motor equivalence as well as the view that the goal of speech acts is acoustic, not articulatory.

Shosted, R., Carignan, C., & Rong, P. (2012). "Managing the distinctiveness of phonemic nasal vowels: Articulatory evidence from Hindi." Journal of the Acoustical Society of America, 131(1), 455–465, DOI: 10.1121/1.3665998.
There is increasing evidence that fine articulatory adjustments are made by speakers to reinforce and sometimes counteract the acoustic consequences of nasality. However, it is difficult to attribute the acoustic changes in nasal vowel spectra to either oral cavity configuration or to velopharyngeal opening (VPO). This paper takes the position that it is possible to disambiguate the effects of VPO and oropharyngeal configuration on the acoustic output of the vocal tract by studying the position and movement of the tongue and lips during the production of oral and nasal vowels. This paper uses simultaneously collected articulatory, acoustic, and nasal airflow data during the production of all oral and phonemically nasal vowels in Hindi (four speakers) to understand the consequences of the movements of oral articulators on the spectra of nasal vowels. For Hindi nasal vowels, the tongue body is generally lowered for back vowels, fronted for low vowels, and raised for front vowels (with respect to their oral congeners). These movements are generally supported by accompanying changes in the vowel spectra. In Hindi, the lowering of back nasal vowels may have originally served to enhance the acoustic salience of nasality, but has since engendered a nasal vowel chain shift.

Carignan, C., Shosted, R., Shih, C., & Rong, P. (2011). "Compensatory articulation in American English nasalized vowels." Journal of Phonetics, 39, 668–682, DOI: 10.1016/j.wocn.2011.07.005.
In acoustic studies of vowel nasalization, it is sometimes assumed that the primary articulatory difference between an oral vowel and a nasal vowel is the coupling of the nasal cavity to the rest of the vocal tract. Acoustic modulations observed in nasal vowels are customarily attributed to the presence of additional poles affiliated with the naso-pharyngeal tract and zeros affiliated with the nasal cavity. We test the hypothesis that oral configuration may also change during nasalized vowels, either enhancing or compensating for the acoustic modulations associated with nasality. We analyze tongue position, nasal airflow, and acoustic data to determine whether American English /i/ and /a/ manifest different oral configurations when they are nasalized, i.e. when they are followed by nasal consonants. We find that tongue position is higher during nasalized [ĩ] than it is during oral [i] but do not find any effect for nasalized [ã]. We argue that speakers of American English raise the tongue body during nasalized [ĩ] in order to counteract the perceived F1-raising (centralization) associated with high vowel nasalization.