Publications

An Information-Theoretic Characterization of Morphological Fusion

Linguistic typology generally divides synthetic languages into groups based on their morphological fusion. However, this measure has …

Simple induction of (deterministic) probabilistic finite-state automata for phonotactics by stochastic gradient descent

We introduce a simple and highly general phonotactic learner which induces a probabilistic finite-state automaton from word-form data. …

Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT

We investigate how Multilingual BERT (mBERT) encodes grammar by examining how the high-order grammatical feature of morphosyntactic …

Sensitivity as a Complexity Measure for Sequence Classification Tasks

We introduce a theoretical framework for understanding and predicting the complexity of sequence classification tasks, using a …

Modeling word and morpheme order in natural language as an efficient tradeoff of memory and surprisal

Memory limitations are known to constrain language comprehension and production, and have been argued to account for crosslinguistic …