Superalignment with Dynamic Human Values
Published in ICLR 2025 Workshop on Bidirectional Human-AI Alignment (BiAlign), 2025
This paper sketches a roadmap for training a superhuman reasoning model to decompose complex tasks into subtasks that humans can reliably evaluate, addressing the twin alignment challenges of scalable oversight and dynamic human values.
Recommended citation: Florian Mai, David Kaczér, Nicholas Kluge Corrêa, Lucie Flek. (2025). "Superalignment with Dynamic Human Values." ICLR 2025 Workshop on Bidirectional Human-AI Alignment (BiAlign).