Research Interests

My research focuses on advancing the field of Artificial Intelligence, with particular emphasis on:

  1. Planning and Reasoning in Language Models (2023 - ongoing)
    • LLMs owe much of their success to training on huge amounts of unlabeled data. Most existing approaches to planning and reasoning in LLMs rely on advanced prompting techniques or fine-tuning on domain-specific tasks, neither of which exploits the strengths of unlabeled data. I am interested in developing new methods that leverage unlabeled data to improve planning and reasoning in LLMs more generally.
    • For LLMs to become more robust and reliable, they must be able to reason effectively. If we can use unlabeled data to learn a model of what would happen when taking an action in a given context, we can use that model to make better decisions.
    • In our COLM 2024 paper, we learn to predict abstract writing actions from unlabeled data and use them to control the LM's future writing process.
  2. Efficient Text Representation Learning (2018 - 2023)
    • My PhD thesis focused on reducing the costs of natural language understanding in terms of processing time, required labeled data, and hyperparameter tuning. This research interest encompasses:
      • Developing a fast and order-sensitive sentence embedding method (ICLR 2019)
      • Creating frameworks for learning conditional text generation tasks with less labeled data (EMNLP 2020, AACL 2022)
      • Proposing evaluation protocols for deep learning optimizers to reduce hyperparameter tuning (ICML 2020)
      • Introducing efficient token mixing architectures as alternatives to the attention mechanism (ACL 2024, Interspeech 2024)
    • These approaches aim to make NLP more accessible to research teams with limited resources and enable further scaling of language models and other AI systems.
  3. Deep Learning in Digital Libraries (2016 - 2018)
    • To enable the efficient use of digital libraries, data needs to be easily accessible, which requires article categorization, indexing, retrieval, and recommendation.
    • The rise of deep learning techniques for text offers new ways to improve the efficiency of these tasks. I have contributed to the adoption of these techniques in the field of digital libraries in several ways: