portfolio

publications

Can Text Encoders be Deceived by Length Attack?

The 11th International Conference on Learning Representations (ICLR 2023), May 2023

An editing method is proposed that effectively improves the robustness of models against length attacks; the improvement can be attributed to reduced length information in the embeddings and more robust intra-document token interaction.

Length is a Curse and a Blessing for Document-level Semantics

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Dec 2023

We show that contrastive learning models are sensitive to text length in ways that distort semantic representations, and propose a length-agnostic framework that improves robustness and retrieval performance.

talks

teaching