AI & ML interests

Massive Text Embeddings Benchmark

Recent Activity

Samoed new activity about 17 hours ago
mteb/arguana: Wrong reference and metadata
Samoed updated a dataset about 17 hours ago
mteb/arguana
orionweller updated a Space about 21 hours ago
mteb/leaderboard

Samoed in mteb/arguana about 17 hours ago

Wrong reference and metadata

#5 opened about 19 hours ago by nicolaebanari
tomaarsen posted an update 9 days ago
🌐 I've just published Sentence Transformers v5.4 to make the project fully multimodal for embeddings and reranking. The release also includes a modular CrossEncoder, and automatic Flash Attention 2 input flattening. Details:

You can now use SentenceTransformer and CrossEncoder with text, images, audio, and video, with the same familiar API. That means you can compute embeddings for an image and a text query using model.encode(), compare them with model.similarity(), and it just works. Models like Qwen3-VL-Embedding-2B and jinaai/jina-reranker-m0 are supported out of the box.
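
To make the "compare them" step concrete, here is a minimal sketch of the cosine similarity that an embedding comparison typically boils down to. It is not the library's implementation: the helper `cosine_similarity` and the toy vectors are hypothetical stand-ins for what `model.encode()` would return, and the assumption is that cosine similarity is the scoring function in use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings; real ones would come from model.encode().
image_embedding = [0.2, 0.8, 0.1]
text_embeddings = {
    "a photo of a cat": [0.25, 0.75, 0.05],
    "a stock chart": [0.9, 0.1, 0.3],
}

# Score every text query against the image and keep the closest one.
scores = {
    text: cosine_similarity(image_embedding, embedding)
    for text, embedding in text_embeddings.items()
}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

Because both modalities land in the same vector space, the comparison itself does not need to know which input was an image and which was text.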

Beyond multimodal, I also fully modularized the CrossEncoder class. It's now a torch.nn.Sequential of composable modules, just like SentenceTransformer has been. This unlocked support for generative rerankers (CausalLM-based models like mxbai-rerank-v2 and the Qwen3 rerankers) via a new LogitScore module, which wasn't possible before without custom code.
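
For intuition on how a CausalLM can act as a reranker, here is a rough sketch of one common scheme: read the next-token logits for a "yes" and a "no" token and turn them into a relevance probability. Whether LogitScore works exactly this way is an assumption, and the function `yes_no_relevance` and the logit values are hypothetical.

```python
import math

def yes_no_relevance(logit_yes, logit_no):
    """Two-way softmax over the 'yes' and 'no' logits; returns P('yes')."""
    # Subtract the max for numerical stability before exponentiating.
    m = max(logit_yes, logit_no)
    exp_yes = math.exp(logit_yes - m)
    exp_no = math.exp(logit_no - m)
    return exp_yes / (exp_yes + exp_no)

# Hypothetical (logit_yes, logit_no) pairs for two candidate documents.
candidates = {"doc_a": (4.0, 1.0), "doc_b": (0.5, 2.5)}

# Rank documents by their probability of being judged relevant.
ranked = sorted(
    candidates,
    key=lambda doc: yes_no_relevance(*candidates[doc]),
    reverse=True,
)
print(ranked)
```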

Also, Flash Attention 2 now automatically skips padding for text-only inputs. If your batch has a mix of short and long texts, this gives you a nice speedup and lower VRAM usage for free.
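
To see where the savings come from, here is a small sketch of the general padding-free idea: concatenate the sequences into one flat list and track the boundaries with cumulative lengths, instead of padding every sequence to the longest one. This illustrates the technique only, not the Sentence Transformers internals; the helper names and token IDs are hypothetical.

```python
def padded_token_count(sequences):
    """Tokens processed when every sequence is padded to the longest one."""
    return len(sequences) * max(len(seq) for seq in sequences)

def flatten_batch(sequences):
    """Concatenate sequences and record cumulative boundaries (cu_seqlens)."""
    flat = []
    cu_seqlens = [0]
    for seq in sequences:
        flat.extend(seq)
        cu_seqlens.append(len(flat))
    return flat, cu_seqlens

# Hypothetical token IDs for a batch mixing short and long texts.
batch = [[101, 7, 102], [101, 5, 6, 7, 8, 9, 102], [101, 102]]

flat, cu_seqlens = flatten_batch(batch)
print(len(flat), "tokens flattened vs", padded_token_count(batch), "padded")
print(cu_seqlens)
```

The wider the spread between the shortest and longest text in a batch, the more padding tokens the flattened layout avoids.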

I wrote a blog post walking through the multimodal features with practical examples. Check it out if you want to get started, or just point your Agent to the URL: https://huggingface.co/blog/multimodal-sentence-transformers

This release has set up the groundwork for more easily introducing late-interaction models (both text-only and multimodal) into Sentence Transformers in the next major release. I'm looking forward to it!
Samoed in mteb/BRIGHT 16 days ago

Add eval config

#2 opened 16 days ago by Samoed
Samoed updated a Space 17 days ago
Samoed in mteb/leaderboard 17 days ago

Slow & Unresponsive.

#182 opened 17 days ago by SuperPauly