PP-OCRv5 Collection PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated Sep 15, 2025 • 50
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published Sep 4, 2025 • 210
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 43
Article Welcome EmbeddingGemma, Google's new efficient embedding model +4 Sep 4, 2025 • 267
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 16 items • Updated 17 days ago • 140
Tucan Collection A series of open-source Bulgarian language models fine-tuned for function calling and tool use. 2.6B, 9B, and 27B parameter variants. • 12 items • Updated Jul 1, 2025 • 1
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15, 2025 • 222
CoLLM: A Large Language Model for Composed Image Retrieval Paper • 2503.19910 • Published Mar 25, 2025 • 15
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 263
Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26, 2025 • 177
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 128
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google +1 Feb 19, 2025 • 74
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 138