Building on HF

Daniel van Strien PRO

davanstrien

huggingface

https://danielvanstrien.xyz/

AI & ML interests

Machine Learning Librarian

Recent Activity

upvoted an article about 1 hour ago

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

upvoted an article about 8 hours ago

Why We Built VIBE Bench: Rethinking Evaluation for Real Workloads

liked a model about 9 hours ago

LiquidAI/LFM2.5-VL-1.6B

upvoted an article about 1 hour ago

Article

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

Mar 22, 2024 • 110

upvoted an article about 8 hours ago

Article

Why We Built VIBE Bench: Rethinking Evaluation for Real Workloads

about 8 hours ago • 4

upvoted an article about 10 hours ago

Article

Diversity Vs Density: A strategy comparison for fine-tuning VLMs

about 16 hours ago • 3

upvoted an article 18 days ago

Article

Shadow AI - Where are the CIOs?

18 days ago • 31

upvoted 2 collections 19 days ago

SauerkrautLM-Vision-Document-Retrieval

7 items • Updated 22 days ago • 8

GLM-V

4 items • Updated 20 days ago • 11

upvoted 3 papers 20 days ago

CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition

Paper • 2509.19768 • Published Sep 24, 2025 • 5

Metadata Extraction Leveraging Large Language Models

Paper • 2510.19334 • Published Oct 22, 2025 • 1

FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition

Paper • 2512.13884 • Published 22 days ago • 14

upvoted a collection 20 days ago

fiNERweb

A multilingual dataset for NER covering 91 languages and 25 scripts • 3 items • Updated 21 days ago • 1

upvoted 3 collections 21 days ago

Molmo2 Data

Artifacts for the Molmo2 data release • 16 items • Updated 14 days ago • 28

Molmo2

Artifacts for the Molmo2 release • 6 items • Updated 14 days ago • 30

Datasets Wrapped 2025: Reasoning

The reasoning datasets that defined 2025. Part 1 of Datasets Wrapped 2025. #DatasetsWrapped2025 • 20 items • Updated 21 days ago • 1

upvoted 3 collections 22 days ago

NeMo Gym

Collection of RL verifiable data for NeMo Gym • 13 items • Updated 14 days ago • 32

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 14 days ago • 55

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 6 items • Updated 6 days ago • 113

upvoted an article 22 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

22 days ago • 104

upvoted a collection 24 days ago

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated 14 days ago • 42

upvoted an article 26 days ago

Article

New in llama.cpp: Model Management

26 days ago • 104

upvoted a collection 29 days ago

Ministral 3

Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 13 days ago • 26