Open to Collab
Michael Anthony
PRO
MikeDoes
77 followers · 21 following
http://www.aisuisse.com
MikeDoesDo
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
Posted an update about 10 hours ago
Why choose between performance, privacy, and transparency when you can have all three? We're highlighting a solution-oriented paper that introduces PRvL, an open-source toolkit for PII redaction. The interesting part: the researchers used the AI4Privacy-300K and AI4Privacy-500K datasets to train and benchmark their suite of models.

This is the power of open-source collaboration. We provide the comprehensive data foundation, and the community builds better solutions on top of it. Every organization wins when this research results in a powerful, free, and self-hostable tool that helps keep its data safe.

Big cheers to Leon Garza, Anantaa Kotal, Aritran Piplai, Lavanya Elluri, Prajit D., and Aman Chadha for pulling this off.

🔗 Read the full paper to see their data-driven results and access the PRvL toolkit: https://arxiv.org/pdf/2508.05545
🚀 For the latest in privacy-preserving AI, follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/

#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
Posted an update 1 day ago
How do you prove your new, specialized AI model is a better solution? You test it against the best. That's why we were excited to see the new AdminBERT paper from researchers at Nantes Université and others. To show the strength of their new model for French administrative texts, they compared it to the state-of-the-art generalist model, NERmemBERT.

The connection to our work is direct: NERmemBERT was trained on a combination of datasets, including the Pii-masking-200k dataset by Ai4Privacy.

This is a win-win for the open-source community. Our foundational dataset helps create a strong, general-purpose benchmark, which in turn helps researchers prove the value of their specialized work. This is how we all get better.

🔗 Great work by Thomas Sebbag, Solen Quiniou, Nicolas Stucky, and Emmanuel Morin on tackling a challenging domain! Check out their paper: https://aclanthology.org/2025.coling-main.27.pdf
🚀 For the latest in privacy-preserving AI, follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/

#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
MikeDoes's Spaces (2)
Terminal Visualiser 💻 (Running): Create and download styled terminal screenshots
TKG Visualiser 🌍 (Running): Visualize workflows from TSV data