AI & ML interests

AI for the physical world, TinyML, Embedded Systems

Recent Activity

keveman updated a model about 9 hours ago
UsefulSensors/moonshine-streaming-small
keveman updated a model about 10 hours ago
UsefulSensors/moonshine-streaming-medium
keveman published a model about 10 hours ago
UsefulSensors/moonshine-streaming-medium

Xenova posted an update 5 months ago
Okay this is insane... WebGPU-accelerated semantic video tracking, powered by DINOv3 and Transformers.js! 🤯
Demo (+ source code): webml-community/DINOv3-video-tracking

This will revolutionize AI-powered video editors... which can now run 100% locally in your browser, no server inference required (costs $0)! 😍

How does it work? 🤔
1️⃣ Generate and cache image features for each frame
2️⃣ Create a list of embeddings for selected patch(es)
3️⃣ Compute cosine similarity between each patch and the selected patch(es)
4️⃣ Highlight those whose score is above some threshold

... et voilà! 🥳

You can also make selections across frames to improve temporal consistency! This is super useful if the object changes its appearance slightly throughout the video.
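
For readers who want to see the matching step in code, here is a minimal sketch of steps 2 to 4, not the demo's actual source. It assumes patch embeddings are already computed and stored as plain number arrays; the names `Embedding`, `cosineSimilarity`, and `matchPatches` are hypothetical, and taking the best score over all selected patches is one assumed way to combine multiple selections.

```typescript
// Hypothetical type: one embedding vector per image patch.
type Embedding = number[];

/** Cosine similarity of two vectors; returns 0 if either has zero norm. */
function cosineSimilarity(a: Embedding, b: Embedding): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // treat a missing component as 0
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/**
 * Returns the indices of patches in one frame whose best similarity to any
 * selected embedding is above the threshold. Selected embeddings may come
 * from several frames. (Using the maximum is an assumption of this sketch.)
 */
function matchPatches(
  framePatches: Embedding[],
  selected: Embedding[],
  threshold: number,
): number[] {
  const matches: number[] = [];
  framePatches.forEach((patch, index) => {
    const best = selected.reduce(
      (max, ref) => Math.max(max, cosineSimilarity(patch, ref)),
      -Infinity,
    );
    if (best > threshold) matches.push(index);
  });
  return matches;
}

// Toy example: three 2-D "patches", one selected reference embedding.
const demoFrame: Embedding[] = [[1, 0], [0.9, 0.1], [0, 1]];
const demoMatches = matchPatches(demoFrame, [[1, 0]], 0.8);
console.log(demoMatches);
```

Adding embeddings selected in other frames to `selected` widens what counts as a match, which is one way the cross-frame selections described above could help when the object's appearance drifts.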

Excited to see what the community builds with it!
Xenova posted an update 5 months ago
The next generation of AI-powered websites is going to be WILD! 🤯

In-browser tool calling & MCP is finally here, allowing LLMs to interact with websites programmatically.

To show what's possible, I built a demo using Liquid AI's new LFM2 model, powered by 🤗 Transformers.js: LiquidAI/LFM2-WebGPU

As always, the demo is open source (you can find the code under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀
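
To make "tool calling" concrete, here is a minimal sketch of the dispatch step a page might run when a model emits a tool call. It is not the demo's code: the JSON shape (`name` plus `arguments`), the `tools` registry, and both example tools are hypothetical, and real models and MCP clients define their own formats.

```typescript
// Hypothetical tool-call shape, assumed for this sketch only.
interface ToolCall {
  name: string;
  arguments?: Record<string, unknown>;
}

type Tool = (args: Record<string, unknown>) => string;

// Hypothetical registry of functions the page exposes to the model.
const tools: Record<string, Tool> = {
  set_background: (args) => `background set to ${String(args["color"])}`,
  get_title: () => "Demo page",
};

/** Parses a JSON tool call from model output and runs the matching tool. */
function dispatchToolCall(modelOutput: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return "error: output is not valid JSON";
  }
  if (typeof parsed !== "object" || parsed === null) {
    return "error: tool call must be a JSON object";
  }
  const call = parsed as Partial<ToolCall>;
  if (typeof call.name !== "string") {
    return "error: missing tool name";
  }
  // Own-property check so names like "toString" are not treated as tools.
  const tool = Object.prototype.hasOwnProperty.call(tools, call.name)
    ? tools[call.name]
    : undefined;
  if (tool === undefined) {
    return `error: unknown tool "${call.name}"`;
  }
  return tool(call.arguments ?? {});
}

console.log(dispatchToolCall('{"name":"get_title","arguments":{}}'));
```

In a full loop the returned string would be appended to the conversation so the model can use the result in its next response. Returning error strings instead of throwing gives the model a chance to correct a malformed call.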
Xenova posted an update 6 months ago
Introducing Voxtral WebGPU: State-of-the-art audio transcription directly in your browser! 🤯
🗣️ Transcribe videos, meeting notes, songs and more
🔐 Runs on-device, meaning no data is sent to a server
🌎 Multilingual (8 languages)
🤗 Completely free (forever) & open source

That's right, we're running Mistral's new Voxtral-Mini-3B model 100% locally in-browser on WebGPU, powered by Transformers.js and ONNX Runtime Web! 🔥

Try it out yourself! 👇
webml-community/Voxtral-WebGPU
Xenova posted an update 7 months ago
NEW: Real-time conversational AI models can now run 100% locally in your browser! 🤯

🔐 Privacy by design (no data leaves your device)
💰 Completely free... forever
📦 Zero installation required, just visit a website
⚡️ Blazingly-fast WebGPU-accelerated inference

Try it out: webml-community/conversational-webgpu

For those interested, here's how it works:
- Silero VAD for voice activity detection
- Whisper for speech recognition
- SmolLM2-1.7B for text generation
- Kokoro for text to speech
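
The four components above chain into a single conversational turn. Here is a rough sketch of that flow, not the demo's code: the `Stages` interface and `runTurn` are hypothetical names, and each stage is modeled as a plain synchronous function to keep the example short, whereas in a browser each would be an asynchronous model call.

```typescript
// Hypothetical stage signatures, one per component in the list above.
interface Stages {
  detectSpeech: (audio: number[]) => boolean; // voice activity detection
  transcribe: (audio: number[]) => string; // speech recognition
  generate: (history: string[]) => string; // text generation
  synthesize: (text: string) => number[]; // text to speech
}

/**
 * Runs one turn: skip silent audio, transcribe speech, generate a reply
 * from the running history, and return the reply as synthesized audio.
 * Returns null when no speech is detected.
 */
function runTurn(
  stages: Stages,
  audio: number[],
  history: string[],
): number[] | null {
  if (!stages.detectSpeech(audio)) return null;
  const userText = stages.transcribe(audio);
  history.push(`user: ${userText}`);
  const reply = stages.generate(history);
  history.push(`assistant: ${reply}`);
  return stages.synthesize(reply);
}

// Stand-in stages for illustration only; real ones would wrap the models.
const demoStages: Stages = {
  detectSpeech: (audio) => audio.some((s) => Math.abs(s) > 0.1),
  transcribe: () => "hello",
  generate: (history) => `You said ${history.length} thing(s).`,
  synthesize: (text) => text.split("").map((ch) => ch.charCodeAt(0) / 255),
};

const demoHistory: string[] = [];
const demoReply = runTurn(demoStages, [0.0, 0.4, -0.3], demoHistory);
console.log(demoHistory, demoReply === null ? 0 : demoReply.length);
```

Running voice activity detection first means silent audio never reaches the heavier models, which is presumably part of why it sits at the front of the chain.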

Powered by Transformers.js and ONNX Runtime Web! 🤗 I hope you like it!
Xenova posted an update 8 months ago
Introducing the ONNX model explorer: Browse, search, and visualize neural networks directly in your browser. 🤯 A great tool for anyone studying Machine Learning! We're also releasing the entire dataset of graphs so you can use them in your own projects! 🤗

Check it out! 👇
Demo: onnx-community/model-explorer
Dataset: onnx-community/model-explorer
Source code: https://github.com/xenova/model-explorer