Dev Mode Explorers

community

Activity Feed


Recent Activity


DongfuJiang

authored 3 papers 3 days ago

RewardHarness: Self-Evolving Agentic Post-Training

Paper • 2605.08703 • Published about 1 month ago • 10

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published 8 days ago • 100

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

Paper • 2606.05080 • Published 6 days ago • 27

nielsr

submitted a paper to Daily Papers 5 days ago

Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

Paper • 2606.03748 • Published 7 days ago • 9

alielfilali01

posted an update 10 days ago

Post

357

Plans in HTML > Plans in Markdown

nielsr

submitted a paper to Daily Papers 12 days ago

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

Paper • 2605.27295 • Published 14 days ago • 23

victor

posted an update 13 days ago

Post

892

Sharing how I built the LongCat-Video-Avatar 1.5 Space (+500k views on X) in one agent session. Gave a coding agent its own AI lab on ZeroGPU, framed the goal, walked away. It designed, deployed, tested against the live API, fixed, shipped.

Full recipe with the copy-paste prompt: https://huggingface.co/blog/victor/building-zerogpu-spaces-autonomously

1 reply

nielsr

submitted a paper to Daily Papers 18 days ago

Stable Audio 3

Paper • 2605.17991 • Published 22 days ago • 18

fffiloni

posted an update 21 days ago

Post

3488

I built HF Radio on Hugging Face Spaces 📻
fffiloni/HF-Radio

A live community radio for AI-generated songs, powered by tracks created with ACE-Step.

You can tune in, discover community-made songs in many languages, vote on what sounds good, and mark your real favorites as Bangers.

The more people listen, vote, and create, the better the station gets.

Under the hood, it connects a few Hugging Face pieces together:

Spaces for the live app, HF buckets for community tracks, OAuth for signed-in listeners, server-side streaming with ffmpeg, hourly playlist refreshes, moderation, jingles, and community feedback loops.

It’s not just a playlist.

It’s a shared taste experiment:
new songs get a shot every hour, and the community helps decide what deserves another spin.
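The hourly refresh described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how HF Radio is actually implemented: the `Track` fields, the `build_hourly_playlist` helper, and the "reserve a few slots for unplayed songs, fill the rest by net votes" rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    # Hypothetical track record; field names are assumptions for this sketch.
    title: str
    upvotes: int = 0
    downvotes: int = 0
    plays: int = 0  # how many times the track has already aired

    @property
    def score(self) -> int:
        # Net community vote: upvotes minus downvotes.
        return self.upvotes - self.downvotes


def build_hourly_playlist(tracks: List[Track], slots: int, new_slots: int) -> List[Track]:
    """Give never-played tracks a shot, then fill remaining slots by net votes."""
    # Reserve at most `new_slots` (and never more than `slots`) for unplayed tracks.
    fresh = [t for t in tracks if t.plays == 0][: min(new_slots, slots)]
    # Rank tracks that already aired by score, best first.
    aired = sorted((t for t in tracks if t.plays > 0), key=lambda t: t.score, reverse=True)
    return fresh + aired[: slots - len(fresh)]


tracks = [
    Track("old hit", upvotes=10, downvotes=1, plays=5),
    Track("old miss", upvotes=1, downvotes=6, plays=3),
    Track("new a"),
    Track("new b"),
    Track("old ok", upvotes=4, downvotes=2, plays=2),
]
playlist = build_hourly_playlist(tracks, slots=3, new_slots=1)
print([t.title for t in playlist])  # → ['new a', 'old hit', 'old ok']
```

Reserving slots for unplayed tracks is one simple way to keep new songs from being crowded out by established favorites; a real station would likely also weigh recency, moderation status, and Banger marks.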

Come listen.
Find weird gems.
Support the Bangers.
Shape the radio.

→ fffiloni/HF-Radio

Tonic

posted an update 25 days ago

Post

2836

🙋🏻‍♂️ Hey there folks,

Turns out: if we predict 🌏 Earth, we can save a lot of time looking for interesting things and spend less time looking at things that we expect to see.

Sentinel-2 imagery 🛰️ basically takes a long time to download to Earth, so our "near real time" systems are quite far from that in practical terms.

Meanwhile, if we "predict" what we will see, based on what we do see, we can send down much less data in a timely way and prioritize 📡 Earth-bound response.

I'm talking about illegal fishing, logging, mining, or building in nature reserves: the more of that we predict early, the more we're able to stop it in time.

At least that's the concept!
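To make the concept concrete, here is a minimal sketch of "send down only what the prediction got wrong". It is an assumption-laden illustration, not the code from the linked project: the tile representation, the mean-absolute-error metric, the threshold, and the `select_tiles_to_downlink` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

Tile = List[float]  # flattened pixel values for one image tile (assumed format)


def mean_abs_error(predicted: Tile, observed: Tile) -> float:
    """Average absolute per-pixel difference between two equally sized tiles."""
    if len(predicted) != len(observed):
        raise ValueError("tiles must have the same number of pixels")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def select_tiles_to_downlink(
    predicted: Dict[str, Tile],
    observed: Dict[str, Tile],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (tile_id, error) for tiles that deviate from the prediction by
    more than `threshold`, most surprising first."""
    surprising = []
    for tile_id, obs in observed.items():
        pred = predicted.get(tile_id)
        # A tile with no prediction is treated as maximally surprising.
        error = float("inf") if pred is None else mean_abs_error(pred, obs)
        if error > threshold:
            surprising.append((tile_id, error))
    return sorted(surprising, key=lambda item: item[1], reverse=True)


predicted = {"a": [0.1, 0.1, 0.1, 0.1], "b": [0.5, 0.5, 0.5, 0.5]}
observed = {"a": [0.1, 0.1, 0.1, 0.1], "b": [0.9, 0.9, 0.5, 0.5]}
queue = select_tiles_to_downlink(predicted, observed, threshold=0.05)
print([tile_id for tile_id, _ in queue])  # → ['b']
```

Tile `a` matches its prediction and stays on board, while tile `b` changed and is queued first. In practice the comparison would more likely run on model outputs or learned features than on raw pixels.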

Check out the blog: https://huggingface.co/blog/Tonic/save-patagonia-by-predicting-earth

- Collection: https://huggingface.co/collections/NuTonic/earth-observation-with-temporal-and-general-understanding
- Code: https://github.com/Josephrp/Nutonic
- Dataset: NuTonic/sat-vl-sft-training-ready-v1
- Model: NuTonic/lspace
- Training: NuTonic/lspace-trackio
- Evals: NuTonic/Patagonia_Eval

2 replies

DongfuJiang

authored 4 papers 25 days ago

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published Apr 6 • 36

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 263

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2604.12374 • Published Apr 14 • 37

Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

Paper • 2605.05242 • Published May 3 • 122

fffiloni

posted an update 26 days ago

Post

491

Great technical guide by Nico Martin on the Hugging Face blog, showing how to use Transformers.js inside a Chrome extension and run ONNX models from the Hub locally with WebGPU inside a Manifest V3 extension.

The interesting part: this is not just a chatbot in a side panel.

The article walks through the architecture behind a browser agent that can read open tabs, query webpages, search history, and highlight elements directly on the page — with models downloaded from the Hugging Face Hub, cached under the extension origin, and executed locally instead of being called through a remote API for every prompt.

A strong blueprint for building local-first web copilots, reading assistants, and AI-powered browsing workflows.

Article: https://huggingface.co/blog/transformersjs-chrome-extension

fffiloni

posted an update 28 days ago

Post

336

I’ve been reading “What if AI systems weren’t chatbots?” (2605.07896) 👀

The paper asks a simple but important question: what if the chatbot interface is not just a neutral wrapper around AI models, but part of the problem?

A chatbot can make a system feel more capable, more certain, and more “human” than it really is. That matters, because interfaces shape how we trust, use, and delegate to AI systems.

When everything becomes: ask → answer
we can lose sight of the actual workflow:
- parameters
- alternatives
- uncertainty
- intermediate steps
- failure modes
- human control

For creative AI especially — image, video, editing, animation — I’m not sure “chat” should always be the default interface.

Sometimes we need a conversation.
But often we need a canvas, a timeline, sliders, masks, previews, comparisons, and visible pipelines.

This is also why I find many open ML demos interesting: Spaces, Gradio apps, visual tools, small focused interfaces.

They often explore another direction — not just better assistants, but better tools. 🤗

2 replies

Prabhjotschugh

authored a paper 29 days ago

When Less Is More: Simplicity Beats Complexity for Physics-Constrained InSAR Phase Unwrapping

Paper • 2605.00896 • Published Apr 28 • 1

Tonic

posted an update about 1 month ago

Post

4304

🙋🏻‍♂️ Hey there folks,

Since everyone liked my previous announcement post (https://huggingface.co/posts/Tonic/338509028435394) so much, I'm back with more high-quality procedural datasets in the geospatial domain for SFT!

Check this one out:
NuTonic/sat-bbox-metadata-sft-v1

The goal is to be able to train vision models on multiple images for remote sensing analysis in one shot.

Hope you like it! 🚀

2 replies

fffiloni

posted an update about 2 months ago

Post

690

Quietly baking Image → Music 🎵 v3 — now running on SOTA open-source models.
👉 fffiloni/image-2-music-v3 | Feel free to test it and share feedback.

Just wiring together: merve/moondream3 * victor/ace-step-jam

Image → prompt → audio | Early version, will evolve | Follow: @fffiloni

Tonic

posted an update about 2 months ago

Post

3651

🙋🏻‍♂️ Hey there folks,

Today I'm sharing Hugging Face's largest dataset of annotated satellite images.

Check it out here: NuTonic/sat-image-boundingbox-sft-full

I hope you like it. The idea is to be able to use this with small vision models 🚀


Team members 144
