Yu Zhang

AaronZ345

https://aaronz345.github.io

AI & ML interests

Multi-Modal Generative AI (Spatial Audio/Music/Singing/Speech).

Recent Activity

New activity in GTSinger/GTSinger 4 days ago

Annotation quality is very low, not usable for training

#8 opened 4 days ago by

da1sypetals-iota

authored a paper 6 months ago

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations

Paper • 2510.10396 • Published Oct 12, 2025

authored a paper 7 months ago

ASAudio: A Survey of Advanced Spatial Audio Research

Paper • 2508.10924 • Published Aug 8, 2025 • 1

upvoted a paper 7 months ago

ASAudio: A Survey of Advanced Spatial Audio Research

Paper • 2508.10924 • Published Aug 8, 2025 • 1

updated a dataset 8 months ago

AaronZ345/MRSDrama

Preview • Updated Aug 10, 2025 • 1.27k • 1

updated a dataset 9 months ago

AaronZ345/GTSinger

Viewer • Updated Jul 24, 2025 • 28.6k • 8.62k • 14

authored a paper 9 months ago

Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion

Paper • 2507.14534 • Published Jul 19, 2025

liked a dataset 9 months ago

AaronZ345/MRSDrama

Preview • Updated Aug 10, 2025 • 1.27k • 1

authored 3 papers 9 months ago

Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly

Paper • 2505.00426 • Published May 1, 2025

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis

Paper • 2505.14910 • Published May 20, 2025 • 1

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Paper • 2507.06670 • Published Jul 9, 2025

published a dataset 9 months ago

AaronZ345/MRSDrama

Preview • Updated Aug 10, 2025 • 1.27k • 1

liked a dataset 11 months ago

AaronZ345/GTSinger

Viewer • Updated Jul 24, 2025 • 28.6k • 8.62k • 14

published a dataset 11 months ago

AaronZ345/GTSinger

Viewer • Updated Jul 24, 2025 • 28.6k • 8.62k • 14

updated a model 11 months ago

AaronZ345/StyleSinger

Updated May 5, 2025 • 1

upvoted 2 papers 11 months ago

Robust Singing Voice Transcription Serves Synthesis

Paper • 2405.09940 • Published May 16, 2024 • 1

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Paper • 2502.12572 • Published Feb 18, 2025 • 2

authored 3 papers 11 months ago

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Paper • 2502.12572 • Published Feb 18, 2025 • 2

MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Paper • 2502.18924 • Published Feb 26, 2025 • 16

Versatile Framework for Song Generation with Prompt-based Control

Paper • 2504.19062 • Published Apr 27, 2025 • 6
