This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose DiffSinger (for Singing-Voice-Synthesis) and DiffSpeech (for Text-to-Speech).
Abstract: We introduce unified source-filter generative adversarial networks (uSFGAN), a waveform generative model conditioned on acoustic features, which represents the source-filter architecture in ...
Abstract: Vocoders, encoding speech signals into acoustic features and allowing for speech signal reconstruction from them, have been studied for decades. Recently, the rise of deep learning has ...
No one wants their time wasted, including by music. But the rise of TikTok-optimized tracks less than two minutes long might be going too far. The best long songs are an antidote to that—a way of ...
Works with wav (only 8000 Hz, mono, 1 byte PCM_UNSIGNED). I recommend Audacity to convert to this format. Configurable quality (from 0 to 10), default is 4. Good compression (from 88 bytes/s to 530 ...
Nearly 50 years on from posing existential queries on Talking Heads' art-pop masterpiece Once In A Lifetime ("How did I get here?… Am I right? Am I wrong?"), David Byrne is still asking lots of big ...
There’s a moment in KPop Demon Hunters where the film stops looking for a perfect ending and chooses the messier one. “What It Sounds Like” is that choice set to a stadium-scale pop finale. It closes ...