
Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference

This is my internship project at Adobe Research from the summer of 2023. The work has been written up as a paper, which is currently under review.

Input:

  1. Score, lyrics (with the language specified), and style

  2. A 5-second speech recording of the target speaker (a voice unseen in the training data)

Output: Singing in the target’s voice

Demo Example

  • Input Speech Target 1 (Female voice)

  • Output Singing

    • An English Pop Song

    • A Chinese Folk Song

  • Input Speech Target 2 (Female voice)

  • Output Singing

    • An English Pop Song

    • A Chinese Folk Song

  • Input Speech Target 3 (Male voice)

  • Output Singing

    • A Chinese Folk Song

    • An English Pop Song

    • An Italian Opera Song


© Shuqi Dai 2024 | All Rights Reserved
