Zero-shot speaker adaptation aims to clone the voices of previously unseen speakers from only a few seconds of their speech. Nevertheless, existing zero-shot ...