26 research outputs found

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder

Author: Gu Yicheng
Li Haizhou
Wu Zhizheng
Xue Liumeng
Zhang Xueyao
Publication date: 26/04/2024

Generative Adversarial Network (GAN)-based vocoders are superior in both inference speed and synthesis quality when reconstructing an audible waveform from an acoustic representation. This study focuses on improving the discriminator for GAN-based vocoders. Most existing Time-Frequency Representation (TFR)-based discriminators are rooted in the Short-Time Fourier Transform (STFT), which has a constant Time-Frequency (TF) resolution, linearly scaled center frequencies, and a fixed decomposition basis, making it incompatible with signals like singing voices that require dynamic attention for different frequency bands and different time intervals. Motivated by that, we propose a Multi-Scale Sub-Band Constant-Q Transform (CQT) discriminator, MS-SB-CQT, and a Multi-Scale Temporal-Compressed Continuous Wavelet Transform (CWT) discriminator, MS-TC-CWT. Both CQT and CWT have a dynamic TF resolution for different frequency bands. Of the two, CQT is better at modeling pitch information, and CWT is better at modeling short-time transients. Experiments conducted on both speech and singing voices confirm the effectiveness of our proposed discriminators. Moreover, the STFT, CQT, and CWT-based discriminators can be used jointly for better performance. The proposed discriminators can boost the synthesis quality of various state-of-the-art GAN-based vocoders, including HiFi-GAN, BigVGAN, and APNet. Comment: arXiv admin note: text overlap with arXiv:2311.1495

arXiv.org e-Print Archive

SponTTS: modeling and transferring spontaneous style for TTS

Author: Chen Yunlin
Li Hanzhao
Song Yang
Xie Lei
Xue Liumeng
Zhu Xinfa
Publication date: 08/01/2024

Spontaneous speaking style exhibits notable differences from other speaking styles due to various spontaneous phenomena (e.g., filled pauses, prolongation) and substantial prosody variation (e.g., diverse pitch and duration variation, occasional non-verbal speech like a smile), posing challenges to modeling and prediction of spontaneous style. Moreover, the scarcity of high-quality spontaneous data constrains spontaneous speech generation for speakers without spontaneous data. To address these problems, we propose SponTTS, a two-stage approach based on neural bottleneck (BN) features to model and transfer spontaneous style for TTS. In the first stage, we adopt a Conditional Variational Autoencoder (CVAE) to capture spontaneous prosody from a BN feature and incorporate the spontaneous phenomena through the constraint of a spontaneous phenomena embedding prediction loss. In addition, we introduce a flow-based predictor to predict a latent spontaneous style representation from the text, which enriches the prosody and context-specific spontaneous phenomena during inference. In the second stage, we adopt a VITS-like module to transfer the spontaneous style learned in the first stage to the target speakers. Experiments demonstrate that SponTTS is effective in modeling spontaneous style and transferring the style to the target speakers, generating spontaneous speech with high naturalness, expressiveness, and speaker similarity. The zero-shot spontaneous style TTS test further verifies the generalization and robustness of SponTTS in generating spontaneous speech for unseen speakers. Comment: 5 pages, 3 figures, Accepted by ICASSP202

arXiv.org e-Print Archive

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation

Author: Chen Yunlin
Guo Haohan
Li Hanzhao
Li Zhifei
Lv Yuanjun
Xie Lei
Xue Liumeng
Yin Hao
Zhu Xinfa
Publication date: 11/06/2024

The multi-codebook speech codec enables the application of large language models (LLM) in TTS but bottlenecks efficiency and robustness due to multi-sequence prediction. To avoid this obstacle, we propose Single-Codec, a single-codebook single-sequence codec, which employs a disentangled VQ-VAE to decouple speech into a time-invariant embedding and a phonetically-rich discrete sequence. Furthermore, the encoder is enhanced with 1) contextual modeling with a BLSTM module to exploit the temporal information, 2) a hybrid sampling module to alleviate distortion from upsampling and downsampling, and 3) a resampling module to encourage discrete units to carry more phonetic information. Compared with multi-codebook codecs, e.g., EnCodec and TiCodec, Single-Codec demonstrates higher reconstruction quality with a lower bandwidth of only 304bps. The effectiveness of Single-Codec is further validated by LLM-TTS experiments, showing improved naturalness and intelligibility. Comment: Accepted by Interspeech 202

arXiv.org e-Print Archive

Polymorphism of Shaoxing breed ducks at microsatellite loci

Author: Chepiha A. M.
Doroshenko M. S.
Konoval O. M.
Korol P. V.
Kostenko S. O.
Liumeng Li
Lizhi Lu
Xuetao Huang
Publication venue: 'Oles Honchar Dnipropetrovsk National University'
Publication date: 28/03/2018

Microsatellite markers are now widely used to detect and describe micropopulation processes that occur in populations of domestic animals under various factors of breeding pressure. Microsatellite loci are distributed throughout eukaryotic genomes, making them the preferred genetic marker for high-resolution genetic mapping. In recent years, rapid advances have been made in the development of molecular genetic maps. High-density linkage maps are now available for many farm animals, such as cattle, pigs, and goats. In contrast, mapping studies in avian species are much less advanced, except in the chicken. According to the FAO, about 70% of ducks are bred in China, making the country a leader in duck production. The Shaoxing breed is one of the three major duck breeds in China. Ducks of this breed are characterized by high performance. According to the Bureau of Product Quality, these birds reach maturity (the beginning of egg laying) at 130–140 days. The peak laying period of the Shaoxing breed lasts from eight to ten months. On average, one duck lays 290 to 310 eggs in 500 days, which is one of the highest rates for egg breeds. The purpose of our study was therefore to conduct a microsatellite analysis of two populations of the Shaoxing breed at 9 loci. Birds were selected for the study at the duck farms of Zhejiang Generation Biological Science and Technology Co., Ltd. and Zhuji Guowei Poultry Development Co, Ltd., and at the laboratory of the Zhejiang Academy of Sciences Institute. Sample collection and DNA preparation: venous blood samples were collected from 480 ducks (240 from population I and 240 from population II of the Shaoxing breed) into 3 ml tubes containing EDTA as an anticoagulant. Of the 9 investigated loci, only one (SMO10) was monomorphic. The number of different alleles (Na) per polymorphic locus ranged from 2 (SMO12) to 13 (APL79, CMO11) in population I and from 2 (APL78, SMO12) to 7 (APL79) in population II. On average, one locus had 5.889 alleles in population I and 3.889 alleles in population II. The effective number of alleles (Ne) was 1.735 in population I and 1.599 in population II. The number of alleles and the expected heterozygosity (Hexp) values can provide important information for the discrimination of individuals and breeds. The expected heterozygosity was 0.336 in population I and 0.307 in population II. The information index (I) was 0.702 in population I and 0.576 in population II. Private alleles were found in each population: 6 in population I and just 4 in population II. The results show a high level of polymorphism in the studied duck populations. The obtained results can be used in the creation of new lines of ducks.
The article presents the results of a study of the genetic structure of two populations of Shaoxing ducks using nine microsatellite loci. The birds were studied at the duck farms of Zhejiang Generation Biological Science and Technology Co., Ltd. and Zhuji Guowei Poultry Development Co, Ltd., with the support of the Poultry Genetics Laboratory of the Zhejiang Academy of Sciences (Zhejiang Province, PRC). The mean number of effective alleles (Ne) per locus was 1.735 in population I and 1.599 in population II. The information index was 0.702 (population I) and 0.576 (population II). The observed heterozygosity was 0.298 in population I and 0.269 in population II. Private alleles were found in each population. Of the 9 loci studied, 6 private alleles were found in population I, whereas population II had only 4 loci. In total, 23 private alleles were found in population I and 5 in population II. The largest number of private alleles was at locus CMO11 (9), and the smallest, 1 allele each, at loci SMO7 and SMO10 in population I. Population II was poorer in private alleles: locus APL79 had 2, and CMO11, SMO7, and SMO10 had 1 each. The results indicate a high level of within-breed polymorphism in Shaoxing ducks, which allows strategies for the conservation and use of duck genetic resources to be developed using analysis of polymorphic microsatellite loci.

Crossref

Scientific Messenger of Lviv University of Veterinary Medicine and Biotechnology / Науковий вісник ЛНУ ветеринарної медицини та біотехнологій

ANALYSIS OF MORPHOMETRIC PARAMETERS DUCK EGGS OF LOCAL BREED SHAOXING

Author: Bindan CHEN
Liumeng LI
Olena SYDORENKO
Pavlyna DZHUS
Publication venue: National and University Library of the Republic of Srpska
Publication date: 10/03/2021

The efficiency of industrial poultry farming, within the optimization of poultry technology, depends on the genetic potential of the flock. The selection features of Shaoxing ducks make this breed well suited to breeding in the People's Republic of China. The study aims to evaluate the morphometric characteristics of eggs from Shaoxing ducks bred on the breeding farm of Zhejiang Generation Biological Science and Technology Co., Ltd in Zhuji, Zhejiang Province, China. The weight, length, and width of the eggs and the egg shape index have been determined. An individual method of counting the eggs laid by Shaoxing ducks over 4 consecutive months has been implemented. The average egg weight is 67.45 ± 0.22 g, with limit values of 89 g (maximum) and 45 g (minimum). The average egg length is 6.02 ± 0.01 cm and the average width is 4.45 ± 0.01 cm. The duck egg shape index is 74.01 ± 0.12. Systematic individual studies of the morphometric parameters of eggs will therefore increase the effect of selection by expanding the indicators used for lifelong assessment of the female breeding stock. Selecting breeding females for the breeding core of the breed according to the morphometric parameters of their eggs will increase the hatching yield of ducklings and, accordingly, will be one of the effective mechanisms for ensuring the economic profitability of breeding Shaoxing ducks.

Crossref

Wetting of SiC by molten Cu–20Me–2Cr (Me=Ag, Mn, Si, and Sn) alloys at 1373 K

Author: Hongyu Yang
Liumeng Li
Lu Liu
Qiaoli Lin
Publication venue: Elsevier BV
Publication date: 01/03/2021

Crossref

Controllable Emotion Transfer For End-to-End Speech Synthesis

Author: Lei Xie
Liumeng Xue
Shan Yang
Tao Li
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 24/01/2021

Crossref

Differences of methanogenesis between mesophilic and thermophilic in situ biogas-upgrading systems by hydrogen addition

Author: Dong Li
Liumeng Chen
Qin Cao
Xianpu Zhu
Xiaofeng Liu
Yichao Chen
Publication venue: Oxford University Press (OUP)
Publication date: 01/11/2019

To investigate the differences in microbial community structure between mesophilic and thermophilic in situ biogas-upgrading systems by H2 addition, two reactors (35 °C and 55 °C) were run for four stages according to different H2 addition rates (H2/CO2 of 0:1, 1:1, and 4:1) and mixing modes (intermittent and continuous). 16S rRNA gene-sequencing technology was applied to analyze microbial community structure. The results showed that temperature is a crucial factor affecting the succession of microbial community structure and the H2 utilization pathway. For mesophilic digestion, most of the added H2 was consumed indirectly by the combination of homoacetogens and strict aceticlastic methanogens. In the thermophilic system, most of the added H2 may have been used for microbial cell growth, and part of the H2 was utilized directly by strict hydrogenotrophic methanogens and facultative aceticlastic methanogens. Continuous stirring was harmful to the stabilization of the mesophilic system, but not to the thermophilic one.

Crossref

Surface water changes in China's Yangtze River Delta over the past forty years

Author: Haitao Zhang
Jialin Li
Liumeng Chen
Peng Tian
Yongchao Liu
Publication venue: Elsevier BV
Publication date: 01/04/2023

Crossref