An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
Generative Adversarial Network (GAN) based vocoders are superior in both
inference speed and synthesis quality when reconstructing an audible waveform
from an acoustic representation. This study focuses on improving the
discriminator for GAN-based vocoders. Most existing Time-Frequency
Representation (TFR)-based discriminators are rooted in Short-Time Fourier
Transform (STFT), which has a constant Time-Frequency (TF) resolution,
linearly scaled center frequencies, and a fixed decomposition basis, making it
incompatible with signals like singing voices that require dynamic attention
for different frequency bands and different time intervals. Motivated by that,
we propose a Multi-Scale Sub-Band Constant-Q Transform (MS-SB-CQT)
discriminator and a Multi-Scale Temporal-Compressed Continuous Wavelet
Transform (MS-TC-CWT) discriminator. Both CQT and CWT have a dynamic TF
resolution for different frequency bands. Of the two, CQT is better at
modeling pitch information, whereas CWT is better at modeling
short-time transients. Experiments conducted on both speech and singing voices
confirm the effectiveness of our proposed discriminators. Moreover, the STFT,
CQT, and CWT-based discriminators can be used jointly for better performance.
The proposed discriminators can boost the synthesis quality of various
state-of-the-art GAN-based vocoders, including HiFi-GAN, BigVGAN, and APNet.
Comment: arXiv admin note: text overlap with arXiv:2311.1495
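The contrast between linearly scaled STFT center frequencies and the frequency-dependent resolution of the CQT can be illustrated with a minimal sketch. It uses the standard textbook definitions (STFT bin k at k * sample_rate / n_fft, CQT bin k at f_min * 2^(k / B)). The function names and the parameter values (24 kHz sample rate, 1024-point FFT, 12 bins per octave, f_min of 32.7 Hz) are assumptions chosen for illustration, not settings reported in the abstract.

```python
def stft_center_frequencies(sample_rate, n_fft):
    """Linearly spaced bin centers of an STFT: k * sample_rate / n_fft."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def cqt_center_frequencies(f_min, bins_per_octave, n_bins):
    """Geometrically spaced bin centers of a CQT: f_min * 2**(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def cqt_q_factor(bins_per_octave):
    """Ratio of center frequency to bandwidth; identical for every CQT bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


# Hypothetical settings, for illustration only.
stft = stft_center_frequencies(sample_rate=24000, n_fft=1024)
cqt = cqt_center_frequencies(f_min=32.7, bins_per_octave=12, n_bins=48)

# STFT: the spacing between neighboring bins is the same everywhere.
stft_spacing_low = stft[1] - stft[0]
stft_spacing_high = stft[-1] - stft[-2]

# CQT: the spacing grows with frequency, so low bands get finer frequency
# resolution and high bands get finer time resolution.
cqt_spacing_low = cqt[1] - cqt[0]
cqt_spacing_high = cqt[-1] - cqt[-2]
```

With these definitions every STFT bin is the same width, while each CQT octave contains the same number of bins, which matches the spacing of musical pitch.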
SponTTS: modeling and transferring spontaneous style for TTS
Spontaneous speaking style exhibits notable differences from other speaking
styles due to various spontaneous phenomena (e.g., filled pauses, prolongation)
and substantial prosody variation (e.g., diverse pitch and duration variation,
occasional non-verbal speech like a smile), posing challenges to modeling and
prediction of spontaneous style. Moreover, the scarcity of high-quality
spontaneous data constrains spontaneous speech generation for speakers without
spontaneous data. To address these problems, we propose SponTTS, a two-stage
approach based on neural bottleneck (BN) features to model and transfer
spontaneous style for TTS. In the first stage, we adopt a Conditional
Variational Autoencoder (CVAE) to capture spontaneous prosody from a BN feature
and incorporate spontaneous phenomena through the constraint of a spontaneous
phenomena embedding prediction loss. In addition, we introduce a flow-based
predictor to predict a latent spontaneous style representation from the text,
which enriches the prosody and context-specific spontaneous phenomena during
inference. In the second stage, we adopt a VITS-like module to transfer the
spontaneous style learned in the first stage to the target speakers.
Experiments demonstrate that SponTTS is effective in modeling spontaneous style
and transferring the style to the target speakers, generating spontaneous
speech with high naturalness, expressiveness, and speaker similarity. The
zero-shot spontaneous style TTS test further verifies the generalization and
robustness of SponTTS in generating spontaneous speech for unseen speakers.
Comment: 5 pages, 3 figures, Accepted by ICASSP202
Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation
The multi-codebook speech codec enables the application of large language
models (LLM) in TTS but bottlenecks efficiency and robustness due to
multi-sequence prediction. To avoid this obstacle, we propose Single-Codec, a
single-codebook single-sequence codec, which employs a disentangled VQ-VAE to
decouple speech into a time-invariant embedding and a phonetically-rich
discrete sequence. Furthermore, the encoder is enhanced with 1) contextual
modeling with a BLSTM module to exploit the temporal information, 2) a hybrid
sampling module to alleviate distortion from upsampling and downsampling, and
3) a resampling module to encourage discrete units to carry more phonetic
information. Compared with multi-codebook codecs, e.g., EnCodec and TiCodec,
Single-Codec demonstrates higher reconstruction quality with a lower bandwidth
of only 304 bps. The effectiveness of Single-Codec is further validated by
LLM-TTS experiments, showing improved naturalness and intelligibility.
Comment: Accepted by Interspeech 202
Polymorphism of Shaoxing breed ducks at microsatellite loci
Microsatellite markers are now widely used to detect and describe micropopulation processes that occur in populations of domestic animals under various breeding pressures. Microsatellite loci are distributed throughout eukaryotic genomes, which makes them the preferred genetic marker for high-resolution genetic mapping. In recent years, rapid advances have been made in the development of molecular genetic maps. High-density linkage maps are now available for many farm animals, such as cattle, pigs, and goats. In contrast, mapping studies in avian species are much less advanced, except in the chicken. According to the FAO, about 70% of ducks are bred in China, which makes the country a leader in duck farming. The Shaoxing breed is one of the three major duck breeds in China. Ducks of this breed are characterized by high performance. According to the Bureau of Product Quality, these birds reach maturity (the beginning of egg laying) at 130–140 days. The peak laying period of the Shaoxing breed lasts from eight to ten months. On average, one duck lays 290 to 310 eggs in 500 days, which is one of the highest rates among egg breeds. The purpose of our study was therefore a microsatellite analysis of two populations of the Shaoxing breed at 9 loci. Birds were selected for the study on the duck farms of Zhejiang Generation Biological Science and Technology Co., Ltd. and Zhuji Guowei Poultry Development Co, Ltd., and at the laboratory of the Zhejiang Academy of Sciences Institute. Sample collection and DNA preparation: venous blood samples were collected from 480 ducks (240 ducks of population I and 240 ducks of population II of the Shaoxing breed) into 3 ml tubes containing EDTA as an anticoagulant. Of the 9 loci investigated in the Shaoxing breed population, only one locus was monomorphic (SMO10).
The number of different alleles (Na) for each polymorphic locus ranged from 2 (SMO12) to 13 (APL79, CMO11) in population I and from 2 (APL78, SMO12) to 7 (APL79) in population II. On average, one locus had 5.889 alleles in population I and 3.889 alleles in population II. The effective number of alleles (Ne) was 1.735 in population I and 1.599 in population II. The number of alleles and the expected heterozygosity (Hexp) values can provide important information for the discrimination of individuals and breeds. The expected heterozygosity was 0.336 in population I and 0.307 in population II. The information index (I) was 0.702 in population I and 0.576 in population II. Private alleles were found in each population: 6 alleles in population I and just 4 alleles in population II. The results show a high level of polymorphism in the studied duck populations. The results can be used in the creation of new lines of ducks.
The article presents the results of a study of the genetic structure of two populations of Shaoxing ducks using nine microsatellite loci. The birds were studied on the duck farms of Zhejiang Generation Biological Science and Technology Co., Ltd. and Zhuji Guowei Poultry Development Co, Ltd., with the support of the Poultry Genetics Laboratory of the Zhejiang Academy of Sciences (Zhejiang Province, PRC). The mean effective number of alleles (Ne) per locus was 1.735 in population I and 1.599 in population II. The information index was 0.702 (population I) and 0.576 (population II). The observed heterozygosity was 0.298 in population I and 0.269 in population II. Private alleles were identified in each population. Of the 9 loci studied, 6 private alleles were found in population I, while population II had only 4 loci. In total, 23 private alleles were found in population I and 5 in population II.
The largest number of private alleles was at locus CMO11 (9), and the smallest, 1 allele each, at loci SMO7 and SMO10 in population I. Population II was poorer in private alleles: locus APL79 had 2, and CMO11, SMO7, and SMO10 had 1 each. The results indicate a high level of within-breed polymorphism in Shaoxing ducks, which allows strategies for the conservation and use of duck genetic resources to be developed based on the analysis of polymorphic microsatellite loci.
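The per-locus statistics named in this abstract (Na, Ne, Hexp, I) can be sketched from the allele frequencies at a single locus. The sketch assumes the commonly used definitions Ne = 1 / sum(p^2), Hexp = 1 - sum(p^2), and Shannon's I = -sum(p ln p). The abstract does not state which formulas or software the authors used, so this is an assumption, and the allele frequencies below are hypothetical values, not data from the study.

```python
import math


def diversity_indices(allele_freqs):
    """Per-locus diversity statistics from allele frequencies or counts.

    Assumed definitions (not confirmed by the abstract):
      Na   = number of different alleles observed
      Ne   = 1 / sum(p_i^2), the effective number of alleles
      Hexp = 1 - sum(p_i^2), the expected heterozygosity
      I    = -sum(p_i * ln p_i), Shannon's information index
    """
    total = sum(allele_freqs)
    # Normalize so that counts work as well as frequencies; drop absent alleles.
    p = [f / total for f in allele_freqs if f > 0]
    homozygosity = sum(x * x for x in p)
    return {
        "Na": len(p),
        "Ne": 1.0 / homozygosity,
        "Hexp": 1.0 - homozygosity,
        "I": -sum(x * math.log(x) for x in p),
    }


# Hypothetical locus with three alleles.
result = diversity_indices([0.7, 0.2, 0.1])

# A monomorphic locus (one allele, like SMO10 in the abstract) has no diversity.
monomorphic = diversity_indices([1.0])
```

The population-level values in the abstract are means over loci, so they cannot be reproduced from a single set of frequencies. The sketch only shows how each per-locus quantity relates to the frequencies.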
ANALYSIS OF MORPHOMETRIC PARAMETERS DUCK EGGS OF LOCAL BREED SHAOXING
The efficiency of industrial poultry farming, as poultry technology is optimized, depends on the genetic potential of the flock. The selection traits of Shaoxing ducks make the breed well suited to breeding in the People's Republic of China. The study aims to evaluate the morphometric characteristics of eggs from Shaoxing ducks bred on the breeding farm of Zhejiang Generation Biological Science and Technology Co., Ltd in Zhuji, Zhejiang Province, China. The weight, length, and width of the eggs and the egg shape index were determined. The number of eggs laid by each Shaoxing duck was counted individually over 4 consecutive months. The average egg weight is 67.45 ± 0.22 g, with a maximum of 89 g and a minimum of 45 g. The average egg length is 6.02 ± 0.01 cm and the average width is 4.45 ± 0.01 cm. The egg shape index is 74.01 ± 0.12. Systematic individual studies of egg morphometric parameters will thus increase the effect of selection by expanding the indicators used for lifelong assessment of the breeding female stock. Selecting females for the breeding core of the breed according to the technological suitability of the morphometric parameters of their eggs will increase the hatching yield of ducklings and, accordingly, will be one of the effective mechanisms for ensuring the economic profitability of breeding Shaoxing ducks.
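The egg shape index reported above can be related to the length and width measurements with a short sketch. It assumes the conventional definition, index = 100 × width / length. The abstract does not state the formula, so that definition and the function name are assumptions.

```python
def egg_shape_index(width_cm, length_cm):
    """Egg shape index as a percentage: 100 * width / length (assumed definition)."""
    if length_cm <= 0:
        raise ValueError("length must be positive")
    return 100.0 * width_cm / length_cm


# Mean width and length taken from the abstract.
index_from_means = egg_shape_index(width_cm=4.45, length_cm=6.02)
```

The index computed from the two means comes out near 73.9, slightly below the reported 74.01. A small gap of this kind is expected if the reported value is the mean of per-egg indices and not the index of the mean dimensions, although the abstract does not say which was used.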
Differences of methanogenesis between mesophilic and thermophilic in situ biogas-upgrading systems by hydrogen addition
To investigate the differences in microbial community structure between mesophilic and thermophilic in situ biogas-upgrading systems with H2 addition, two reactors (35 °C and 55 °C) were run for four stages with different H2 addition rates (H2/CO2 of 0:1, 1:1, and 4:1) and mixing modes (intermittent and continuous). 16S rRNA gene sequencing was applied to analyze the microbial community structure. The results showed that temperature is a crucial factor affecting the succession of the microbial community structure and the H2 utilization pathway. In mesophilic digestion, most of the added H2 was consumed indirectly by the combination of homoacetogens and strict aceticlastic methanogens. In the thermophilic system, most of the added H2 may have been used for microbial cell growth, and part of the H2 was utilized directly by strict hydrogenotrophic methanogens and facultative aceticlastic methanogens. Continuous stirring was harmful to the stability of the mesophilic system, but not to that of the thermophilic one.
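The direct and indirect H2 utilization pathways described above can be written out as standard textbook stoichiometry. The reactions below are general biochemistry and are not equations taken from the paper. They show why 4:1 is the stoichiometric H2/CO2 ratio, and that the indirect route has the same net reaction as the direct one.

```latex
% Direct route: hydrogenotrophic methanogenesis
4\,\mathrm{H_2} + \mathrm{CO_2} \;\rightarrow\; \mathrm{CH_4} + 2\,\mathrm{H_2O}

% Indirect route, step 1: homoacetogenesis
4\,\mathrm{H_2} + 2\,\mathrm{CO_2} \;\rightarrow\; \mathrm{CH_3COOH} + 2\,\mathrm{H_2O}

% Indirect route, step 2: aceticlastic methanogenesis
\mathrm{CH_3COOH} \;\rightarrow\; \mathrm{CH_4} + \mathrm{CO_2}

% Sum of steps 1 and 2: identical to the direct route
4\,\mathrm{H_2} + \mathrm{CO_2} \;\rightarrow\; \mathrm{CH_4} + 2\,\mathrm{H_2O}
```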
