
    The Score-Difference Flow for Implicit Generative Modeling

    Implicit generative modeling (IGM) aims to produce samples of synthetic data matching the characteristics of a target data distribution. Recent work (e.g. score-matching networks, diffusion models) has approached the IGM problem from the perspective of pushing synthetic source data toward the target distribution via dynamical perturbations or flows in the ambient space. In this direction, we present the score difference (SD) between arbitrary target and source distributions as a flow that optimally reduces the Kullback-Leibler divergence between them while also solving the Schroedinger bridge problem. We apply the SD flow to convenient proxy distributions, which are aligned if and only if the original distributions are aligned. We demonstrate the formal equivalence of this formulation to denoising diffusion models under certain conditions. We also show that the training of generative adversarial networks includes a hidden data-optimization sub-problem, which induces the SD flow under certain choices of loss function when the discriminator is optimal. As a result, the SD flow provides a theoretical link between model classes that individually address the three challenges of the "generative modeling trilemma" -- high sample quality, mode coverage, and fast sampling -- thereby setting the stage for a unified approach. Comment: 25 pages, 5 figures, 4 tables. To appear in Transactions on Machine Learning Research (TMLR

    Audio Feature Extraction with Convolutional Neural Autoencoders with Application to Voice Conversion

    Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In this paper, we introduce a convolutional auto-encoder (CAE) to extract features from speech represented via a proposed short-time discrete cosine transform (STDCT). We then introduce a deep neural mapping at the encoding bottleneck to enable converting a source speaker’s speech to a target speaker’s speech while preserving the source-speech content. We further compare this approach to clustering-based and linear mappings

    CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling

    While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal in inference and offer an improved sampling strategy for diffusion models that can increase generation diversity, especially at high guidance scales, with minimal loss of sample quality. Our sampling strategy anneals the conditioning signal by adding scheduled, monotonically decreasing Gaussian noise to the conditioning vector during inference to balance diversity and condition alignment. Our Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm, and we show that it boosts the diversity of diffusion models in various conditional generation tasks. Further, using an existing pretrained diffusion model, CADS achieves a new state-of-the-art FID of 1.70 and 2.31 for class-conditional ImageNet generation at 256×256 and 512×512 respectively. Comment: Published as a conference paper at ICLR 202
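    As a rough illustration of the annealing idea in the CADS abstract above, the sketch below mixes a conditioning vector with Gaussian noise whose weight shrinks as sampling proceeds. The function names, the piecewise-linear schedule, the thresholds `tau1` and `tau2`, and `noise_scale` are all assumptions made for this example; the abstract does not specify them, and this should not be read as the authors' implementation.

```python
import math
import random


def gamma_schedule(t, tau1=0.6, tau2=0.9):
    """Hypothetical schedule: t runs from 1 (start of sampling) to 0 (end).

    Returns 0 early in sampling (condition fully replaced by noise)
    and 1 late in sampling (condition left untouched).
    """
    if t <= tau1:
        return 1.0
    if t >= tau2:
        return 0.0
    return (tau2 - t) / (tau2 - tau1)


def noise_std(t, noise_scale=0.25):
    """Standard deviation of the noise added to the condition at time t."""
    return noise_scale * math.sqrt(1.0 - gamma_schedule(t))


def anneal_condition(y, t, noise_scale=0.25, rng=random):
    """Blend the conditioning vector y with Gaussian noise (assumed form)."""
    g = gamma_schedule(t)
    std = noise_scale * math.sqrt(1.0 - g)
    return [math.sqrt(g) * yi + std * rng.gauss(0.0, 1.0) for yi in y]


# Early steps see a heavily perturbed condition, late steps the clean one.
y = [1.0, -2.0]
for t in (1.0, 0.75, 0.0):
    print(t, anneal_condition(y, t))
```

    With this assumed schedule, the noise level is non-increasing as `t` goes from 1 to 0, so the final sampling steps receive the unmodified condition.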

    LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

    Advances in latent diffusion models (LDMs) have revolutionized high-resolution image generation, but the design space of the autoencoder that is central to these systems remains underexplored. In this paper, we introduce LiteVAE, a family of autoencoders for LDMs that leverage the 2D discrete wavelet transform to enhance scalability and computational efficiency over standard variational autoencoders (VAEs) with no sacrifice in output quality. We also investigate the training methodologies and the decoder architecture of LiteVAE and propose several enhancements that improve the training dynamics and reconstruction quality. Our base LiteVAE model matches the quality of the established VAEs in current LDMs with a six-fold reduction in encoder parameters, leading to faster training and lower GPU memory requirements, while our larger model outperforms VAEs of comparable complexity across all evaluated metrics (rFID, LPIPS, PSNR, and SSIM)

    Relative Age Effects Across and Within Female Sport Contexts: A Systematic Review and Meta-Analysis

    Subtle differences in chronological age within sport (bi-) annual-age groupings can contribute to immediate participation and long-term attainment discrepancies; known as the relative age effect. Voluminous studies have examined relative age effects in male sport; however, their prevalence and context-specific magnitude in female sport remain undetermined. The objective of this study was to determine the prevalence and magnitude of relative age effects in female sport via examination of published data spanning 1984–2016. Registered with PROSPERO (No. 42016053497) and using Preferred Reporting Items for Systematic Reviews and Meta-analysis systematic search guidelines, 57 studies were identified, containing 308 independent samples across 25 sports. Distribution data were synthesised using odds ratio meta-analyses, applying an invariance random-effects model. Follow-up subgroup category analyses examined whether relative age effect magnitudes were moderated by age group, competition level, sport type, sport context and study quality. When comparing the relatively oldest (quartile 1) vs. youngest (quartile 4) individuals across all female sport contexts, the overall pooled estimate identified a significant but small relative age effect (odds ratio=1.25; 95% confidence interval 1.21–1.30; p=0.01; odds ratio adjusted=1.21). Subgroup analyses revealed the relative age effect magnitude was higher in pre-adolescent (≤ 11 years) and adolescent (12–14 years) age groups and at higher competition levels. Relative age effect magnitudes were higher in team-based and individual sport contexts associated with high physiological demands. The findings highlight relative age effects are prevalent across the female sport contexts examined. Relative age effect magnitude is moderated by interactions between developmental stages, competition level and sport context demands. 
    Modifications to sport policy, organisational and athlete development system structure, as well as practitioner intervention are recommended to prevent relative age effect-related participation and longer term attainment inequalities

    No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models

    Classifier-free guidance (CFG) has become the standard method for enhancing the quality of conditional diffusion models. However, employing CFG requires either training an unconditional model alongside the main diffusion model or modifying the training procedure by periodically inserting a null condition. There is also no clear extension of CFG to unconditional models. In this paper, we revisit the core principles of CFG and introduce a new method, independent condition guidance (ICG), which provides the benefits of CFG without the need for any special training procedures. Our approach streamlines the training process of conditional diffusion models and can also be applied during inference on any pre-trained conditional model. Additionally, by leveraging the time-step information encoded in all diffusion networks, we propose an extension of CFG, called time-step guidance (TSG), which can be applied to any diffusion model, including unconditional ones. Our guidance techniques are easy to implement and have the same sampling cost as CFG. Through extensive experiments, we demonstrate that ICG matches the performance of standard CFG across various conditional diffusion models. Moreover, we show that TSG improves generation quality in a manner similar to CFG, without relying on any conditional information
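    For readers unfamiliar with the guidance step this abstract refers to, the sketch below shows the standard CFG combination of two noise predictions. The function name `cfg_combine` and the use of plain Python lists are choices made for this example. According to the abstract, ICG and TSG obtain the second prediction without special training; how they construct it is not described here, so the function simply takes it as an input.

```python
def cfg_combine(eps_cond, eps_uncond, guidance_scale):
    """Standard classifier-free guidance update.

    Moves the prediction away from the reference (unconditional)
    output and toward the conditional one:
        eps = eps_uncond + w * (eps_cond - eps_uncond)
    """
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# A scale of 1 recovers the conditional prediction; larger scales extrapolate.
eps_c = [1.0, 2.0]
eps_u = [0.0, 1.0]
print(cfg_combine(eps_c, eps_u, 1.0))
print(cfg_combine(eps_c, eps_u, 3.0))
```

    Both predictions are evaluated once per sampling step, which is consistent with the abstract's statement that the proposed techniques have the same sampling cost as CFG.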

    Differences in Race Characteristics between World-Class Individual-Medley and Stroke-Specialist Swimmers

    The purpose of the present study was to investigate differences between world-class individual medley (IM) swimmers and stroke-specialists using race analyses. A total of eighty 200 m races (8 finalists × 2 sexes × 5 events) at the 2021 European long-course swimming championships were analysed. Eight digital video cameras recorded the races, and the video footage was manually analysed to obtain underwater distance, underwater time, and underwater speed, as well as clean-swimming speed, stroke rate, and distance per stroke. Each lap of the IM races was compared with the first, second, third, and fourth laps of butterfly, backstroke, breaststroke, and freestyle races, respectively. Differences between IM swimmers and specialists in each analysed variable were assessed using an independent-sample t-test, and the effects of sex and stroke on the differences were analysed using a two-way analysis of variance with relative values (IM swimmers’ score relative to the mean specialists’ score) as dependent variables. Breaststroke specialists showed faster clean-swimming speed and longer distance per stroke than IM swimmers for both males (clean-swimming speed: p = 0.011; distance per stroke: p = 0.023) and females (clean-swimming speed: p = 0.003; distance per stroke: p = 0.036). For backstroke and front crawl, specialists exhibited faster underwater speeds than IM swimmers (all p < 0.001). Females showed faster relative speeds during butterfly clean-swimming segments (p < 0.001) and breaststroke underwater segments than males (p = 0.028). IM swimmers should focus especially on breaststroke training, particularly aiming to improve their distance per stroke

    Ionic liquids at electrified interfaces

    Until recently, “room-temperature” (<100–150 °C) liquid-state electrochemistry was mostly electrochemistry of diluted electrolytes(1-4) where dissolved salt ions were surrounded by a considerable amount of solvent molecules. Highly concentrated liquid electrolytes were mostly considered in the narrow (albeit important) niche of high-temperature electrochemistry of molten inorganic salts(5-9) and in the even narrower niche of “first-generation” room temperature ionic liquids, RTILs (such as chloro-aluminates and alkylammonium nitrates).(10-14) The situation changed dramatically in the 2000s after the discovery of new moisture- and temperature-stable RTILs.(15, 16) These days, the “later generation” RTILs have attracted wide attention within the electrochemical community.(17-31) Indeed, RTILs, as a class of compounds, possess a unique combination of properties (high charge density, electrochemical stability, low/negligible volatility, tunable polarity, etc.) that make them very attractive substances from fundamental and application points of view.(32-38) Most importantly, they can mix with each other in “cocktails” of one’s choice to acquire the desired properties (e.g., wider temperature range of the liquid phase(39, 40)) and can serve as almost “universal” solvents.(37, 41, 42) It is worth noting here one of the advantages of RTILs as compared to their high-temperature molten salt (HTMS)(43) “sister-systems”.(44) In RTILs the dissolved molecules are not embedded in a harsh high-temperature environment which could be destructive for many classes of fragile (organic) molecules

    Synthesis and Characterization of Cobalt and Nitrogen Co-Doped Peat-Derived Carbon Catalysts for Oxygen Reduction in Acidic Media

    In this study, several peat-derived carbons (PDC) were synthesized using various carbonization protocols. It was found that, depending on the carbonization method, carbons with very different surface morphologies, elemental compositions, porosities, and oxygen reduction reaction (ORR) activities were obtained. Five carbons were used as carbon supports to synthesize Co N PDC catalysts, and five different ORR catalysts were acquired. The surface analysis revealed that a higher nitrogen content, a higher number of surface oxide defects, and a higher specific surface area lead to higher ORR activity of the Co N PDC catalysts in acidic solution. The catalyst Co N C 2 ZnCl2, which was synthesized from ZnCl2-activated and pyrolyzed peat, showed the highest ORR activity in both rotating disk electrode and polymer electrolyte membrane fuel cell tests. A maximum power density value of 210 mW cm⁻² has been obtained. The results of this study indicate that PDCs are promising candidates for the synthesis of active non-platinum group metal type catalyst