Role of pressure in the dynamics of intense velocity gradients in turbulent flows
We investigate the role of pressure, via its Hessian tensor H, on the amplification of vorticity and strain rate, and contrast it with other inviscid nonlinear mechanisms. Results are obtained from direct numerical simulations of isotropic turbulence with Taylor-scale Reynolds number in the range 140–1300. Decomposing H into local isotropic (HI) and non-local deviatoric (HD) components reveals that HI depletes vortex stretching, whereas HD enables it, with the former effect slightly stronger. The resulting inhibition is significantly weaker than the nonlinear mechanism, which always enables vortex stretching. However, in regions of intense vorticity, identified using conditional statistics, the contribution from H prevails over nonlinearity, leading to overall depletion of vortex stretching. We also observe near-perfect alignment between vorticity and the eigenvector of H corresponding to the smallest eigenvalue, which conforms with the well-known picture of vortex tubes. We discuss the connection between this depletion, essentially due to (local) HI, and the recently identified self-attenuation mechanism (Buaria et al., Nat. Commun., vol. 11, 2020, p. 5852), whereby intense vorticity is locally attenuated through inviscid effects. In contrast, the influence of H on strain amplification is weak. Together with vortex stretching, it opposes strain self-amplification, but its effect is much weaker than that of vortex stretching. Correspondingly, the eigenvectors of strain and H do not exhibit any strong alignments. For all results, the dependence on Reynolds number is very weak. In addition to the fundamental insights, our work provides useful data and validation benchmarks for future modelling endeavours, for instance in Lagrangian modelling of velocity gradient dynamics, where conditional H is explicitly modelled.
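As a notational aid (schematic only; the symbols below are our shorthand, not quotations from the abstract), the decomposition referenced above can be written as

```latex
H_{ij} = \frac{\partial^2 p}{\partial x_i \partial x_j},
\qquad
H_{ij} = \underbrace{\tfrac{1}{3}\,\nabla^2 p\,\delta_{ij}}_{H^{I}_{ij}\ \text{(local, isotropic)}}
\;+\;
\underbrace{\Big(H_{ij} - \tfrac{1}{3}\,\nabla^2 p\,\delta_{ij}\Big)}_{H^{D}_{ij}\ \text{(non-local, deviatoric)}},
```

so that, schematically, the pressure contribution to vortex stretching enters through a term of the form −ω_i H_ij ω_j in the budget of ω_i S_ij ω_j, with ω the vorticity and S the strain-rate tensor.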
Lagrangian acceleration and its Eulerian decompositions in fully developed turbulence
We study the properties of various Eulerian contributions to fluid particle acceleration by using well-resolved direct numerical simulations of isotropic turbulence, with the Taylor-scale Reynolds number Rλ in the range 140–1300. The variance of convective acceleration, when normalized by Kolmogorov scales, increases as Rλ, consistent with simple theoretical arguments, but differing from classical Kolmogorov phenomenology, as well as the Lagrangian extension of Eulerian multifractal models. The scaling of the local acceleration is also linear in Rλ to the leading order, but more complex in detail. The strong cancellation between the local and convective acceleration, faithful to the random sweeping hypothesis, results in the variance of the Lagrangian acceleration increasing only as Rλ^0.25, as recently shown by Buaria and Sreenivasan [Phys. Rev. Lett. 128, 234502 (2022)]. The acceleration variance is dominated by the irrotational pressure gradient contribution, whose variance essentially follows the Rλ^0.25 scaling; the solenoidal viscous contributions are comparatively small and follow Rλ^0.13, which is the only acceleration component consistent with the multifractal prediction.
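For context, a schematic of the Eulerian decompositions discussed above, in our notation (the normalization by the Kolmogorov acceleration a_K is our assumption about the convention used):

```latex
a_i \;=\; \underbrace{\frac{\partial u_i}{\partial t}}_{\text{local}}
     \;+\; \underbrace{u_j \frac{\partial u_i}{\partial x_j}}_{\text{convective}}
     \;=\; \underbrace{-\,\frac{\partial p}{\partial x_i}}_{\text{irrotational (pressure)}}
     \;+\; \underbrace{\nu\,\nabla^2 u_i}_{\text{solenoidal (viscous)}},
\qquad a_K = \epsilon^{3/4}\,\nu^{-1/4},
```

with the reported scalings then reading ⟨a_conv²⟩/a_K² ~ Rλ and ⟨a²⟩/a_K² ~ Rλ^0.25.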
Universality of extreme events in turbulent flows
The universality of small scales, a cornerstone of turbulence, has been nominally confirmed for low-order mean-field statistics, such as the energy spectrum. However, small scales exhibit strong intermittency, exemplified by the formation of extreme events which deviate anomalously from a mean-field description. Here, we investigate the universality of small scales by analyzing extreme events of velocity gradients in different turbulent flows, viz., direct numerical simulations of homogeneous isotropic turbulence, inhomogeneous channel flow, and laboratory measurements in a von Kármán mixing tank. We demonstrate that the scaling exponents of velocity gradient moments, as a function of Reynolds number (Re), are universal, in agreement with previous studies at lower Re, and further show that even the proportionality constants are universal when one moment order is considered as a function of another. Additionally, by comparing various unconditional and conditional statistics across different flows, we demonstrate that the structure of the velocity gradient tensor is also universal. Overall, our findings provide compelling evidence that even extreme events are universal, with profound implications for turbulence theory and modeling.
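Schematically (our notation, not the paper's), the universality statement concerns relations of the form

```latex
M_{2m} \equiv \frac{\langle |\nabla u|^{2m} \rangle}{\langle |\nabla u|^{2} \rangle^{m}} \;\sim\; c_m\,Re^{\gamma_m},
\qquad\text{and, eliminating } Re,\qquad
M_{2m} \;\propto\; M_{2q}^{\,\gamma_m/\gamma_q},
```

where universality means that the exponents γ_m and, in the second form, even the proportionality constants are the same across the flows considered.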
Dissipation range of the energy spectrum in high Reynolds number turbulence
We seek to understand the kinetic energy spectrum in the dissipation range of fully developed turbulence. The data are obtained by direct numerical simulations (DNS) of forced Navier-Stokes equations in a periodic domain, for Taylor-scale Reynolds numbers up to Rλ = 650, with excellent small-scale resolution of kmax η ≈ 6, and additionally at Rλ = 1300 with kmax η ≈ 3, where kmax is the maximum resolved wave number and η is the Kolmogorov length scale. We find that, over a limited range of wave numbers k past the bottleneck (0.15 ≲ kη), the observed spectral behavior differs from that expected at kη > 1, where analytical arguments as well as DNS data with superfine resolution [S. Khurshid et al., Phys. Rev. Fluids 3, 082601 (2018)] suggest a simple exp(−kη) dependence. We briefly discuss our results in connection to the multifractal model.
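For reference (standard definitions and the functional form mentioned above; the constant c is generic, not a fitted value from the paper):

```latex
\eta = \left(\frac{\nu^3}{\langle \epsilon \rangle}\right)^{1/4},
\qquad
E(k) \;\propto\; \exp\!\left(-c\,k\eta\right) \quad \text{for } k\eta > 1,
```

i.e., the question is whether and where the DNS spectra conform to this simple exponential decay in the far dissipation range.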
A highly scalable particle tracking algorithm using partitioned global address space (PGAS) programming for extreme-scale turbulence simulations
A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions, such that all of the interpolation information needed for each particle is available either locally on its host process or on neighboring processes holding adjacent subdomains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring the transfer of particle properties between adjacent processes, for particles migrating across subdomain boundaries, is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in an 8192^3 simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster than a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale. Improving support for PGAS models in major compilers suggests that this algorithm will be widely applicable on upcoming supercomputers.
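As an illustration of the central idea (a minimal C++ sketch, not the authors' Co-Array Fortran implementation; all names and the process-grid layout are hypothetical), the dynamic distribution of particles rests on mapping each particle's instantaneous position to the rank owning the enclosing subdomain:

```cpp
// Minimal sketch: map a particle's instantaneous position to the rank that owns
// the enclosing subdomain of a periodic box decomposed on a 3-D process grid.
#include <array>
#include <cmath>
#include <cstdio>

struct Decomposition {
    double box_length;               // periodic domain size (e.g., 2*pi)
    std::array<int, 3> proc_grid;    // number of subdomains in each direction
};

// Linear rank of the subdomain containing position x.
int owner_rank(const std::array<double, 3>& x, const Decomposition& d) {
    std::array<int, 3> c{};
    for (int i = 0; i < 3; ++i) {
        // wrap into [0, L) to respect periodicity, then bin onto the process grid
        double xi = std::fmod(std::fmod(x[i], d.box_length) + d.box_length, d.box_length);
        c[i] = static_cast<int>(xi / d.box_length * d.proc_grid[i]);
        if (c[i] == d.proc_grid[i]) c[i] = d.proc_grid[i] - 1;  // guard against round-off
    }
    // row-major linearization of the 3-D process grid
    return (c[0] * d.proc_grid[1] + c[1]) * d.proc_grid[2] + c[2];
}

int main() {
    const Decomposition d{2.0 * std::acos(-1.0), {8, 8, 4}};  // hypothetical 256-rank layout
    const std::array<double, 3> particle{1.3, 5.9, 0.2};
    std::printf("particle owned by rank %d\n", owner_rank(particle, d));
    return 0;
}
```

Spline coefficients needed for interpolation near subdomain boundaries would then be fetched from neighboring ranks, which is where the one-sided PGAS communication described above comes in.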
A Lagrangian study of turbulent mixing: forward and backward dispersion of molecular trajectories in isotropic turbulence
Statistics of the trajectories of molecules diffusing via Brownian motion in a turbulent flow are extracted from simulations of stationary isotropic turbulence, using a postprocessing approach applicable in both forward and backward reference frames. Detailed results are obtained for Schmidt numbers (Sc) from 0.001 to 1000 at Taylor-scale Reynolds numbers up to 1000. The statistics of displacements of single molecules compare well with the earlier theoretical work of Saffman (J. Fluid Mech., vol. 8, 1960, pp. 273–283), except for the scaling of the integral time scale of the fluid velocity following the molecular trajectories. For molecular pairs we extend Saffman's theory to include pairs of small but finite initial separation; the extended theory is in excellent agreement with numerical results provided that data are collected at sufficiently small times. At intermediate times the separation statistics of molecular pairs exhibit a more robust Richardson scaling behaviour than for the fluid particles. The forward scaling constant is very close to 0.55, whereas the backward constant is approximately 1.53–1.57, with a weak Schmidt number dependence, although no scaling exists for very low Sc at the Reynolds numbers presently accessible. An important innovation in this work is to demonstrate explicitly the practical utility of a Lagrangian description of turbulent mixing, where molecular displacements and separations in the limit of small backward initial separation can be used to calculate the evolution of scalar fluctuations resulting from a known source function in space. Lagrangian calculations of the production and dissipation rates of the scalar fluctuations are shown to agree very well with Eulerian results for the case of passive scalars driven by a uniform mean gradient. Although the Eulerian–Lagrangian comparisons are made only for Schmidt numbers of order unity, the Lagrangian approach is more easily extended to both very low and very high Schmidt numbers. The well-known scalar dissipation anomaly is accordingly also addressed in a Lagrangian context.
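A minimal sketch of the molecular-trajectory construction described above (our own toy example: the velocity field, parameter values, and time stepping are placeholders, not the DNS or its interpolation scheme):

```cpp
// Sketch: a fluid-particle position update plus an independent Brownian increment
// with molecular diffusivity D = nu / Sc (Euler-Maruyama stepping).
#include <array>
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    const double nu = 1.0e-3;          // kinematic viscosity (illustrative value)
    const double Sc = 100.0;           // Schmidt number
    const double D  = nu / Sc;         // molecular diffusivity
    const double dt = 1.0e-3;          // time step
    std::mt19937 rng(42);
    std::normal_distribution<double> gauss(0.0, 1.0);

    // placeholder smooth velocity field, used only to make the sketch runnable
    auto velocity = [](const std::array<double, 3>& x) {
        return std::array<double, 3>{std::sin(x[1]), std::sin(x[2]), std::sin(x[0])};
    };

    std::array<double, 3> x{0.1, 0.2, 0.3};
    for (int step = 0; step < 1000; ++step) {
        const auto u = velocity(x);
        for (int i = 0; i < 3; ++i)
            x[i] += u[i] * dt + std::sqrt(2.0 * D * dt) * gauss(rng);  // drift + diffusion
    }
    std::printf("molecule position after 1000 steps: %f %f %f\n", x[0], x[1], x[2]);
    return 0;
}
```

Each molecule thus follows the local fluid velocity plus an independent Brownian increment, which is what allows both forward and backward dispersion statistics to be extracted in post-processing.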
GPU acceleration of a petascale application for turbulent mixing at high Schmidt number using OpenMP 4.5
This paper reports on the successful implementation of a massively parallel GPU-accelerated algorithm for the direct numerical simulation of turbulent mixing at high Schmidt number. The work stems from a recent development (Comput. Phys. Commun., vol. 219, 2017, pp. 313–328), in which a low-communication algorithm was shown to attain high degrees of scalability on the Cray XE6 architecture when overlapping communication and computation via dedicated communication threads. An even higher level of performance has now been achieved using OpenMP 4.5 on the Cray XK7 architecture, where on each node the 16 integer cores of an AMD Interlagos processor share a single Nvidia K20X GPU accelerator. In the new algorithm, data movements are minimized by performing virtually all of the intensive scalar field computations in the form of combined compact finite difference (CCD) operations on the GPUs. A memory layout departing from usual practice is found to provide much better performance for a specific kernel required to apply the CCD scheme. Asynchronous execution, enabled by adding the OpenMP 4.5 NOWAIT clause to TARGET constructs, improves scalability when used to overlap computation on the GPUs with computation and communication on the CPUs. On the 27-petaflops supercomputer Titan at Oak Ridge National Laboratory, USA, a GPU-to-CPU speedup factor of approximately 5 is consistently observed at the largest problem size of 8192^3 grid points for the scalar field, computed with 8192 XK7 nodes.
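To illustrate the OpenMP 4.5 mechanism mentioned above (a generic C++ sketch with our own toy loop; not the CCD kernels or the production code), the NOWAIT clause on a TARGET construct turns the offloaded region into a deferred task that can be overlapped with host-side work:

```cpp
// Generic sketch of asynchronous OpenMP 4.5 offload: the target region becomes a
// deferred task (nowait), the host does other work, and taskwait synchronizes.
#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20;
    std::vector<double> a(n, 1.0), b(n, 2.0);
    double* pa = a.data();
    double* pb = b.data();

    // Offload the heavy loop to the device and return immediately (nowait).
    #pragma omp target teams distribute parallel for nowait \
            map(tofrom: pa[0:n]) map(to: pb[0:n])
    for (int i = 0; i < n; ++i)
        pa[i] += 0.5 * pb[i];

    // ... host-side computation and communication overlapped with the device task ...

    #pragma omp taskwait   // wait for the deferred target task to complete
    std::printf("a[0] = %f\n", pa[0]);
    return 0;
}
```

Without the nowait clause the host would block until the target region completes, losing the overlap described above.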
Characteristics of backward and forward two-particle relative dispersion in turbulence at different Reynolds numbers
A dual communicator and dual grid-resolution algorithm for petascale simulations of turbulent mixing at high Schmidt number
A new dual-communicator algorithm with very favorable performance characteristics has been developed for direct numerical simulation (DNS) of turbulent mixing of a passive scalar governed by an advection-diffusion equation. We focus on the regime of high Schmidt number (Sc), where, because of low molecular diffusivity, the grid-resolution requirements for the scalar field are stricter than those for the velocity field by a factor of √Sc. Computational throughput is improved by simulating the velocity field on a coarse grid of Nv^3 points with a Fourier pseudo-spectral (FPS) method, while the passive scalar is simulated on a fine grid of Nθ^3 points with a combined compact finite difference (CCD) scheme which computes first and second derivatives at eighth-order accuracy. A static three-dimensional domain decomposition and a parallel solution algorithm for the CCD scheme are used to avoid the heavy communication cost of memory transposes. A kernel is used to evaluate several approaches to optimizing the performance of the CCD routines, which account for 60% of the overall simulation cost. On the petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign, scalability is improved substantially with a hybrid MPI-OpenMP approach in which a dedicated thread per NUMA domain overlaps communication calls with computational tasks performed by a separate team of threads spawned using OpenMP nested parallelism. At a target production problem size of 8192^3 (0.5 trillion) grid points on 262,144 cores, CCD timings are reduced by 34% compared to a pure-MPI implementation. Timings for 16384^3 (4 trillion) grid points on 524,288 cores encouragingly maintain scalability greater than 90%, although the wall clock time is too high for production runs at this size. Performance monitoring with CrayPat for problem sizes up to 4096^3 shows that the CCD routines can achieve nearly 6% of the peak flop rate. The new DNS code is built upon two existing FPS and CCD codes. With the grid ratio Nθ/Nv = 8, the disparity in the computational requirements for the velocity and scalar problems is addressed by splitting the global communicator MPI_COMM_WORLD into disjoint communicators for the velocity and scalar fields, respectively. Inter-communicator transfer of the velocity field from the velocity communicator to the scalar communicator is handled with discrete send and non-blocking receive calls, which are overlapped with other operations on the scalar communicator. For production simulations at Nθ = 8192 and Nv = 1024 on 262,144 cores for the scalar field, the DNS code achieves 94% strong scaling relative to 65,536 cores and 92% weak scaling relative to Nθ = 1024 and Nv = 128 on 512 cores.
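A minimal sketch of the dual-communicator setup (illustrative only; the 1:3 split of ranks below is an arbitrary assumption, not the paper's production configuration):

```cpp
// Illustrative sketch of the dual-communicator idea: MPI_COMM_WORLD is split into
// disjoint communicators for the velocity and scalar solvers via MPI_Comm_split.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // color 0 -> velocity group (first quarter of ranks), color 1 -> scalar group (the rest)
    const int color = (world_rank < world_size / 4) ? 0 : 1;
    MPI_Comm sub_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &sub_comm);

    int sub_rank, sub_size;
    MPI_Comm_rank(sub_comm, &sub_rank);
    MPI_Comm_size(sub_comm, &sub_size);
    std::printf("world rank %d -> %s communicator, rank %d of %d\n",
                world_rank, color == 0 ? "velocity" : "scalar", sub_rank, sub_size);

    // Velocity-field transfer between the groups would then use point-to-point calls
    // (e.g., MPI_Send on the velocity side and non-blocking MPI_Irecv on the scalar
    // side), overlapped with other operations on the scalar communicator, as above.

    MPI_Comm_free(&sub_comm);
    MPI_Finalize();
    return 0;
}
```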
