846 research outputs found

    Exploration of an adaptive merging scheme for optimal precipitation estimation over ungauged urban catchment

    Merging rain gauge and radar data improves the accuracy of precipitation estimation for urban areas. Since the rain gauge network around an ungauged urban catchment is fixed, the relevant question is the optimal merging area that produces the best rainfall estimate inside the catchment. Thus, an incremental radar-gauge merging was performed by gradually increasing the distance from the centre of the study area, the number of merging gauges around it and the radar domain. The proposed adaptive merging scheme is applied to a small urban catchment in West Yorkshire, Northern England, for 118 extreme events from 2007 to 2009. The performance of the scheme is assessed using four experimental rain gauges installed inside the study area. The results show that there is indeed an optimum radar-gauge merging area, and consequently an optimum number of rain gauges, that produces the best merged rainfall data inside the study area. Different merging methods produce different results for both classified and unclassified rainfall types. Although the scheme was applied to daily data, it is applicable to other temporal resolutions. This study is important for other work, such as urban flooding analysis, since it provides improved rainfall estimation for ungauged urban catchments.
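    A minimal sketch of the incremental merging search described above, assuming hypothetical helpers: merge_fn applies the chosen radar-gauge merging method to a gauge subset, and error_fn scores the merged field against the experimental gauges inside the catchment. It illustrates the search over merging radii only; it is not the study's code.

        import numpy as np

        def best_merging_radius(radii, gauges, centre, radar_field, merge_fn, error_fn):
            """Grow the merging area around the catchment centre and keep the radius
            (and gauge subset) whose merged field best matches the in-catchment gauges."""
            best_radius, best_score = None, np.inf
            for r in radii:
                # keep only the gauges within distance r of the catchment centre
                selected = [(xy, val) for xy, val in gauges
                            if np.hypot(xy[0] - centre[0], xy[1] - centre[1]) <= r]
                if not selected:
                    continue
                merged = merge_fn(radar_field, selected)  # e.g. mean field bias, KED, ...
                score = error_fn(merged)                  # e.g. RMSE at the four test gauges
                if score < best_score:
                    best_radius, best_score = r, score
            return best_radius, best_score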

    Improving rainfall nowcasting and urban runoff forecasting through dynamic radar-raingauge rainfall adjustment

    The insufficient accuracy of radar rainfall estimates is a major source of uncertainty in short-term quantitative precipitation forecasts (QPFs) and in the associated urban flood forecasts. This study examines the possibility of improving QPFs and urban runoff forecasts through the dynamic adjustment of radar rainfall estimates based on raingauge measurements. Two commonly used techniques, Kriging with External Drift (KED) and mean field bias correction, were used to adjust radar rainfall estimates for a large area of the UK (250,000 km²) based on raingauge data. QPFs were produced using the original radar and the adjusted rainfall estimates as input to a nowcasting algorithm. Runoff forecasts were generated by feeding the different QPFs into the storm water drainage model of an urban catchment in London. The performance of the adjusted precipitation estimates and the associated forecasts was tested using local rainfall and flow records. The results show that adjustments made at too large a scale cannot provide tangible improvements in rainfall estimates, or in the associated QPFs and runoff forecasts, at small scales such as those of urban catchments. Moreover, the results suggest that the KED-adjusted rainfall estimates may be unsuitable for generating QPFs, as this method damages the continuity of spatial structures between consecutive rainfall fields.
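    A minimal sketch of the mean field bias correction named above (KED is not shown): the bias factor is the ratio of accumulated gauge rainfall to accumulated radar rainfall at the gauge locations, applied uniformly to the radar field. The numbers below are illustrative placeholders, not data from the study.

        import numpy as np

        def mean_field_bias_adjust(radar_field, radar_at_gauges, gauge_values):
            """Scale the whole radar field by sum(gauges) / sum(radar at gauge locations)."""
            g = np.asarray(gauge_values, dtype=float)
            r = np.asarray(radar_at_gauges, dtype=float)
            bias = g.sum() / max(r.sum(), 1e-6)  # guard against an (almost) dry radar field
            return radar_field * bias

        # Hypothetical example: radar underestimates rainfall by a factor of about 1.4.
        radar_field = np.array([[2.0, 3.0], [1.0, 4.0]])  # mm, toy radar grid
        adjusted = mean_field_bias_adjust(radar_field,
                                          radar_at_gauges=[2.0, 3.0, 1.0],
                                          gauge_values=[2.8, 4.2, 1.4])
        print(adjusted)  # every pixel scaled by bias = 8.4 / 6.0 = 1.4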

    Sharing the instruction cache among lean cores on an asymmetric CMP for HPC applications

    High performance computing (HPC) applications have parallel code sections that must scale to large numbers of cores, which makes them sensitive to serial regions. Current supercomputing systems with heterogeneous or asymmetric CMPs (ACMP) combine a few high-performance big cores for serial regions with many low-power lean cores for throughput computing. The low demands that HPC applications place on the core front-end lead some designs, such as SMT and GPU cores, to share front-end structures including the instruction cache (I-cache). However, little work exists to analyze the benefit of sharing the I-cache among full cores, which is compelling as a way to reduce silicon area and power. This paper analyzes the performance, power and area impact of such a design on an ACMP with one high-performance core and multiple low-power cores. Having identified that multiple cores run the same code during parallel regions, we let the lean cores share the I-cache with the intent of benefiting from mutual prefetching, without increasing the average access latency. Our exploration of the design parameters finds the sweet spot to be a wide interconnect to access the shared I-cache plus a few line buffers, which together provide the bandwidth and latency required to sustain performance. Projections with McPAT and a rich set of HPC benchmarks show 11% area savings with a 5% energy reduction at no performance cost. The research was supported by the European Union's 7th Framework Programme [FP7/2007-2013] under project Mont-Blanc (288777), the Ministry of Economy and Competitiveness of Spain (TIN2012-34557, TIN2015-65316-P, and BES-2013-063925), the Generalitat de Catalunya (2014-SGR-1051 and 2014-SGR-1272), the HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
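    A toy illustration of the mutual-prefetching effect exploited above (not the paper's simulation setup): when lean cores execute the same code, a line brought in by one core's miss is already present when the other cores fetch it, so a shared I-cache sees far fewer cold misses than private per-core caches. Caches are modelled here as unbounded sets, so only compulsory misses are counted.

        def count_cold_misses(trace, n_cores, shared):
            """Count compulsory I-cache misses when n_cores fetch the same line trace."""
            if shared:
                lines, misses = set(), 0
                for addr in trace:
                    for _ in range(n_cores):        # cores fetch the same lines
                        if addr not in lines:
                            misses += 1
                            lines.add(addr)         # one core's fill serves the others
                return misses
            private, misses = [set() for _ in range(n_cores)], 0
            for addr in trace:
                for c in range(n_cores):
                    if addr not in private[c]:      # each core must miss on its own
                        misses += 1
                        private[c].add(addr)
            return misses

        trace = list(range(64)) * 4                       # hypothetical loop over 64 I-cache lines
        print(count_cold_misses(trace, 8, shared=False))  # 512 misses with private caches
        print(count_cold_misses(trace, 8, shared=True))   # 64 misses with a shared cache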

    Rebalancing the core front-end through HPC code analysis

    There is a need to increase performance within the same power and area envelope to reach exascale in high performance computing (HPC). Today's chip multiprocessor (CMP) designs are tailored to traditional desktop and server workloads, which differ from the parallel applications commonly run in HPC. In this work, we focus on HPC code characteristics and on the processor front-end, which accounts for around 30% of core power and area on the emerging lean-core type of processors used in HPC. Separating serial from parallel code sections inside applications, we characterize three HPC benchmark suites and compare them to a traditional set of desktop integer workloads. HPC applications have biased and mostly backward-taken branches, small dynamic instruction footprints, and long basic blocks. Our findings suggest smaller branch predictors (BP) with an additional loop BP, smaller branch target buffers (BTB), and smaller L1 instruction caches (I-cache) with wider lines. Still, the aforementioned downsizing applies only to the cores meant to run parallel code. The difference between serial and parallel code sections in HPC applications points to an asymmetric CMP design, with one baseline core for sequential code and many HPC-tailored cores designed for parallel code. Predictions using the Sniper simulator and McPAT show that an HPC-tailored lean core saves 16% of the core area and 7% of the power compared to a baseline core, without performance loss. Using the area savings to add an extra core, an asymmetric CMP with one baseline and eight tailored cores fits the same area budget as a symmetric CMP composed of eight baseline cores, while demanding 4% more power and providing 12% shorter execution time on average.
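    A back-of-envelope sketch of the area argument above (not the Sniper/McPAT methodology): with a tailored core needing roughly 16% less area than the baseline core, one baseline plus eight tailored cores fits within the area of eight baseline cores, and the extra core helps the parallel regions in an Amdahl-style model. The parallel fraction below is an assumed illustrative value, not a figure from the paper.

        BASELINE_AREA = 1.00
        TAILORED_AREA = 0.84   # 16% core-area saving reported above

        symmetric_area  = 8 * BASELINE_AREA                      # eight baseline cores
        asymmetric_area = 1 * BASELINE_AREA + 8 * TAILORED_AREA  # one baseline + eight tailored
        print(symmetric_area, asymmetric_area)                   # 8.0 vs 7.72

        def relative_exec_time(parallel_fraction, n_parallel_cores):
            """Amdahl-style execution time: serial part on one core, parallel part spread out."""
            return (1 - parallel_fraction) + parallel_fraction / n_parallel_cores

        p = 0.95  # assumed parallel fraction, for illustration only
        t_symmetric  = relative_exec_time(p, 8)   # parallel work on the eight baseline cores
        t_asymmetric = relative_exec_time(p, 9)   # assuming all nine cores join parallel regions
        print(f"illustrative speedup of asymmetric over symmetric: {t_symmetric / t_asymmetric:.2f}x")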