A Workload-Specific Memory Capacity Configuration Approach for In-Memory Data Analytic Platforms
We propose WSMC, a workload-specific memory capacity configuration approach
for Spark workloads. WSMC guides users in configuring memory capacity by
accurately predicting a workload's memory requirement under various input data
sizes and parameter settings. First, WSMC classifies in-memory computing
workloads into four categories according to their Data Expansion Ratio. Second,
WSMC establishes a memory requirement prediction model that takes into account
the input data size, the shuffle data size, the parallelism of the workload,
and the data block size. Finally, for each workload category, WSMC calculates
the shuffle data size in the prediction model in a workload-specific way. For
an ad-hoc workload, WSMC can profile its Data Expansion Ratio with small-sized
input data and decide which category the workload falls into. Users can then
determine an accurate configuration in accordance with the corresponding memory
requirement prediction. Through comprehensive evaluations with SparkBench
workloads, we found that, compared with the default configuration, a
configuration guided by WSMC can save over 40% of memory capacity with a slight
degradation in workload performance (only 5%), and that, compared with the
proper configuration found manually, a configuration guided by WSMC leads to
only a 7% increase in memory waste with a slight improvement in workload
performance (about 1%).
Production inventory policy under a discounted cash flow
This paper presents an extended production inventory model in which the production rate at any instant depends on the demand and the inventory level. The effects of the time value of money are incorporated into the model. The demand rate is a linear function of time over the scheduling period. The proposed model can assist managers in economically controlling production systems when a discounted cash flow is taken into account. A simple algorithm for computing the optimal production-scheduling period is developed. Several particular cases of the model are briefly discussed. Through a numerical example, sensitivity analyses are carried out to examine the effect of the parameters. Results show that the discount rate parameter and the inventory holding cost have a significant impact on the proposed model.
Link Clustering with Extended Link Similarity and EQ Evaluation Division.
Link Clustering (LC) is a relatively new method for detecting overlapping communities in networks. The basic principle of LC is to derive a transform matrix whose elements are the link similarities of neighbor links, computed with the Jaccard distance; it then applies hierarchical clustering to the transform matrix and uses a partition density measure on the resulting dendrogram to determine the cut level for the best community detection. However, the original link clustering method does not consider the link similarity of non-neighbor links, and the partition density tends to divide the network into many small communities. In this paper, an Extended Link Clustering method (ELC) for overlapping community detection is proposed. The improved method employs a new link similarity, Extended Link Similarity (ELS), to produce a denser transform matrix, and uses the maximum value of EQ (an extended measure of modularity quality) to cut the dendrogram optimally for better partitioning of the original network space. Since ELS uses more link information, the resulting transform matrix provides a superior basis for clustering and analysis. Further, by using the EQ value to find the best level at which to divide the hierarchical clustering dendrogram, we obtain communities that are more sensible and reasonable than those obtained by the partition density evaluation. Experiments on five real-world networks and on artificially generated networks show that the ELC method achieves higher EQ and In-group Proportion (IGP) values. Additionally, the communities are more realistic than those generated by either the original LC method or the classical CPM method.
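As an illustration of the neighbor-link similarity step described above, here is a minimal Python sketch. The abstract gives no formula, so the code assumes the commonly used inclusive-neighborhood Jaccard index: two links that share a node are compared by the overlap of the neighborhoods of their two unshared endpoints. The function names `inclusive_neighbors` and `link_similarity` and the toy edge list are hypothetical. It covers only the original LC similarity, not ELS or the EQ-based cut, and assumes a simple undirected graph with no self-loops.

```python
from itertools import combinations


def inclusive_neighbors(edges):
    """Map each node to the set containing itself and its neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs


def link_similarity(edges):
    """Jaccard similarity for every pair of links sharing exactly one node.

    Assumption: similarity is |n+(i) & n+(j)| / |n+(i) | n+(j)|, where i and j
    are the unshared endpoints and n+ is the inclusive neighborhood.
    """
    nbrs = inclusive_neighbors(edges)
    sims = {}
    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if len(shared) != 1:
            continue  # non-neighbor links get no entry, as in the original LC
        (i,) = set(e1) - shared
        (j,) = set(e2) - shared
        sims[(e1, e2)] = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
    return sims


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
sims = link_similarity(edges)
for pair, s in sims.items():
    print(pair, round(s, 2))
```

In this toy graph the links ("a", "b") and ("c", "d") share no node and so receive no similarity entry, which is the sparsity in the transform matrix that the ELS extension is described as addressing.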
