Optimizing I/O for Big Array Analytics
Big array analytics is becoming indispensable in answering important
scientific and business questions. Most analysis tasks consist of multiple
steps, each making one or multiple passes over the arrays to be analyzed and
generating intermediate results. In the big data setting, I/O optimization is
key to efficient analytics. In this paper, we develop a framework and
techniques for capturing a broad range of analysis tasks expressible in
nested-loop forms, representing them in a declarative way, and optimizing their
I/O by identifying sharing opportunities. Experimental results show that our
optimizer is capable of finding execution plans that exploit nontrivial I/O
sharing opportunities with significant savings.
Comment: VLDB201
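As a rough illustration of the I/O sharing idea in the abstract above, the sketch below compares two analysis steps that each scan an array with a single fused scan that serves both. It is not the paper's optimizer: `ChunkedArray`, `separate_passes`, and `shared_pass` are hypothetical names, and the count of chunk reads is only an assumed proxy for I/O cost.

```python
from typing import Iterator, List, Tuple


class ChunkedArray:
    """Hypothetical stand-in for a disk-resident array read chunk by chunk."""

    def __init__(self, data: List[float], chunk_size: int) -> None:
        self.data = data
        self.chunk_size = chunk_size
        self.reads = 0  # number of chunk reads, used here as a proxy for I/O

    def chunks(self) -> Iterator[List[float]]:
        for start in range(0, len(self.data), self.chunk_size):
            self.reads += 1
            yield self.data[start:start + self.chunk_size]


def separate_passes(arr: ChunkedArray) -> Tuple[float, float]:
    """Two steps, each making its own pass over the array."""
    total = 0.0
    for chunk in arr.chunks():
        total += sum(chunk)
    total_sq = 0.0
    for chunk in arr.chunks():
        total_sq += sum(x * x for x in chunk)
    return total, total_sq


def shared_pass(arr: ChunkedArray) -> Tuple[float, float]:
    """The same two steps sharing one pass, so each chunk is read once."""
    total = 0.0
    total_sq = 0.0
    for chunk in arr.chunks():
        total += sum(chunk)
        total_sq += sum(x * x for x in chunk)
    return total, total_sq
```

Both functions return the same sums, but the shared plan reads each chunk once instead of twice, which is the kind of saving an I/O-aware plan would aim for.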
Bayesian Matrix Completion via Adaptive Relaxed Spectral Regularization
Bayesian matrix completion has been studied based on a low-rank matrix
factorization formulation with promising results. However, little work has been
done on Bayesian matrix completion based on the more direct spectral
regularization formulation. We fill this gap by presenting a novel Bayesian
matrix completion method based on spectral regularization. In order to
circumvent the difficulties of dealing with the orthonormality constraints of
singular vectors, we derive a new equivalent form with relaxed constraints,
which then leads us to design an adaptive version of spectral regularization
feasible for Bayesian inference. Our Bayesian method requires no parameter
tuning and can infer the number of latent factors automatically. Experiments on
synthetic and real datasets demonstrate encouraging results on rank recovery
and collaborative filtering, with notably good results for very sparse
matrices.
Comment: Accepted to AAAI 201
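For orientation, the two formulations contrasted in the abstract above are commonly written as follows. The abstract does not give the authors' exact objective, so the notation here (observed matrix $X$, estimate $Z$, projection $P_\Omega$ onto observed entries, weight $\lambda$, factors $A$ and $B$) is assumed and follows the standard non-Bayesian form of nuclear-norm regularization.

```latex
% Spectral (nuclear-norm) regularization: penalize the sum of singular values
\min_{Z}\; \tfrac{1}{2}\,\lVert P_\Omega(X - Z)\rVert_F^2 + \lambda\,\lVert Z\rVert_*,
\qquad \lVert Z\rVert_* = \sum_i \sigma_i(Z)

% Link to the low-rank factorization formulation
\lVert Z\rVert_* = \min_{A,B \,:\, Z = AB^\top} \tfrac{1}{2}\left(\lVert A\rVert_F^2 + \lVert B\rVert_F^2\right)
```

The second identity is why penalizing the Frobenius norms of the factors is regarded as an indirect route to the same regularizer, and why working on the singular values is described as the more direct one.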
DeepCity: A Feature Learning Framework for Mining Location Check-ins
The extension of online social networks into geographical space has produced a
large amount of user check-in data. Understanding check-ins can help to build
appealing applications, such as location recommendation. In this paper, we
propose DeepCity, a feature learning framework based on deep learning, to
profile users and locations, with respect to user demographic and location
category prediction. Both predictions are essential for social network
companies to increase user engagement. The key contribution of DeepCity is a
task-specific random walk, which uses the location and user
properties to guide the feature learning to be specific to each prediction
task. Experiments conducted on 42M check-ins in three cities collected from
Instagram show that DeepCity achieves superior performance and
significantly outperforms the baseline models.
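The abstract above does not say how the task-specific random walk is biased, so the following is only a generic sketch of a property-guided walk on a user-location graph. The function name `task_weighted_walk`, the `bias` parameter, and the rule of favoring neighbors that share the start node's property are all assumptions made for illustration, not DeepCity's actual procedure.

```python
import random
from typing import Dict, List, Optional


def task_weighted_walk(
    graph: Dict[str, List[str]],
    labels: Dict[str, str],
    start: str,
    length: int,
    bias: float,
    rng: random.Random,
) -> List[str]:
    """Random walk that favors neighbors sharing the start node's property.

    graph  : adjacency lists over user and location nodes (hypothetical data)
    labels : task-relevant property per node, e.g. a location category
    bias   : assumed weight (>= 1) given to neighbors with a matching property;
             bias == 1 reduces to a uniform random walk
    """
    target: Optional[str] = labels.get(start)
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        weights = [
            bias if target is not None and labels.get(n) == target else 1.0
            for n in neighbors
        ]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

Walks generated this way could then be passed to a skip-gram style embedding model, with one set of walks per prediction task; that downstream step is likewise an assumption here.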
