Monte Carlo Methods and Their Applications in Big Data Analysis

Due to recent emergence of many big data problems, Monte Carlo methods tend to become powerful tools for analyzing big data sets. In this chapter, we first review the fundamental principles of Monte Carlo methods. Then, we describe several popular variance reduction techniques, including stratified sampling, control variates, antithetic variates, and importance sampling, to improve Monte Carlo sampling efficiency. Finally, application examples of estimation of sum, Monte Carlo linear solver, image recovery, matrix multiplication, and low-rank approximation are shown as case studies to demonstrate the effectiveness of Monte Carlo methods in data analysis.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic €32.70 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (France)

eBook EUR 96.29 Price includes VAT (France)

Softcover Book EUR 126.59 Price includes VAT (France)

Hardcover Book EUR 126.59 Price includes VAT (France)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Statistical Leveraging Methods in Big Data

General Introduction to Monte Carlo and Multi-level Monte Carlo Methods

Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation

References

M. Hilbert, P. López, The world’s technological capacity to store, communicate, and compute information. Science 332, 60–65 (2011) ArticleGoogle Scholar
R.L. Burden, J.D. Faires, Numerical Analysis (Brooks/Cole, Cengage Learning, Boston, 2011) MATHGoogle Scholar
J.M. Hammersley, D.C. Handscomb, Monte Carlo Methods (Chapman and Hall, Methuen & Co., London, and Wiley, New York, 1964) BookMATHGoogle Scholar
J.S. Liu, Monte Carlo Strategies in Scientific Computing (Springer, New York, 2008) MATHGoogle Scholar
Y. Li, M. Mascagni, Grid-based Monte Carlo application, in Grid Computing Third International Workshop/Conference (2002), pp. 13–24 Google Scholar
Y. Li, M. Mascagni, Analysis of large-scale grid-based Monte Carlo applications. Int. J. High Perform. Comput. Appl. 17, 369–382 (2003) ArticleGoogle Scholar
G.E. Forsythe, R.A. Leibler, Matrix inversion by a Monte Carlo method. Math. Comput. 4, 127–129 (1950) ArticleMathSciNetGoogle Scholar
H. Ji, Y. Li, Reusing random walks in Monte Carlo methods for linear systems, in Proceedings of International Conference on Computational Science, vol. 9 (2012), pp. 383–392 Google Scholar
H. Ji, M. Mascagni, Y. Li, Convergence analysis of Markov chain Monte Carlo linear solvers using Ulam–von Neumann Algorithm. SIAM J. Numer. Anal. 51, 2107–2122 (2013) ArticleMathSciNetMATHGoogle Scholar
I.T. Dimov, B. Philippe, A. Karaivanova, C. Weihrauch, Robustness and applicability of Markov chain Monte Carlo algorithms for eigenvalue problems. Appl. Math. Model. 32, 1511–1529 (2008) ArticleMathSciNetMATHGoogle Scholar
A. Weber, SIPI image database, The USC-SIPI Image Database, Signal & Image Processing Institute, Department of Electrical Engineering, Viterbi School of Engineering, Univ. of Southern California (2012), http://sipi.usc.edu/database
J.F. Cai, E.J. Candès, Z. Shen, A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20, 1956–1982 (2010) ArticleMathSciNetMATHGoogle Scholar
P. Drineas, R. Kannan, M.W. Mahoney, Fast Monte Carlo algorithms for matrices I: approximating matrix multiplication. SIAM J. Comput. 36, 132–157 (2006) ArticleMathSciNetMATHGoogle Scholar
S. Eriksson-Bique, M. Solbrig, M. Stefanelli, S. Warkentin, R. Abbey, I.C. Ipsen, Importance sampling for a Monte Carlo matrix multiplication algorithm, with application to information retrieval. SIAM J. Sci. Comput. 33, 1689–1706 (2011) ArticleMathSciNetMATHGoogle Scholar
G.H. Golub, C.F. Van Loan, Matrix Computations (Johns Hopkins University Press, Baltimore, 2012) MATHGoogle Scholar
N. Halko, P.G. Martinsson, J.A. Tropp, Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53, 217–288 (2011) ArticleMathSciNetMATHGoogle Scholar
M.W. Mahoney, Randomized algorithms for matrices and data. Found. Trends Mach. Learn. 3, 123–224 (2011) MATHGoogle Scholar
H. Ji, Y. Li, GPU accelerated randomized singular value decomposition and its application in image compression, in Proceedings of Modeling, Simulation, and Visualization Student Capstone Conference, Sulfolk (2014) Google Scholar

Acknowledgements

This work is partially supported by NSF grant 1066471 for Yaohang Li and Hao Ji acknowledges support from ODU Modeling and Simulation Fellowship.

Author information

Authors and Affiliations

Department of Computer Science, Old Dominion University, Norfolk, VA, 23529, USA Hao Ji & Yaohang Li

Hao Ji