Tutorial on Bayesian optimization


Published: 2023-04-29

Page: 196-222

Loc Nguyen *

Loc Nguyen’s Academic Network, Vietnam.

*Author to whom correspondence should be addressed.


Machine learning forks into three main branches: supervised learning, unsupervised learning, and reinforcement learning, where reinforcement learning holds great potential for artificial intelligence (AI) applications because it solves real problems through a progressive process in which candidate solutions are improved and fine-tuned continuously. This progressive approach, which reflects an ability to adapt, suits the real world, where most events occur and change continuously and unexpectedly. Moreover, data is becoming too large for supervised learning and unsupervised learning to extract valuable knowledge from it all at once. Bayesian optimization (BO) models an optimization problem in probabilistic form as a so-called surrogate model and then directly maximizes an acquisition function built from that surrogate model, thereby implicitly and indirectly maximizing the target function to find the solution of the optimization problem. A popular surrogate model is the Gaussian process regression model. Maximizing the acquisition function relies on repeatedly updating the posterior probability of the surrogate model, which improves after every iteration. Exploiting an acquisition function, or utility function, is also common in decision theory, but the semantic meaning behind BO is that it solves problems by a progressive and adaptive approach, updating the surrogate model from a small piece of data at a time, in keeping with the ideology of reinforcement learning. BO is thus a reinforcement learning algorithm with many potential applications, and it is surveyed in this research with attention to its mathematical ideas. Moreover, the solution of optimization problems is important not only to applied mathematics but also to AI.
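The loop the abstract describes — fit a Gaussian process surrogate, maximize an acquisition function, evaluate the target, and update the posterior — can be sketched in a few dozen lines. The following is a minimal illustration, not the tutorial's own implementation: it hand-rolls a squared-exponential Gaussian process and the expected-improvement acquisition on a 1D grid, with a hypothetical quadratic objective standing in for the expensive black-box target; the length-scale and noise jitter are assumptions chosen for the example.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf_kernel(a, b, length=0.5):
    # squared-exponential (RBF) kernel between two 1D point sets
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    # GP posterior mean and standard deviation at the test points
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)   # k(x, x) = 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI acquisition for a maximization problem
    z = (mu - best) / sigma
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))  # standard normal CDF
    phi = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)          # standard normal PDF
    return (mu - best) * Phi + sigma * phi

def objective(x):
    # hypothetical black-box target with its maximum at x = 0.7
    return -(x - 0.7) ** 2

grid = np.linspace(0.0, 1.0, 200)      # candidate points for the acquisition
x_obs = np.array([0.1, 0.9])           # two initial observations
y_obs = objective(x_obs)
for _ in range(15):
    # update the surrogate posterior, then pick the acquisition maximizer
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y_obs.max()))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))
x_best = x_obs[np.argmax(y_obs)]       # best point found so far
```

Maximizing the acquisition over a dense grid keeps the sketch dependency-free; in practice the acquisition is maximized with a numerical optimizer, since evaluating it only needs the cheap surrogate, not the expensive target.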

Keywords: Bayesian optimization, Gaussian process regression, acquisition function, machine learning, reinforcement learning

How to Cite

Nguyen, L. (2023). Tutorial on Bayesian optimization. Asian Journal of Advances in Research, 6(1), 196–222. Retrieved from https://mbimph.com/index.php/AJOAIR/article/view/3467



