A Hierarchical Bayesian Approach to Adaptive Multi-Task Modeling
DOI:
https://doi.org/10.47134/ppm.v3i1.2079
Keywords:
Multi-Task Learning, Hierarchical Bayesian Models, Variational Inference, Adaptive Learning, Uncertainty Quantification, Relational Structure Learning
Abstract
Multi-task learning (MTL) aims to improve generalization by leveraging shared information across related tasks. However, conventional methods often rely on restrictive, pre-defined assumptions about task relationships, which limits their effectiveness in complex, heterogeneous environments. This paper introduces a Hierarchical Bayesian Model for Adaptive Multi-Task Learning (HB-MTL), a fully integrated probabilistic framework that learns the inter-task relationship structure directly from the data. By placing hyper-priors on the parameters of a shared task distribution, the model can flexibly capture positive, negative, and null correlations between tasks. We employ Variational Inference for tractable posterior approximation and validate the approach on a challenging synthetic benchmark, "MetroSim," designed to emulate the structural complexities of real-world systems. The results demonstrate that the model significantly outperforms a suite of strong baselines, particularly in its ability to leverage negative correlations and to avoid negative transfer from unrelated tasks. The framework not only yields superior predictive accuracy but also provides an interpretable map of the learned task structure and robust uncertainty quantification, making it a powerful tool for practical applications.
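To make the idea of a shared task distribution concrete, the sketch below illustrates how one task covariance matrix can encode positive, negative, and null relationships at once. It is an illustrative assumption, not the HB-MTL implementation: the abstract does not specify the prior, so a zero-mean Gaussian over per-task weights is assumed here, the covariance values are made up, the function names `sample_task_weights` and `empirical_task_correlation` are hypothetical, and no variational inference is performed.

```python
import numpy as np


def sample_task_weights(task_cov, n_features, rng):
    """Draw a weight matrix of shape (n_features, n_tasks).

    Assumption: each row (one feature's weights across all tasks) is
    drawn from N(0, task_cov), so task_cov couples the tasks.
    """
    n_tasks = task_cov.shape[0]
    chol = np.linalg.cholesky(task_cov)  # task_cov = chol @ chol.T
    z = rng.standard_normal((n_features, n_tasks))
    return z @ chol.T  # each row now has covariance task_cov


def empirical_task_correlation(weights):
    """Correlation between task weight vectors (columns of weights)."""
    return np.corrcoef(weights, rowvar=False)


rng = np.random.default_rng(0)

# Hypothetical relationship structure for four tasks:
# tasks 0 and 1 are positively related, task 2 is negatively related
# to both, and task 3 is unrelated to all others.
task_cov = np.array([
    [1.0, 0.8, -0.8, 0.0],
    [0.8, 1.0, -0.6, 0.0],
    [-0.8, -0.6, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

weights = sample_task_weights(task_cov, n_features=5000, rng=rng)
corr = empirical_task_correlation(weights)
print(np.round(corr, 2))
```

In a full hierarchical model the covariance would itself be given a hyper-prior and inferred from data rather than fixed by hand; here it is fixed only to show that a single matrix can express all three kinds of relationship.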




