Dr Hien Duy Nguyen

ARC DECRA Research Fellow, Lecturer

College of Science, Health and Engineering

School of Engineering and Mathematical Sciences

Department of Mathematics and Statistics

PS2 - 213, Melbourne (Bundoora)

Qualifications

Bachelor of Economics (UQ), Bachelor of Science (Hons I; UQ), PhD (Statistics; UQ)

Role

Academic

Area of study

Computer Science
Mathematics and Statistics

Recent publications

  1. H.D. Nguyen (2018), Near universal consistency of the maximum pseudolikelihood estimator for discrete models, Journal of the Korean Statistical Society, to appear

  2. H.D. Nguyen and F. Chamroukhi (2018), An Introduction to the Practical and Theoretical Aspects of Mixture-of-Experts Modeling, WIREs Data Mining and Knowledge Discovery, to appear

  3. H.D. Nguyen and G.J. McLachlan (2018), Some theoretical results regarding the polygonal distribution, Communications in Statistics: Theory and Methods, to appear

  4. H.D. Nguyen and G.J. McLachlan (2018), Chunked-and-averaged estimators for vector parameters, Statistics and Probability Letters, to appear

  5. H.D. Nguyen, G.J. McLachlan, J.F.P. Ullmann, V. Voleti, W. Li, E.M.C. Hillman, D.C. Reutens, and A.L. Janke (2018), Whole-Volume Clustering of time series data from zebrafish brain calcium images via mixture modeling, Statistical Analysis and Data Mining, to appear

  6. P. Orban, C. Dansereau, L. Desbois, V. Mongeau-Perusse, C-E. Giguere, H. Nguyen, A. Mendrek, E. Stip, and P. Bellec (2018), Multisite generalizability of schizophrenia diagnosis classification based on functional brain connectivity, Schizophrenia Research, to appear

  7. L.R. Lloyd-Jones, H.D. Nguyen, and G.J. McLachlan (2018), A globally convergent algorithm for lasso-penalized mixture of linear regression models, Computational Statistics and Data Analysis, vol. 119, pp. 19-38

  8. H.D. Nguyen and A.T. Jones (2018), Big Data-Appropriate Clustering via Stochastic Approximation and Gaussian Mixture Models, Data Analytics: Concepts, Techniques and Applications, CRC Press

  9. G.J. McLachlan and H.D. Nguyen (2017), Contribution to the discussion of paper by M. Drton and M. Plummer, Journal of the Royal Statistical Society B, vol. 79, p. 365

  10. H.D. Nguyen (2017), A Novel Algorithm for Clustering of Data on the Unit Sphere via Mixture Models, in JSM Proceedings: Statistical Computing Section

  11. H.D. Nguyen (2017), A Two-Sample Kolmogorov-Smirnov-Like Test for Big Data, in Proceedings of the Fifteenth Australasian Data Mining Conference

  12. H.D. Nguyen (2017), An introduction to MM algorithms for machine learning and statistical estimation, WIREs Data Mining and Knowledge Discovery, vol. 7, e1198

  13. H.D. Nguyen and G.J. McLachlan (2017), Progress on a conjecture regarding the triangular distribution, Communications in Statistics: Theory and Methods, vol. 46, pp. 11261-11271

  14. H.D. Nguyen and G.J. McLachlan (2017), Iteratively-reweighted least-squares fitting of support vector machines: a majorization-minimization algorithm approach, in Proceedings of the 2017 Future Technologies Conference (FTC)

  15. H.D. Nguyen, G.J. McLachlan, and M.M. Hill (2017), Permutation tests with false discovery corrections for comparative-profiling proteomics experiments, in Methods in Molecular Biology: Proteomics Bioinformatics, Springer

  16. H.D. Nguyen, G.J. McLachlan, P. Orban, P. Bellec, and A.L. Janke (2017), Maximum pseudolikelihood estimation for model-based clustering of time-series, Neural Computation, vol. 29, pp. 990-1020

  17. C. Oyarzun, A. Sanjurjo, and H. Nguyen (2017), Response functions, European Economic Review, vol. 98, pp. 1-31

  18. L.R. Lloyd-Jones, H.D. Nguyen, G.J. McLachlan, W. Sumpton, and Y.-G. Wang (2016), Mixture of time dependent growth models with an application to blue swimmer crab length-frequency data, Biometrics, vol. 72, pp. 1255-1265

  19. H.D. Nguyen, L.R. Lloyd-Jones, and G.J. McLachlan (2016), A universal approximation theorem for mixture of experts models, Neural Computation, vol. 28, pp. 2585-2593

  20. A.T. Jones and H.D. Nguyen (2016), lowmemtkmeans: Low Memory Use Trimmed K-Means, The Comprehensive R Archive Network, URL: https://cran.r-project.org/web/packages/lowmemtkmeans

  21. H.D. Nguyen, L.R. Lloyd-Jones, and G.J. McLachlan (2016), A block minorization-maximization algorithm for heteroscedastic regression, IEEE Signal Processing Letters, vol. 23, pp. 1131-1135

  22. H.D. Nguyen and G.J. McLachlan (2016), Linear mixed models with marginally symmetric nonparametric random-effects, Computational Statistics and Data Analysis, vol. 106, pp. 151-169

  23. H.D. Nguyen and G.J. McLachlan (2016), Maximum likelihood estimation of triangular and polygonal distributions, Computational Statistics and Data Analysis, vol. 106, pp. 23-36

  24. H.D. Nguyen and I.A. Wood (2016), Asymptotic normality of the maximum pseudolikelihood estimator for fully-visible Boltzmann machines, IEEE Transactions on Neural Networks and Learning Systems, vol. 27, pp. 897-902

  25. H.D. Nguyen and I.A. Wood (2016), A block successive lower-bound maximization algorithm for the maximum pseudolikelihood estimation of fully visible Boltzmann machines, Neural Computation, vol. 28, pp. 485-492

  26. H.D. Nguyen, G.J. McLachlan, J.F.P. Ullmann, and A.L. Janke (2016), Spatial clustering of time-series via mixtures of autoregressions models and Markov random fields, Statistica Neerlandica, vol. 70, pp. 414-439

  27. H.D. Nguyen, G.J. McLachlan, J.F.P. Ullmann, and A.L. Janke (2016), Laplace mixture autoregressive models, Statistics and Probability Letters, vol. 110, pp. 18-24

  28. H.D. Nguyen and G.J. McLachlan (2016), Laplace mixture of linear experts, Computational Statistics and Data Analysis, vol. 93, pp. 177-191

  29. H.D. Nguyen, G.J. McLachlan, and I.A. Wood (2016), Mixtures of spatial spline regressions for clustering and classification, Computational Statistics and Data Analysis, vol. 93, pp. 76-85

  30. H.D. Nguyen and G.J. McLachlan (2015), Maximum likelihood estimation of Gaussian mixture models without matrix operations, Advances in Data Analysis and Classification, vol. 9, pp. 371-394

Older publications

  1. H.D. Nguyen (2015), NostalgiR: Advanced Text-Based Plots, The Comprehensive R Archive Network, URL: http://CRAN.R-project.org/package=NostalgiR

  2. H.D. Nguyen (2015), Finite mixture models for regression problems, In The University of Queensland (Research Higher Degrees) Theses Collection

  3. D. Chen, A. Shah, H. Nguyen, D. Loo, K. Inder, and M. Hill (2014), Online quantitative proteomics p-value calculator for permutation-based statistical testing of peptide ratios, Journal of Proteome Research, vol. 13, pp. 4184-4191

  4. L.R. Lloyd-Jones, H.D. Nguyen, Y-G. Wang, and M.F. O’Neill (2014), Improved estimation of size-transition matrices using tag-recapture data, Canadian Journal of Fisheries and Aquatic Sciences, vol. 71, pp. 1385-1394

  5. H.D. Nguyen and G.J. McLachlan (2014), Asymptotic inference for hidden process regression models, in Proceedings of the 2014 IEEE Statistical Signal Processing Workshop

  6. H.D. Nguyen, G.J. McLachlan, N. Cherbuin, and A.L. Janke (2014), False discovery rate control in magnetic resonance imaging studies via Markov random fields, IEEE Transactions on Medical Imaging, vol. 33, pp. 1735-1748

  7. H.D. Nguyen, A.L. Janke, N. Cherbuin, G.J. McLachlan, P. Sachdev, and K.J. Anstey (2013), Spatial false discovery rate control for magnetic resonance imaging studies, in Proceedings of the 2013 Digital Image Computing: Techniques and Applications (DICTA) conference

  8. K.L. Inder, Y.Z. Zheng, M.J. Davis, H. Moon, D. Loo, H. Nguyen, J.A. Clements, R.G. Parton, L.J. Foster, and M.M. Hill (2012), Expression of PTRF in PC-3 cells modulated cholesterol dynamics and actin cytoskeleton impacting secretion pathways, Molecular and Cellular Proteomics, vol. 11, M111.012245

  9. H.D. Nguyen, M.M. Hill, and I.A. Wood (2012), A robust permutation test for quantitative SILAC proteomics experiments, Journal of Integrated OMICS, vol. 2, pp. 80-93

  10. H.D. Nguyen and I.A. Wood (2012), Variable selection in statistical models using population-based incremental learning with applications to genome-wide association studies, in Proceedings of the 2012 IEEE Congress on Evolutionary Computation (CEC)

Research projects

ARC DE170101134: Feasible algorithms for big inference.

This project aims to develop algorithms for computationally-intensive statistical tools to analyse Big Data. Big Data is ubiquitous in science, engineering, industry and finance, but needs special machine learning to conduct correct inferential analysis. Computational bottlenecks make many tried-and-true tools of statistical inference inadequate. This project will develop tools including false discovery rate control, heteroscedastic and robust regression and mixture models, via Big Data-appropriate optimisation and composite-likelihood estimation. It will make open, well-documented, and accessible software available for the scalable and distributable analysis of Big Data. The expected outcome is a suite of scalable algorithms to analyse Big Data.

ARC DP180101192 (with Geoff McLachlan, UQ; and Sharon Lee, UQ): Classification methods for providing personalised and class decisions.

This project provides a novel approach to the clustering of multivariate samples on entities in a class that automatically matches the sample clusters across the entities, allowing for inter-sample variation between the samples in a class. The project aims to develop a widely applicable, mixture-model-based framework for the simultaneous clustering of multivariate samples with inter-sample variation in a class and for the matching of the clusters across the entities in the class. The project will use a statistical approach to automatically match the clusters, since the overall mixture model provides a template for the class. It will provide a basis for discriminating between different classes, in addition to identifying atypical data points within a sample and anomalous samples within a class. Key applications include biological image analysis and the analysis of flow cytometry data; flow cytometry is one of the fundamental research tools of the life sciences.