Academic Seminar: Near Unsupervised Learning (NUL) in the Context of Big Data

Posted: 2014-07-08

Speaker: Prof. Saman Halgamuge

Time: July 18, 13:00

Venue: 3-528, SEIEE


Abstract:

Unsupervised learning is used for analysing and clustering data when the expected cluster or class labels are not available, i.e., when we do not know what type of information is to be found. When we know only a little about the data labels, it is still challenging to draw conclusions from the data, yet this is the case for many real-world data mining problems. We call the learning algorithms useful in this scenario Near Unsupervised Learning (NUL). My group has been developing NUL algorithms based on the previously proposed Growing Self Organising Maps. The concept and algorithm development in NUL and its application to various biological data mining problems will be discussed. Some “unusual” features and signatures captured by my team will also be presented.

Real problems attempted using NUL include the following:

1) Metagenomics involves the challenging problem of clustering and eventually labelling genomic data of microbial species that cannot be easily grown in laboratories. We only know about 2% of these species found on Earth. Could this be the life form we expect to find on another planet? How do we understand and use some unique characteristics of microbes living in our environment and our body?

2) Analysing metabolite profiles of various wheat plants to understand how some types of plants can survive droughts is an area where NUL can provide good solutions.

3) Can we analyse signals coming from biological neural networks grown in wet labs to differentiate sick brain tissue from healthy tissue? Our collaborating researchers create mice with genetics-based brain diseases and analyse the brain tissues with and without the introduction of drugs. Which drugs (for example, drugs preventing epileptic attacks) are more effective on a particular type of sickness?

The following research papers summarise the application of these methods:

[1] C.D. Wijetunge, Z. Li, I. Saeed, J. Bowne, A.L. Hsu, U. Roessner, A. Bacic and S.K. Halgamuge, "Exploratory Analysis of High-Throughput Metabolomic Data", Metabolomics, 9(6), 1311-1320, Springer, 2013.

[2] I. Saeed, S.L. Tang and S.K. Halgamuge, "Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition", Nucleic Acids Research, Volume 40, Issue 5, Oxford University Press, 2012.

[3] I. Saeed and S.K. Halgamuge, "The oligonucleotide frequency derived error gradient and its application to the binning of metagenome fragments", BMC Genomics, 10(Suppl 3): S10, 2009.

[4] A.L. Hsu and S.K. Halgamuge, "Class structure visualization with semi-supervised growing self-organizing maps", Neurocomputing, 71(16-18), 3124-3130, Elsevier, 2008.

[5] C.K. Chan, A.L. Hsu, S.K. Halgamuge and S.L. Tang, "Binning Sequences Using Very Sparse Labels within a Metagenome", BMC Bioinformatics, 9:215, 28 April 2008.


Bio:

Prof. Saman Halgamuge is a Professor in the Department of Mechanical Engineering and the school-wide initiative on Biomedical Engineering, and Associate Dean (International) of the Melbourne School of Engineering, The University of Melbourne. He completed a B.Sc. in Engineering (Electronics and Telecommunications) at the University of Moratuwa, Sri Lanka, and went on to graduate with Dipl.-Ing. and PhD degrees in Electrical Engineering from the Technical University of Darmstadt, Germany.

Professor Halgamuge has an outstanding reputation for fundamental research in Big Data Analytics, Unsupervised Learning and Bio-inspired Optimization. His research has applications in Bioinformatics, Mechatronics and Sustainable Energy. He is the co-author of over 250 research publications, including 6 books, 20 book chapters and 80 journal papers, with over 4800 citations and an h-index of 28. He is listed among the most cited (top 1%) scientists of the last 10 years by ISI Essential Science Indicators. At Melbourne, he has supervised 27 PhD students to completion. He serves on the editorial boards of 6 journals, including the ACTA journal on Control and Intelligent Systems and BMC Bioinformatics. He has chaired 12 conferences and served as a member of about 75 conference program committees. His profile is at http://scholar.google.com.au/citations?sortby=pubdate&hl=en&user=9cafqywAAAAJ&view_op=list_works
