Current research activities focus on machine learning and data mining, natural language processing and text mining, and information retrieval and knowledge management.
Scientific and Engineering Applications
- Biological text mining. Development of a generic text mining tool for the biological domain. The BioMinT tool searches the online scientific literature and automatically extracts specifically targeted information about genes and proteins. This information is used to fill templates in order to (1) accelerate the annotation and updating of genomics and proteomics databases by partially automating them; and (2) generate readable, structured reports in response to queries from biological researchers and practitioners. This work is part of a European project involving the AI Lab, the Swiss Institute of Bioinformatics, and four other European partners.
- Proteomics. Application of data mining techniques to proteomics research issues such as the detection of biomarkers for diagnosis from protein mass spectra, the characterization of complex protein families, and the prediction of post-translational modifications and their impact on protein function.
- Retrieval and analysis of medical texts. Two types of biomedical text repositories are used: clinical records produced at the Hospital and biomedical digital libraries (MEDLINE). Current research projects include named-entity recognition in patient records and text mining for the prediction of adverse drug reactions.