UFES workshop: Applied bioinformatics in Health Sciences

Workshop details: 6 days (9am-5.30pm)




Kuan Rong Chan, Justin Ooi, Clara Koh

Workshop description:

This workshop introduces the fundamental concepts of bioinformatics and describes how they can be applied effectively in biomedical sciences and research. The code provided in the workshop will enable students to perform data table manipulation, raw data processing, data pre-processing and data visualisation. For those with a stronger foundation in the programming languages Python and R, we also provide code for more advanced data analysis, including pathway enrichment, machine learning and multi-omics integration. We hope that the lessons will serve as a primer that gets participants interested in bioinformatics and, eventually, applying these methods in their own research. Our relationship goes beyond the workshop, and we look forward to collaborating with all the participants!

Learning Outcomes:

Students will first learn the basics of two programming languages, Python and R, and see how these languages are used in health sciences and data analysis.

After learning the basics, students will be introduced to using Python and R to manipulate data tables and to perform data pre-processing and data visualisation. By the end, students should be comfortable using Python and R and able to execute basic commands confidently. We have also structured the workshop to include real case studies so that students can better see how this knowledge can help advance their research.

Finally, to be able to interpret omics datasets, students will learn how to use machine learning techniques and develop webtools to showcase their research.

The course will be divided into lectures and tutorials; during the tutorial sessions, students will get hands-on experience with the different bioinformatics tools.


As a researcher, it is critical to know how to obtain, analyse and interpret data from experiments. However, there are currently no graduate modules that cover the fundamental concepts of data analysis. This module should be of broad interest to most students pursuing a PhD in the sciences. The knowledge gained can be applied immediately to their research projects and will remain useful in their future careers, whether in industry or academia.

Detailed breakdown of the workshop topics:

Day 1

Main topics:

  1. Introduction to data science

  2. File naming best practices

  3. Introduction to Python and R

  4. Introduction to Pandas

  5. Applied biostatistics

Day 2

Main topics:

  1. Data pre-processing

  2. Outlier detection and management

  3. Best practices for experimental design

  4. Data visualisation

  5. Practical sessions for Python and R for data visualisation

  6. Case management and data analysis

Day 3

Main topics:

  1. Differential analysis

  2. Pathway enrichment and analysis

  3. Data management and democratisation

  4. Practical sessions for Python and R for pathway enrichment analysis

  5. Webtools for omics data analysis: How to use STAGEs

Day 4

Main topics:

  1. Machine learning

  2. Artificial intelligence

  3. Practical sessions for R to perform PCA and PLS-DA

  4. Group work: Understanding innate immune responses that govern adaptive immune responses

Day 5

Main topics:

  1. Analysing and integrating multiple big datasets

  2. Webtool development

  3. Practical sessions for R to perform MOFA

  4. Practical sessions for Python for webtool development

Day 6

Main topics:

  1. Group work and practical sessions on a data science project

  2. Decoupler Python code for RNA-seq analysis

  3. Student feedback and discussions

Mode of teaching and assessment

The lessons will be split into lectures and tutorials. In the tutorials, students will have hands-on sessions on how to download different data analysis tools and how to use them effectively. Hence, every student will need to bring a laptop to each lesson. Group work will allow students to interact and use Python to answer a biomedical question. The instructors will facilitate these sessions, and students will present their data analysis findings in class.