UFES workshop: Applied bioinformatics in Health Sciences
Workshop details: 6 days (9am-5.30pm)
Prerequisites:
None
Instructors:
Kuan Rong Chan, Justin Ooi, Clara Koh
Workshop description:
This workshop introduces the fundamental concepts of bioinformatics and describes how they can be used effectively in biomedical sciences and research. The code provided in the workshop will enable students to perform data table manipulation, raw data processing, data pre-processing and data visualisation. For those with a stronger foundation in the programming languages Python and R, we have also provided code for more advanced data analysis, including pathway enrichment, machine learning and multi-omics integration. We hope that the lessons will serve as a primer to get participants interested in bioinformatics, and that participants will eventually apply what they learn in their research. Our relationship goes beyond the workshop, and we look forward to collaborating with all the participants!
Learning Outcomes:
Students will first learn the basics of two programming languages, Python and R, and appreciate how these languages are used in health sciences and data analysis.
After learning the basics, students will be introduced to using Python and R to manipulate data tables and to perform data pre-processing and data visualisation. By the end, students should be comfortable using Python and R and able to execute basic commands confidently. We have also structured the workshop to include real case studies so that students can better see how this knowledge can help advance their research.
Finally, to be able to interpret omics datasets, students will learn how to use machine learning techniques and how to develop webtools to showcase their research.
The course will be divided into lectures and tutorials, and students will get hands-on experience with the different bioinformatics tools during the tutorial sessions.
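As a rough illustration of the kind of data table manipulation and pre-processing mentioned in the learning outcomes, here is a minimal Python sketch. It is not taken from the workshop materials: the gene names, count values and helper function names are all hypothetical, and it uses only the Python standard library rather than Pandas so that it runs without extra installs. The outlier rule shown (a modified z-score based on the median and the median absolute deviation) is one common choice, assumed here for illustration only.

```python
import math
import statistics

# Hypothetical expression table: gene -> raw counts across four samples.
raw_counts = {
    "GENE_A": [10, 12, 11, 9],
    "GENE_B": [200, 180, 220, 1500],
    "GENE_C": [0, 1, 0, 2],
}


def log2_transform(values, pseudocount=1):
    """Log2-transform counts, adding a pseudocount to avoid log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so nothing can be called an outlier.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Pre-process every gene, then report which samples look unusual.
for gene, counts in raw_counts.items():
    transformed = log2_transform(counts)
    flags = flag_outliers(transformed)
    print(gene, [round(x, 2) for x in transformed], flags)
```

With these made-up numbers, only the fourth sample of GENE_B is flagged. In the workshop itself, the equivalent steps would more likely be done on a Pandas DataFrame or an R data frame.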
Rationale:
As a researcher, it is critical to know how to obtain, analyse and interpret data from experiments. However, there are currently no graduate modules that cover the fundamental concepts of data analysis. This module should be of broad interest to most students who are pursuing a PhD in the sciences. The knowledge gained can be applied immediately to their research projects, and will remain useful in their future careers, whether in industry or academia.
Detailed breakdown of the workshop topics:
Day 1
Main topics:
Introduction to data science
File naming best practices
Introduction to Python and R
Introduction to Pandas
Applied biostatistics
Day 2
Main topics:
Data pre-processing
Outlier detection and management
Best practices for experimental design
Data visualisation
Practical sessions in Python and R for data visualisation
Case management and data analysis
Day 3
Main topics:
Differential analysis
Pathway enrichment and analysis
Data management and democratisation
Practical sessions in Python and R for pathway enrichment analysis
Webtools for omics data analysis: How to use STAGEs
Day 4
Main topics:
Machine learning
Artificial intelligence
Practical sessions in R to perform PCA and PLS-DA
Group work: Understanding innate immune responses that govern adaptive immune responses
Day 5
Main topics:
Analysing and integrating multiple big datasets
Webtool development
Practical sessions in R to perform MOFA
Practical sessions in Python for webtool development
Day 6
Main topics:
Group work and practical sessions on a data science project
decoupler Python code for RNA-seq analysis
Student feedback and discussions
Mode of teaching and assessment:
The lessons will be split into lectures and tutorials. In the tutorials, students will have hands-on sessions on how to download different data analysis tools and how to use them effectively, so every student will need to bring a laptop to each lesson. Group work will allow students to interact and use Python to address a biomedical question. The instructors will facilitate these sessions, and students will present their data analysis findings in class.