BIOINFORMATICS


0622700047
DEPARTMENT OF INFORMATION AND ELECTRICAL ENGINEERING AND APPLIED MATHEMATICS
EQF7
COMPUTER ENGINEERING
2018/2019

YEAR OF COURSE 2
YEAR OF DIDACTIC SYSTEM 2017
SECOND SEMESTER
CFU    HOURS    ACTIVITY
4      32       LESSONS
2      16       EXERCISES
Objectives
This course introduces students to the technologies, methodologies and tools used to analyse the results produced by next-generation sequencing (NGS) infrastructures. Its main objective is for students to understand the theory behind the main algorithms for genome sequence analysis and to learn how to apply them to the new challenges of multi-genome management and analysis.

KNOWLEDGE AND UNDERSTANDING
During the course, students will acquire knowledge of: biological and bioinformatics resources and how to use them; how next-generation sequencing machines work and how to process their output; the software and architectural methodologies and techniques underlying the main algorithms for genomic data analysis; and how to use these algorithms for NGS applications (whole genomes, exomes, transcriptomes, etc.) and analysis.

APPLIED KNOWLEDGE AND UNDERSTANDING
Students will apply the acquired knowledge and skills directly by developing a project work on real cases. In particular, the projects will concern the development of pipelines for the analysis of data from tumoral cell lines, using the resources available in the department, in the Laboratory of Molecular Medicine and Genomics at the DIPMED and in the forthcoming Genome Research Center for Health on the Baronissi Campus.
Prerequisites
Although not formally required, it is strongly recommended that students have first attended the courses [0622900007] Elements of Biology and [0622900008] Elements of Medical Genetics and Genomics, in order to achieve the objectives of the course.
Contents
• Unit 1: Introduction to the Bioinformatics Environment (Th: 2h, Lab: 8h)
  o Introduction to the Linux Operating System
  o Introduction to Python
  o Introduction to R
• Unit 2: Biological and Bioinformatics Databases and Resources (Th: 4h, Ex: 2h)
  o Genome sequence databases (Ensembl, GenBank)
  o Protein sequence databases (UniProt, SwissProt)
  o Bioinformatics resources (UCSC Genome Browser, Galaxy)
  o Gene Ontology databases
  o Bioinformatics frameworks and tools for database, ontology and resource usage
• Unit 3: Sequence Alignment (Th: 4h, Ex: 2h)
  o Introduction to Sequence Alignment
  o Dynamic programming to compare DNA sequences
  o Application of Combinatorial Algorithms to analyze DNA Sequences
  o Common tools and frameworks for sequence alignment
• Unit 4: Genome Sequencing (Th: 10h, Ex: 8h)
  o Introduction to Genome Sequencing
  o Next Generation Sequencing technologies
  o Bioinformatics Algorithms (Algorithmic Warm-up and Randomized Algorithms)
  o Application of Graph Algorithms to Genome Assembly and Variant Analysis
  o Application of Euler's Theorem to Assemble Genomes
• Unit 5: Next Generation Sequencing Applications (Th: 6h, Ex: 4h)
  o Whole Genome Sequencing
  o Exome Sequencing
  o Transcriptomics
  o De Novo Sequencing
  o Metagenomics
  o Tools, environments and pipelines for NGS applications
• Unit 6: Next Generation Sequence Analysis (Th: 4h, Ex: 2h, Lab: 16h)
  o Introduction to Next Generation Sequencing data formats
  o Read-to-reference alignment algorithms
  o Bioinformatics methods involved in the analysis of large-scale datasets
  o Functional Analysis - Gene Ontology Enrichment Analysis
  o Genomic Data Science and Clustering
  o Bioinformatics Application Challenges for project work
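
As an illustration of the "Dynamic programming to compare DNA sequences" topic in the Sequence Alignment unit, the following minimal Python sketch computes a global alignment score in the Needleman-Wunsch style. The function name and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions and are not taken from the course material.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    # Row 0 of the DP table: aligning the empty prefix of a with b[:j]
    # costs j gaps. Only the previous row is kept to save memory.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

With these assumed scores, `global_alignment_score("ACGT", "AGT")` evaluates to 2: three matches and one gap. Recovering the alignment itself, rather than only its score, would require keeping the full table and tracing back from the last cell.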

Bioinformatics Application Challenges (BAC)
In order to directly apply the knowledge and competencies acquired during the course, students will be organised in groups to develop a bioinformatics pipeline that uses sequencing data from tumoral cell lines to carry out the following steps:
Case 1:
a. Quality check of the reads
b. Alignment of the reads against the reference genome
c. Identification of variants
d. Annotation of the variants
e. Functional analysis of the genes showing variants
Case 2:
a. Quality check of the reads
b. Alignment of the reads against the reference genome
c. Quantification of expressed genes
d. Identification of differentially expressed genes
e. Functional analysis
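
Step a, the quality check of the reads, is common to both cases and can be sketched in a few lines of Python. The sketch below is a hedged illustration only: it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and the mean-quality threshold of 20 are hypothetical choices, not requirements of the course. Project pipelines would normally rely on dedicated quality-control tools for this step.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").strip()
        if not header.startswith("@") or not seq or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed by default)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_reads(lines: Iterable[str],
                 min_mean_quality: float = 20.0) -> List[Tuple[str, str]]:
    """Keep reads whose mean Phred score reaches the (assumed) threshold."""
    return [(read_id, seq)
            for read_id, seq, qual in parse_fastq(lines)
            if mean_phred(qual) >= min_mean_quality]
```

Any iterable of lines can be passed in, for example an open file handle: `with open("reads.fastq") as fh: kept = filter_reads(fh)`, where `reads.fastq` is a placeholder file name.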

TOTAL (LECTURE 30h / EXERCISE/PRACTICE 18h / LABORATORY 24h)
Teaching Methods
The course (72h of lectures, exercises and laboratory activities) has a dynamic setting that includes the analysis of case studies with the active participation of the students, who will explore the use of NGS technologies and of genome analysis tools and frameworks while carrying out the project work. In particular, the teaching activities will include lectures (30h), exercises (18h) and laboratory sessions (24h) in which working groups develop the project. For the project work, students will apply their knowledge to choose independently the most appropriate technologies (frameworks, tools, etc.) to solve specific problems in the selected application domains. The educational activities will be supported by the DIEM e-learning platform (http://elearning.diem.unisa.it), which is used to facilitate and stimulate discussion and debate among students as well as to announce and distribute teaching materials.
Verification of learning
The final exam is designed to assess the overall knowledge and understanding of the concepts presented in the course, the ability to apply that knowledge to develop specific applications, and the ability to communicate and present the work carried out (communication skills). The examination consists of a practical part and an oral exam (interview). The practical part consists of a project work carried out in groups (2-4 students) on one of the two proposed BACs. The oral exam consists of a presentation of what was achieved during the project work. Each group member presents their own contribution to the project, together with a discussion of the bioinformatics tools and frameworks used and of the results achieved.
In the final evaluation, expressed as a mark out of 30, the practical part will weigh 65% and the oral exam 35%. Honours (30/30 cum laude) will be awarded to students who demonstrate full mastery of all the main methodological and technological aspects addressed in the course, of how they can be used to create applications and solutions in different application domains, and of the implications of their use.
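
The 65% / 35% weighting can be illustrated with a short Python sketch. It assumes that both parts are themselves marked out of 30 and applies no rounding, since the syllabus does not state a rounding rule; the function name is hypothetical.

```python
def final_mark(practical: float, oral: float) -> float:
    """Combine two marks out of 30 using the 65% / 35% weights.

    Assumption: each part is marked on the same 0-30 scale.
    No rounding is applied because the syllabus does not specify one.
    """
    for mark in (practical, oral):
        if not 0 <= mark <= 30:
            raise ValueError("marks are expected to lie between 0 and 30")
    return 0.65 * practical + 0.35 * oral
```

Under these assumptions, a practical mark of 28 and an oral mark of 30 combine to about 28.7 (0.65 × 28 + 0.35 × 30).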
Texts
COURSE BOOKS
COMPUTATIONAL METHODS FOR NEXT GENERATION SEQUENCING DATA ANALYSIS (MANDOIU I AND ZELIKOVSKY A) (2016)
BIOINFORMATICS ALGORITHMS - AN ACTIVE LEARNING APPROACH (3RD EDITION - 2018) PHILLIP COMPEAU & PAVEL PEVZNER.
SUGGESTED BOOKS AND LEARNING MATERIAL
HTTPS://EN.WIKIBOOKS.ORG/WIKI/NEXT_GENERATION_SEQUENCING_%28NGS%29
NEXT-GENERATION SEQUENCING DATA ANALYSIS (XINKUN WANG) (2014)

BIOINFORMATICS: A PRACTICAL HANDBOOK OF NEXT GENERATION SEQUENCING AND ITS APPLICATIONS (LLOYD LOW, MARTTI TAMMI) (2017)
More Information
The course language is English.
BETA VERSION. Data source: ESSE3 [Last synchronisation: 2019-10-21]