Computational Biology

Name of the subject in Hungarian: Számítógépes biológia

Last updated: April 9, 2010

Budapest University of Technology and Economics
Faculty of Electrical Engineering and Informatics
Computer Engineering program

BSc program

 

Course ID   Semester   Assessment   Credit   Course semester
VIMIA076               3/1/0/v      4
4. Instructors

 

Dr. Falus András, professor, Department of Genetics, Cell- and Immunobiology, Semmelweis University (SOTE)

Dr. Antal Péter, assistant professor, Department of Measurement and Information Systems

 

7. Objectives, learning outcomes and obtained knowledge

The first part of this course covers a variety of topics raised by contemporary biology and medicine. Topics in the first 3 weeks will include: the genetic and epigenetic code, organisation of the genome, hierarchic and non-hierarchic regulation of life at the cell, organ, organism and population levels, genomics, proteomics, metabolomics, evolution, genealogy, omics databases, principles of systems biology, high-throughput molecular technologies and gene diagnostics. The next four weeks connect these basic aspects with a quantitative, personalized and predictive approach to cancer, cardiovascular diseases (e.g. hypertension and myocardial infarction), neurodegenerative diseases (e.g. Alzheimer's disease and multiple sclerosis), metabolic diseases (e.g. diabetes and obesity), arthritis, asthma, AIDS and flu epidemics, as well as vaccination, drug development and gene therapy. A separate chapter will cover the principles of stem cell research (embryonic, tissue and induced pluripotent stem cells).

 

This part of the course can serve as preparation for a course on computational science in biomedicine and for handling very large biological data sets.

 

 

The second part of the course introduces the statistical, algorithmic, knowledge representation, and information technology aspects of bioinformatics, with particular emphasis on medical genomics and personalized medicine. We discuss the characteristics of high-throughput data, the basics of Bayesian decision theory, the foundations of graphical models including Markov networks and Bayesian networks, the modern conceptualization of causation, and the corresponding computer-intensive statistical methods. We survey multiple methods of gene expression data analysis, including dimensionality reduction techniques, clustering, biclustering, Markov network learning, subgraph learning, module network learning, and gene prioritization. Finally, we discuss methods in statistical genetics, including linkage analysis, candidate gene and whole genome association, and the principles of adaptive study design.

 

8. Synopsis

Part 1. Highlights in experimental biology and medicine – András Falus

 

1.       Introduction to the systems biology concept; basics of molecular cell biology, genetics, genomics/proteomics/metabolomics and epigenetics; evolution at various levels.

 

2.       High-throughput technologies, PCR, SNP, CGH, expression arrays.

 

3.       Omics databases, data mining, systems biology integration. Bioinformatics as an integrating tool set.

 

4.       Physiological and pathological challenges for bioinformatics I. (cancer, infection, inflammation, allergy)

 

5.       Physiological and pathological challenges for bioinformatics II. (arthritis, neurodegenerative syndromes, cardiovascular and metabolic diseases)

 

6.       Physiological and pathological challenges for bioinformatics III. (stem cell research, vaccination, drug development)

 

Part 2. Bioinformatics – Péter Antal

 

7.        Biomedical data and knowledge

 

Gene expression measurement, genotyping, and sequencing technologies. Gene regulation, gene regulatory networks, biological pathways. Standards for high-throughput data. Medical data, clinical coding systems, clinical informatics. Ontologies for biological knowledge and data.

 

8.       Decision theory

 

The Bayesian decision problem, Bayesian decision. Bayes error. Bayes factor. Utility theory, general loss functions, clinical utilities (micromort, quality-adjusted life years). Quality of prediction: the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). Decision support systems. Sequential decisions, Markov decision processes, and the optimal policy. Value of further information in medical diagnosis. Adaptive study design and the multi-armed bandit problem, active learning, budgeted learning in pharmacology.
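As an illustration of the quality-of-prediction topic listed above, the following is a minimal sketch of how the area under the ROC curve can be computed from its probabilistic interpretation: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the example labels and scores are hypothetical and are not part of the course material.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive case).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three positive and three negative cases.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for small examples; a rank-based computation would be preferable for large data sets.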

 

9.       Graphical models

 

The graphoid axioms. Markov networks. Bayesian networks: soundness, completeness, observational equivalence classes of directed acyclic graphs, stable distributions. Causal interpretation with and without hidden variables. Biomedical criterion of causality, elimination of confounding. Structural equations. The calculus of interventions. Dynamic Bayesian networks. Decision networks.

 

10.     Computer-intensive statistics

 

The bootstrap approach. Permutation tests. Monte Carlo methods with graphical models: DAG-MCMC, ordering-MCMC, reversible jump MCMC. Analysis of incomplete data with MCMC methods.
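To make the permutation test topic above concrete, here is a minimal sketch of a two-sample permutation test for a difference in means, as one might apply to expression values measured in two groups. The function name `permutation_test`, its parameters, and the example values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Returns an estimated p-value: the fraction of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)            # random relabeling of the samples
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two clearly separated groups.
cases = [10, 11, 12, 13, 14]
controls = [1, 2, 3, 4, 5]
p_value = permutation_test(cases, controls)
print(f"estimated p-value: {p_value:.4f}")
```

The test makes no distributional assumption beyond exchangeability of the samples under the null hypothesis, which is why it is grouped with the computer-intensive methods.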

 

11.     Gene expression data analysis

 

Image processing. Feature extraction. Normalization. Distance, variance, and topology preserving mapping. Clustering. Prediction. The conditional Bayesian approach. The feature subset selection problem. Biclustering. Module network learning. Gene prioritization. Gene annotation and gene set based analysis. Pathway analysis.

 

12.     Genetic association studies

 

Linkage analysis, family-based association. Candidate gene and whole genome association. Study design: gene prioritization and adaptive study design. Image processing and feature extraction. Haplotype reconstruction methods. Univariate methods. Multivariate methods: logistic regression, Bayesian networks.

 

 

9. Method of instruction

3 lectures and 1 exercise session per week.

10. Assessment

In Part 1 the students are required to demonstrate solid knowledge of basic systems biology and to design a realistic research plan in biomedical science or pharmacogenomics. Students are expected to give a PowerPoint presentation. A critical element of the grading will be how convincing the presentation is.

 

The written exam and the presentations will demonstrate the students’ ability to build a convincing logical argument for different audiences. The students will also be exposed to simulated conflict situations by receiving comments from the audience.

 

Grading will be based on the following criteria:

 

- Evaluation of basic knowledge                        40 points
- Clarity of approach to the experimental problem      20 points
- Presentation skills                                  20 points
- Class participation and activity in the sessions     20 points

 

 

In Part 2 the students are required to prepare a report of their data analysis and present their work in a live PowerPoint presentation.

 

A critical element of the grading will be the presentation and the quality of the analysis.

 

Grading will be based on the following criteria:

 

- Problem formulation, goals of the data analysis      20 points
- Statistical aspects                                  20 points
- Computational aspects                                20 points
- Interpretation, including presentation               20 points
- Embedding and application                            20 points

 

13. References, textbooks and resources

Part 1:

- Alberts B. et al., Molecular Biology of the Cell.
- Falus A. (ed.), Immunogenomics and Human Disease, John Wiley & Sons, Ltd., 2005.
- Falus A. (ed.), Clinical Applications of Immunomics, Springer, 2008.
- Lodish H. et al., http://bcs.whfreeman.com/lodish5e/default.asp?s=&n=&i=&v=&o=&ns=0&t=&uid=0&rau=0
- Malcolm A. et al., Genomics, Proteomics and Systems Biology, www.bio.davidson.edu/genomics

 

 

Part 2:

- Pierre Baldi, Søren Brunak, Bioinformatics: The Machine Learning Approach, Second Edition (Adaptive Computation and Machine Learning).

Further recommended reading:

- D. Siegmund, B. Yakir, The Statistics of Gene Mapping, Springer, 2007.
- Pierre Baldi, DNA Microarrays and Gene Expression: From Experiments to Data Analysis and Modeling.

 

14. Required learning hours and assignment
Contact hours                                  56
Preparation for classes during the semester    14
Preparation for midterm test
Homework                                       30
Study of assigned written material
Exam preparation                               20
Total                                          120
15. Syllabus prepared by
Dr. Falus András, professor, Department of Genetics, Cell- and Immunobiology, Semmelweis University (SOTE)

Dr. Antal Péter, assistant professor, Department of Measurement and Information Systems