
    Data Science - Part 2

    Name of the subject in Hungarian: Adatbányászat - 2

    Last updated: 10 November 2010

    Budapest University of Technology and Economics
    Faculty of Electrical Engineering and Informatics

    Computer Engineering program

    BSc program

    Course ID   Semester   Assessment   Credit   Subject semester
    VISZA084               0/0/2/f      2
    3. Course coordinator and department: Dr. Gyula Katona
    Web page of the course: www.cs.bme.hu/....
    4. Instructors

    Name                 Position           Department
    András A. BENCZÚR    Lecturer           Department of Computer Science and Information Theory
    András LUKÁCS        Lecturer           Department of Computer Science and Information Theory
    Gyula Y. KATONA      Assoc. professor   Department of Computer Science and Information Theory
    Gábor WIENER         Assoc. professor   Department of Computer Science and Information Theory

    5. Required knowledge

    The course requires basic knowledge of data mining (see also the course Data Mining: Models and Algorithms). A background in probability theory and linear algebra is important. Knowledge of combinatorics and algorithms is an advantage.

    May be studied in the same semester as "Data Mining - Part 1".

    7. Objectives, learning outcomes and obtained knowledge

    The aim of the course is to discuss advanced data mining techniques, together with useful knowledge from related disciplines, in support of real-world data mining projects, especially in bioinformatics. By the end of the course, students will be able to analyze biological (genomic, microarray, pathway, protein, chemical) data sets using complex data mining methods.

    8. Synopsis

    1.  Advanced classification methods: Bagging, boosting, AdaBoost.
    2.  Random forest. Implementation of models in WEKA.
    3.  Support Vector Machine. Kernel methods, graph kernels. Protein function prediction.
    4.  Similarity measures, fingerprint based similarity search. Sketches.
    5.  Dimensionality reduction by spectral methods, singular value decomposition, low-rank approximation.
    6.  Spectral clustering, bi-clustering for microarrays.
    7.  Mixture models. Maximum likelihood estimators, EM-algorithm.
    8.  Gaussian mixture models. Midterm test.
    9.  Search engines, web information retrieval, PageRank and beyond.
    10. Rank learning for Protein Structure Prediction.
    11. Text mining, natural language processing. Building databases and networks from PubMed and BioMed Central.
    12. Graph mining algorithms. Frequent subgraph mining in microarray-based co-expression networks.
    13. Semi-supervised classification of network data, graph stacking in biological networks.
    14. Feature selection methods for unbalanced data sets. Final test.

     

    9. Method of instruction

    Handouts, PowerPoint presentations, relevant research papers, web page, course mailing list and Wiki. Regular weekly office hours for consultation.

    10. Assessment

    Case study on a practical problem: choosing the model and the solution method, and implementing the algorithm.

    Grading principles:

    Model:             40%
    Solution method:   40%
    Implementation:    20%

     

    12. Consultations

    You can reach the instructor at the following e-mail address for consultation:

    András A. Benczúr: benczur@ilab.sztaki.hu

     

    13. References, textbooks and resources

     

    14. Required learning hours and assignment

    Number of contact hours       28
    Preparation for the classes   12
    Preparation for the tests
    Homework                      20
    Assigned reading
    Preparation for the exam
    Total                         60

    15. Syllabus prepared by

    Name                 Position   Department
    András A. Benczúr    Lecturer   Department of Computer Science and Information Theory

    Comments

    May be studied in the same semester as "Data Mining - Part 1".