Basics of Data Analysis

Academic Year: 2025/2026
Language of instruction: English
ECTS credits
Type: Elective course
When: Year 1, modules 1 and 2

Instructor

Course Syllabus

Abstract

Data analysis helps the user enhance and augment knowledge of a domain, as represented by concepts and statements of relations between them. This view distinguishes the course from related subjects such as applied statistics, machine learning, and data mining. There are two main pathways for knowledge discovery: (1) summarization, for developing and augmenting concepts, and (2) correlation, for enhancing and establishing relations between concepts. Summarization is understood broadly here: it embraces not only simple summaries such as totals and means but also more complex ones, such as the principal components of a set of features and cluster structures in a set of entities. Similarly, correlation covers both bivariate and multivariate relations between input and target features, including regression, classification trees, and Bayesian classifiers. Another feature of the course is that its main thrust is an in-depth presentation of a few basic techniques and their properties rather than coverage of the broad spectrum of approaches developed so far. This allows me to bring forward a number of mathematically derived interpretation tools and relations between methods that are usually overlooked.
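The two pathways can be illustrated with a minimal sketch. It is not taken from the course materials: the function names `summarize` and `correlate` and the toy data are assumptions made purely for illustration. Summarization is shown by its simplest case (total, mean, standard deviation of one feature), and correlation by a bivariate least-squares regression of a target feature on an input feature.

```python
from math import sqrt


def summarize(values):
    """Summarization: condense one feature into a total, a mean, and a standard deviation."""
    n = len(values)
    total = sum(values)
    mean = total / n
    # Population standard deviation (divides by n, not n - 1).
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return {"total": total, "mean": mean, "std": std}


def correlate(x, y):
    """Correlation: relate target y to input x via least-squares linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Pearson correlation coefficient between x and y.
    r = sxy / sqrt(sxx * syy)
    return {"slope": slope, "intercept": intercept, "r": r}


# Hypothetical toy data: an input feature and a target feature for five entities.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 68.0]

summary = summarize(hours)
relation = correlate(hours, score)
print(summary)
print(relation)
```

The more complex summaries named in the abstract (principal components, cluster structures) and the multivariate forms of correlation (classification trees, Bayesian classifiers) follow the same division of roles but are beyond this sketch.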