The 3rd International Workshop "Formal Concept Analysis for Knowledge Discovery" (FCA4KD) was held on June 7, 2019, at Kochnovsky Proezd 3.

Formal concepts have proved very important for knowledge discovery, both as a tool for the concise representation of association rules and as a tool for clustering and building taxonomies. The FCA4KD workshop aims to bring together researchers working on various aspects of knowledge discovery based on formal concept analysis (FCA), with applications in fields such as computer and information science, linguistics, social sciences, bioengineering, chemistry, etc.

Workshop topics:
· formal concept lattices and related structures
· attribute implications and data dependencies
· data preprocessing
· redundancy and dimensionality reduction
· information retrieval
· classification
· clustering
· association rules and other data dependencies
· ontologies

Speakers:
Andrei Rodin (Institute of Philosophy, Russian Academy of Sciences) gave the talk "The Problem of Justification in Knowledge Representation".
While traditional philosophical epistemology draws a sharp distinction between true belief and knowledge (as justified true belief), formalizing this distinction with standard logical tools proves difficult. In knowledge representation theory, as a branch of computer science, this important conceptual distinction is currently, as a rule, not taken into account at all. In practice, this means that existing digital knowledge representation systems give users no dedicated means for routinely verifying the knowledge they access through such a system. Results obtained in recent years at the intersection of computational mathematical logic, formal epistemology, and computer science open new possibilities for the effective formalization and computational implementation of justification procedures. In particular, a scheme for representing verifiable knowledge can be built using homotopy type theory.

M.Yu. Bogatyrev, Tula State University (TulSU), "Towards constructing multidimensional formal contexts on natural language texts".
Recent success in applying vector-based and graph-based models of text semantics suggests that semantics can be interpreted as a multidimensional notion. In this paper, a brief survey of such models is presented, and the idea of modeling multidimensional text semantics with multidimensional formal contexts is discussed. Several variants of realizing three-dimensional formal contexts using the conceptual graph model of text semantics are presented. The investigations were carried out on abstracts of biomedical papers from the PubMed databases.

N.V. Shilov, Innopolis University, "Designing ontology for classification and navigation in Computer Languages Universe"
During the semicentennial history of Computer Science and Information Technologies, several thousand computer languages have been created. The computer language universe includes languages for different purposes (programming, specification, modeling, etc.). In each of these branches of computer languages it is possible to track several approaches (imperative, declarative, object-oriented, etc.), disciplines of processing (sequential, non-deterministic, distributed, etc.), and formalized models, such as Turing machines or logic inference machines. The listed arguments justify the importance of an adequate classification for computer languages. Computer language paradigms are the basis for the classification of computer languages. They are based on joint attributes which allow us to differentiate branches in the computer language universe. We present our computer-aided approach to the problem of computer language classification and paradigm identification. The basic idea consists in the development of a specialized knowledge portal for automatic search and updating, providing free access to information about computer languages. The primary aims of our project are research into the ontology of computer languages and assistance in the search for appropriate languages for computer system designers and developers. The paper presents our vision of the classification problem, the basic ideas of our approach to the problem, the current state and challenges of the project, and the design of a query language (based on a combination of temporal, belief, and description logics augmented by FCA constructs: derivatives and concepts).

S.A. Nersisyan, Lomonosov Moscow State University (MSU), "Fitting a mixture of distributions that are close to uniform on boxes"
Fitting mixture distributions is a widely used clustering approach which finds many applications in areas such as computer science, biology, and medicine. Since in most cases there is no exact algorithm for global maximum likelihood (or maximum a posteriori) estimation of mixture distribution parameters, special local optimization techniques such as the EM algorithm are usually used. In this work the EM algorithm was applied to a mixture of generalized Gaussian distributions, which played the role of a smooth approximation to the uniform distribution on a box with variable position and edge lengths. One of the advantages of this approach is interpretability: for each of the resulting clusters and each data feature, the algorithm outputs the corresponding range. The proposed approach can be considered a generalization of the previously studied problem of optimal box positioning, which can also be formulated as a problem from formal concept analysis, namely, the problem of finding an interval pattern concept of maximum extent size.
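
The link to formal concept analysis mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the author's code and does not cover the EM fitting itself; it only shows, under our own assumptions, what an interval pattern concept is for numerical data: a set of objects together with the smallest axis-aligned box containing them, such that no other object falls inside that box. The function names `box_of`, `extent`, and `close` are hypothetical and chosen for this illustration only.

```python
from typing import List, Sequence, Tuple

# One (low, high) interval per feature; together they form an axis-aligned box.
Box = List[Tuple[float, float]]


def box_of(points: Sequence[Sequence[float]]) -> Box:
    """Smallest axis-aligned box containing all given points (must be non-empty)."""
    return [(min(col), max(col)) for col in zip(*points)]


def extent(box: Box, data: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the data points that fall inside the box (bounds inclusive)."""
    return [
        i
        for i, point in enumerate(data)
        if all(lo <= x <= hi for x, (lo, hi) in zip(point, box))
    ]


def close(indices: Sequence[int], data: Sequence[Sequence[float]]):
    """Close a non-empty set of objects into an interval pattern concept.

    Returns the pair (extent, box): the box is the tightest one around the
    chosen objects, and the extent is every object lying inside that box.
    """
    box = box_of([data[i] for i in indices])
    return extent(box, data), box


# Illustrative toy data: four objects described by two numerical features.
data = [(1.0, 1.0), (3.0, 3.0), (2.0, 2.0), (9.0, 9.0)]
ext, box = close([0, 1], data)
```

Here the box around objects 0 and 1 also contains object 2, so the closed extent is larger than the starting set. Searching over such closed pairs for the one with the largest extent, subject to constraints on the box, is one way to read the "interval pattern concept of maximum extent size" formulation.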

D.V. Vinogradov, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences (FRC CSC RAS), "Random similarities computed on GPGPU"
The paper describes an implementation of a very simple probabilistic algorithm for finding similarities between training examples using general-purpose computing on graphics processing units (GPGPU). The algorithm was programmed in OpenCL, and its capabilities were investigated using an AMD Radeon VII graphics card under Kubuntu Linux 18.04 LTS.

E.F. Goncharova, National Research University Higher School of Economics, "Increasing the efficiency of packet classifiers based on closed descriptions".
The efficient representation of packet classifiers has become a significant challenge due to the rapid growth of data kept and processed in forwarding tables. In our work we propose two novel techniques for reducing the size of forwarding tables in both length and width by eliminating redundant bits and unreachable actions. We consider the task of forwarding a packet to the correct destination as a task of multinomial classification. Thus, the process of reducing the forwarding table size corresponds to a feature selection procedure with slight modifications. The presented techniques are based on computing closed descriptions and building decision trees for classification. The main challenge in applying decision trees to the task is processing overlapping rules. To overcome this challenge, we propose to apply the JSM hypothesis technique to eliminate the unreachable actions assigned to the overlapping rules. The experiments were conducted on data generated by the ClassBench software. The proposed approaches result in a significant decrease in the number of bits that should be included in the forwarding tables as features.

A.A. Neznanov, National Research University Higher School of Economics, "Ontology Based Learning and FCA-based Approach in Automatic Item Generation"
In the report we discuss the current state of methodologies, methods, and tools of automatic item generation (AIG) for knowledge assessment. The most interesting questions are the problem of developing specific learning ontologies for AIG optimization, the role of the Semantic Web and other knowledge technology stacks in education, and the implementation of adaptive and personalized learning.
We propose a specific ontology consisting of a thesaurus, scale definitions, term distinctions, and formal contexts linked with thesaurus nodes and scales. Such an ontology helps to generate test items and provide an adaptive assessment of learning outcomes on several levels. We also discuss the architecture, requirements, and basic components of a distributed software system for adaptive learning process support.

Date

June 10, 2019

Mentioned in this article

International Laboratory for Intelligent Systems and Structural Analysis