This work addresses the academic paper recommendation task, framed as link prediction on a static citation network. We compare several graph embedding, text-based, and fusion models on the link prediction problem using an academic paper citation dataset. We show that fusion models combining graph and text information outperform approaches based on graph or text information alone. An extensive set of experiments with different train/test splits shows that our fusion models are robust and retain superior performance even with a reduced training set.
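As a rough illustration of what fusing graph and text information for link prediction can look like, the sketch below concatenates a graph embedding and a text embedding per paper and builds an edge feature vector from the element-wise product of the two endpoints. The concatenation, the Hadamard edge operator, the names `fuse` and `edge_features`, and the toy vectors are all assumptions made for this example; the abstract does not specify which fusion scheme the authors use.

```python
from typing import Dict, List

Vector = List[float]


def fuse(graph_vec: Vector, text_vec: Vector) -> Vector:
    """Fuse two node representations by concatenation (an assumed choice)."""
    return graph_vec + text_vec


def edge_features(u: str, v: str,
                  graph_emb: Dict[str, Vector],
                  text_emb: Dict[str, Vector]) -> Vector:
    """Element-wise (Hadamard) product of the fused vectors of both endpoints."""
    fused_u = fuse(graph_emb[u], text_emb[u])
    fused_v = fuse(graph_emb[v], text_emb[v])
    return [a * b for a, b in zip(fused_u, fused_v)]


# Toy, made-up embeddings for two papers (not taken from any dataset).
graph_emb = {"p1": [1.0, 0.0], "p2": [0.5, 2.0]}
text_emb = {"p1": [2.0, 1.0, 0.0], "p2": [1.0, 3.0, 4.0]}

features = edge_features("p1", "p2", graph_emb, text_emb)
print(features)
```

In a full pipeline, such feature vectors for observed and sampled non-existent citations would be passed to a binary classifier; that step is omitted here.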
This book constitutes the proceedings of the 19th Russian Conference on Artificial Intelligence, RCAI 2021, held in Moscow, Russia, in October 2021.
The 19 full papers and 7 short papers presented in this volume were carefully reviewed and selected from 80 submissions. The conference deals with a wide range of topics, categorized into the following topical headings: cognitive research; data mining, machine learning, classification; knowledge engineering; multi-agent systems and robotics; natural language processing; fuzzy models and soft computing; intelligent systems; and tools for designing intelligent systems.
The present article reviews some recent papers concerned with chaotic time series prediction in the context of predictive clustering, and discusses in greater detail some novel techniques designed to avoid 'a curse of exponential growth', in which errors grow exponentially with the number of steps ahead to be predicted. These techniques are non-successive observations, combined with a prognosis that employs already predicted values, the concept of non-predictable points, and a quality assessment of the clusters used. The approach discussed allows one to separate the calculation into two parts: the first, substantially larger part is performed off-line, while the second, the immediate prediction routine, is carried out on-line. This makes it possible to design fast and efficient prediction algorithms. A wide-ranging simulation suggests that the error term associated with the prediction sub-model used vanishes as the validation set size grows to infinity, provided that the clusters used to predict are chosen correctly. Similarly, the error term associated with an incorrect choice of clusters decreases as the validation set size increases.
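The exponential error growth mentioned above can be seen in a small experiment. The sketch below is only an illustration and is not taken from the reviewed papers: it uses the logistic map x -> 4x(1 - x) as a stand-in chaotic series, and the starting point 0.2, the initial gap 1e-10, and the name `gap_history` are assumptions chosen for the example.

```python
from typing import List


def logistic_map(x: float) -> float:
    """One step of the logistic map x -> 4x(1 - x), a standard chaotic system."""
    return 4.0 * x * (1.0 - x)


def gap_history(x0: float, delta: float, steps: int) -> List[float]:
    """Track |a_t - b_t| for two trajectories that start delta apart."""
    a, b = x0, x0 + delta
    gaps = [abs(a - b)]
    for _ in range(steps):
        a, b = logistic_map(a), logistic_map(b)
        gaps.append(abs(a - b))
    return gaps


# Assumed toy setting: a tiny initial error plays the role of a prediction error.
gaps = gap_history(x0=0.2, delta=1e-10, steps=20)
for step in (0, 5, 10, 15, 20):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap does not grow at every single step, but over twenty iterations it increases by several orders of magnitude, which is why iterated multi-step prediction of a chaotic series degrades so quickly.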
Recently, the world was forced to shift to distance learning because of COVID-19. Statistics show a notable increase in the number of users of corporate educational solutions built on cloud architecture. However, non-cloud-based learning tools have not matched this growth. In this work, the authors consider the causes of this contradictory behaviour and present an explanation based on the differences between the two types of educational systems. The authors also list the technologies and product features that allow corporate solutions to quickly gain popularity in the educational community. In addition, clear examples are provided of how these features connect to learning methods that can improve teaching, learning and, last but not least, the user's experience. Finally, the authors highlight the significant role of integration and interoperability standards that support easy component replacement and scaling.
Our concern is the problem of determining the data complexity of answering an ontology-mediated query (OMQ) given in linear temporal logic LTL over (Z, <) and deciding whether it is rewritable to an FO(<)-query, possibly with extra predicates. First, we observe that, in line with the circuit complexity and FO-definability of regular languages, OMQ answering in AC0, ACC0 and NC1 coincides with FO(<, ≡)-rewritability using unary predicates x ≡ 0 (mod n), FO(<, MOD)-rewritability, and FO(RPR)-rewritability using relational primitive recursion, respectively. We then show that deciding FO(<)-, FO(<, ≡)- and FO(<, MOD)-rewritability of LTL OMQs is ExpSpace-complete, and that these problems become PSpace-complete for OMQs with a linear Horn ontology and an atomic query, and also a positive query in the cases of FO(<)- and FO(<, ≡)-rewritability. Further, we consider FO(<)-rewritability of OMQs with a binary-clause ontology and identify OMQ classes for which deciding it is PSpace-, Π^p_2- and coNP-complete.
CatLog is a categorial grammar parser/theorem-prover developed by Glyn Morrill and his co-authors. CatLog is based on an extension of the Lambek calculus. A distinctive feature of this extension is the usage of brackets for controlled non-associativity and a subexponential modality whose contraction rule interacts with bracketing in a sophisticated way. We consider two variants of the calculus, appearing in different versions of CatLog. Both systems are, unfortunately, undecidable in general. We consider fragments where the usage of the subexponential is restricted by so-called bracket non-negative/non-positive conditions, prove that these fragments are decidable, and pinpoint their place in the complexity hierarchy. We also consider a more complicated, but more practically interesting, problem of inducing (guessing) brackets. For this problem, we prove one decidability and one undecidability result, and leave some open questions for further research.
We show that deciding boundedness (aka FO-rewritability) of monadic single rule datalog programs (sirups) is 2ExpTime-hard, which matches the upper bound known since 1988 and finally settles a long-standing open problem. We obtain this result as a byproduct of an attempt to classify monadic 'disjunctive sirups' (Boolean conjunctive queries $q$ with unary and binary predicates mediated by a disjunctive rule $T(x) \lor F(x) \leftarrow A(x)$) according to the data complexity of their evaluation. Apart from establishing that deciding FO-rewritability of disjunctive sirups with a dag-shaped $q$ is also 2ExpTime-hard, we make substantial progress towards obtaining a complete FO/L-hardness dichotomy of disjunctive sirups with ditree-shaped $q$.
We investigate ontology-based data access to temporal data. We consider temporal ontologies given in linear temporal logic LTL interpreted over discrete time (Z, <). Queries are given in LTL or MFO(<), monadic first-order logic with a built-in linear order. Our concern is first-order rewritability of ontology-mediated queries (OMQs) consisting of a temporal ontology and a query. By taking account of the temporal operators used in the ontology and distinguishing between ontologies given in full LTL and its core, Krom and Horn fragments, we identify a hierarchy of OMQs with atomic queries by proving rewritability into either FO(<), first-order logic with the built-in linear order, or FO(<, ≡), which extends FO(<) with the standard arithmetic predicates x ≡ 0 (mod n), for any fixed n, or FO(RPR), which extends FO(<) with relational primitive recursion. In terms of circuit complexity, FO(<, ≡)- and FO(RPR)-rewritability guarantee OMQ answering in uniform AC0 and, respectively, NC1.
We obtain similar hierarchies for more expressive types of queries: positive LTL-formulas, monotone MFO(<)- and arbitrary MFO(<)-formulas. Our results are directly applicable if the temporal data to be accessed is one-dimensional; moreover, they lay foundations for investigating ontology-based access using combinations of temporal and description logics over two-dimensional temporal data.
We investigate language interpretations of two extensions of the Lambek calculus: with additive conjunction and disjunction, and with additive conjunction and the unit constant. For extensions with additive connectives, we show that conjunction and disjunction behave differently. Adding both of them leads to incompleteness due to the distributivity law. We show that with conjunction only, no issues with distributivity arise. In contrast, there exists a corollary of the distributivity law in the language with disjunction only which is not derivable in the non-distributive system. Moreover, this difference remains valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear and affine logics and affine multiplicative-additive Lambek calculus. For the extension of the Lambek calculus with the unit constant, we present a calculus which reflects natural algebraic properties of the empty word. We do not claim completeness for this calculus, but we prove undecidability for the whole range of systems that extend this minimal calculus and are sound w.r.t. language models. As a corollary, we show that in the language with the unit there exists a sequent that is true if all variables are interpreted by regular languages, but not true in language models in general.
Social networks are an integral part of modern life. They allow us to communicate online and exchange all kinds of information. In this paper, we consider the social network Instagram and its hashtags as a key tool for finding relevant information and new friends. The aim of our work is an empirical analysis of hashtags for Instagram posts with certain locations. We obtain a database of Instagram users and collect a dataset of posts for three Far Eastern cities. Then, we build a friendship graph, for which we solve the link prediction problem. We show that both structural and attributive graph information, such as hashtags, are important for achieving the best quality.
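To make the combination of structural and attributive information concrete, the following sketch computes two features for a candidate pair of users: the Jaccard similarity of their friend sets and the Jaccard similarity of their hashtag sets. The choice of Jaccard similarity, the function names, and the toy data are assumptions for illustration only; the abstract does not state which features or models the authors use.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(u: str, v: str,
                  friends: Dict[str, Set[str]],
                  hashtags: Dict[str, Set[str]]) -> List[float]:
    """Return [structural similarity, attribute similarity] for the pair (u, v)."""
    return [jaccard(friends[u], friends[v]), jaccard(hashtags[u], hashtags[v])]


# Toy, made-up data: friend lists and hashtags used by two users.
friends = {"u": {"a", "b", "c"}, "v": {"b", "c", "d"}}
hashtags = {"u": {"#sea", "#travel"}, "v": {"#travel"}}

pair = pair_features("u", "v", friends, hashtags)
print(pair)
```

Feature vectors of this kind could then be fed to any binary classifier that predicts whether a friendship link exists.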
The Logical Perspectives Summer School and Workshop Series aims at giving advanced introductions into various branches of logic, and providing researchers — including early career scientists — with an opportunity to present their work.
In particular, LP 2021 Summer School (June 14–16) and Workshop (June 17–19) will focus on computational proof theory, broadly understood. The programme will comprise three mini-courses on different aspects of computational proof theory, and also a number of contributed talks.
Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work on numerical pattern mining, and this research area remains under-explored. In this paper we propose MINT, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliable framework widely used in pattern mining, as well as in subgroup discovery. In MINT we reuse MDL to discover useful patterns and return a set of non-redundant overlapping patterns that have well-defined boundaries and cover meaningful groups of objects. MINT is not alone in the category of numerical pattern miners based on MDL. In the experiments presented in the paper we show that MINT outperforms its competitors, including IPD, REALKRIMP, and SLIM.
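For readers unfamiliar with the MDL principle, the sketch below shows the generic two-part score L(M) + L(D | M) on a toy symbolic dataset. It is not MINT's encoding, which the abstract does not describe; the model costs of 0 and 2 bits, the two candidate models, and the function names are hypothetical values chosen to keep the arithmetic easy to follow.

```python
import math
from typing import Dict


def data_bits(data: str, probs: Dict[str, float]) -> float:
    """L(D | M): Shannon code length of the data under the model, in bits."""
    return sum(-math.log2(probs[symbol]) for symbol in data)


def total_bits(model_bits: float, data: str, probs: Dict[str, float]) -> float:
    """Two-part MDL score L(M) + L(D | M); a lower score is better."""
    return model_bits + data_bits(data, probs)


# Toy dataset of 16 symbols with a skewed distribution.
data = "a" * 8 + "b" * 4 + "c" * 2 + "d" * 2

uniform = {symbol: 0.25 for symbol in "abcd"}
skewed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

uniform_score = total_bits(0.0, data, uniform)  # hypothetical model cost: 0 bits
skewed_score = total_bits(2.0, data, skewed)    # hypothetical model cost: 2 bits
best = "skewed" if skewed_score < uniform_score else "uniform"
print(uniform_score, skewed_score, best)
```

The skewed model pays a small cost to be described but compresses the data enough to win overall, which is the trade-off MDL-based miners exploit when deciding whether a pattern is worth keeping.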
P108 - Predictive mathematical modelling of recurrence periods for the secondary distant metastases in patients with ER/PR/HER2/Ki-67 subtypes of breast cancer