VizDoom is a flexible and easy-to-use 3D reinforcement learning research platform based on the well-known first-person shooter Doom. The challenge is to create bots that compete in the DeathMatch track, making decisions based solely on visual information from the screen. The paper compares reinforcement learning approaches: Q-learning and policy-gradient algorithms. We explore the distributed learning paradigm in reinforcement learning and discuss the differences in convergence speed and quality when an object detection module is added.
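As a brief illustrative sketch (not the paper's implementation), the tabular Q-learning update underlying the first family of compared algorithms looks like this; the state/action names are hypothetical:

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the TD target
    r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q

# Toy usage with a two-state MDP:
Q = {"s0": {"shoot": 0.0}, "s1": {}}
q_update(Q, "s0", "shoot", 1.0, "s1")
```

In practice the screen pixels would first pass through a convolutional network that produces the state representation, but the update rule itself is the same.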
Accurate depth estimation from images is a fundamental task in deep learning with many applications, including scene understanding and reconstruction. Datasets for supervised depth estimation are hard to obtain and usually do not contain a sufficient number of images or a sufficient variety of scenes. Since the inputs for depth estimation are plain RGB images, a large number of varied unlabeled images is easy to collect, and depth masks can be produced for them by manual annotation. We therefore investigate an active learning approach for selecting which unlabeled samples to label. In this work, we focus on the learning loss method for active training-set selection. We performed multiple experiments with the learning loss algorithm and evaluated the resulting model.
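The selection step of the learning loss method can be sketched as follows (a simplified illustration: the real method trains a small loss-prediction module jointly with the main network; here the predicted losses are given as input):

```python
import numpy as np

def select_for_labeling(predicted_losses, budget):
    """Pick the `budget` unlabeled samples with the highest predicted loss,
    i.e., the samples the model is expected to handle worst."""
    predicted_losses = np.asarray(predicted_losses)
    # argsort ascending, take the last `budget` indices, highest loss first
    return np.argsort(predicted_losses)[-budget:][::-1]
```

The selected indices are then sent for manual depth annotation and added to the training set before the next active learning round.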
Graph visualization is an effective and efficient way to discover complex interconnections between elements within the nested structure of data. To build this type of representation, machine learning algorithms use graph embedding techniques, node embedding in particular. In this paper, however, we compare well-known techniques in the still largely under-explored setting of community embedding: embedding individual communities instead of individual nodes. This type of embedding can be especially useful in graph visualization and community detection tasks. Although graph embedding and clustering are separate tasks, a good solution to the first tends to correlate with a solution to the second, and transferring knowledge between them may have a positive impact.
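One simple baseline for community embedding (an illustrative assumption, not necessarily one of the techniques compared in the paper) is to aggregate precomputed node embeddings over each community:

```python
import numpy as np

def community_embeddings(node_emb, communities):
    """Embed each community as the mean of its members' node embeddings.

    node_emb: dict node -> np.ndarray of shape (d,)
    communities: dict community_id -> list of member nodes
    """
    return {cid: np.mean([node_emb[n] for n in members], axis=0)
            for cid, members in communities.items()}
```

Such aggregated vectors can be plotted directly, giving one point per community instead of one point per node.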
This work addresses the academic paper recommendation task, considered as link prediction on a static citation network. We compare several graph embedding, text-based, and fusion models on the link prediction problem over an academic paper citation dataset. We show that fusion models combining graph and text information outperform approaches based on graph or text information alone. An extensive set of experiments with different train/test splits demonstrates that our fusion models are robust and retain superior performance even with a reduced training set.
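A hedged sketch of a late-fusion edge scorer (the function and its inputs are illustrative, not the paper's exact model): concatenate each paper's graph and text embeddings and score a candidate citation with a dot product.

```python
import numpy as np

def fusion_edge_score(graph_emb, text_emb, u, v):
    """Score a candidate citation (u -> v) by fusing graph and text views.

    graph_emb, text_emb: dicts paper_id -> np.ndarray embedding vectors.
    """
    fu = np.concatenate([graph_emb[u], text_emb[u]])
    fv = np.concatenate([graph_emb[v], text_emb[v]])
    return float(fu @ fv)  # higher score -> more likely citation link
```

More elaborate fusion variants would replace the dot product with a trained MLP over the concatenated pair, but the idea of combining the two views is the same.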
We consider the problem of voice cloning, which is in demand in many film-related industries, and develop a new modification of AutoVC, a state-of-the-art voice conversion model. We study the replacement of its recurrent modules with convolutional layers while maintaining the quality of the original model. Our results show faster inference on longer voice tracks and faster training, with only minimal deterioration in sound quality, as evidenced by the reconstruction loss and Mel-cepstral distortion.
Many tasks in graph machine learning, such as link prediction and node classification, are typically solved via representation learning, in which each node or edge of the network is encoded as an embedding. Although many network embeddings exist for static graphs, the task becomes much more complicated when a dynamic (i.e., temporal) network is analyzed. In this paper, we propose a novel approach to dynamic network representation learning based on the Temporal Graph Network, using a highly customized message-generating function that extracts Causal Anonymous Walks. We provide a benchmark pipeline for the evaluation of temporal network embeddings. This work offers the first comprehensive comparison framework for temporal network representation learning on graph machine learning problems involving node classification and link prediction in every available setting. The proposed model outperforms state-of-the-art baselines, and we analyze their differences through evaluation on various transductive/inductive edge/node classification tasks. In addition, we show the applicability and superior performance of our model in a real-world downstream graph machine learning task provided by one of the top European banks: credit scoring based on transaction data.
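The anonymization step behind Causal Anonymous Walks can be sketched as follows (a simplified illustration: each node in a walk is relabeled by the order of its first appearance, so the walk's structural pattern is kept while raw node identities are hidden):

```python
def anonymize_walk(walk):
    """Relabel nodes by first-occurrence order, hiding raw identities.

    E.g., the walk [a, b, a, c] becomes the pattern [0, 1, 0, 2].
    """
    first_seen = {}
    pattern = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen)
        pattern.append(first_seen[node])
    return pattern
```

In the full method such patterns, extracted from temporally restricted (causal) walks, are encoded and fed into the message function of the temporal graph model.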