PhD Research Seminar

The event has ended

Reporter 1: Airat Valiev
Title: In-Prompt Ensemble with Entities and Knowledge Graph for Medical Error Correction

Abstract: This paper presents our LLM-based system designed for the MEDIQA-CORR @ NAACL-ClinicalNLP 2024 Shared Task 3, which focuses on detecting and correcting medical errors in medical records. Our approach consists of three key components: entity extraction, prompt engineering, and ensembling. First, we automatically extract biomedical entities such as therapies, diagnoses, and biological species. Next, we explore few-shot learning techniques and incorporate graph information from the MeSH database for the identified entities. Finally, we investigate two methods for ensembling: (i) combining the predictions of three previous LLMs using an AND strategy within a prompt and (ii) integrating the previous predictions into the prompt as separate ‘expert’ solutions, accompanied by trust scores representing their performance. The latter system ranked second with a BERTScore of 0.8059 and third with an aggregated score of 0.7806 among the 15 teams’ solutions in the shared task.
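
The second ensembling method, in which earlier predictions are placed in the prompt as 'expert' solutions with trust scores, can be illustrated with a minimal sketch. The abstract does not give the actual prompt wording, the score format, or the ordering of experts, so everything below (the names ExpertPrediction and build_expert_prompt, the instruction text, the descending sort by trust) is a hypothetical illustration and not the authors' implementation.

```python
from typing import NamedTuple


class ExpertPrediction(NamedTuple):
    """One earlier model's output (hypothetical structure)."""
    name: str          # label for the earlier model
    trust: float       # trust score, e.g. its score on a validation set
    correction: str    # the correction that model proposed


def build_expert_prompt(clinical_text: str, experts: list[ExpertPrediction]) -> str:
    """Assemble a prompt that lists earlier predictions as 'expert' solutions."""
    # Assumption: experts are shown from most to least trusted.
    ranked = sorted(experts, key=lambda e: e.trust, reverse=True)
    parts = [
        "You are reviewing a clinical note for a medical error.",
        "Clinical note:",
        clinical_text,
        "Earlier expert solutions (trust score in brackets):",
    ]
    for expert in ranked:
        parts.append(f"- {expert.name} [trust={expert.trust:.2f}]: {expert.correction}")
    parts.append("Give the corrected sentence, or reply NO ERROR.")
    return "\n".join(parts)


if __name__ == "__main__":
    demo = build_expert_prompt(
        "<clinical note text>",
        [
            ExpertPrediction("expert_a", 0.61, "<correction from model A>"),
            ExpertPrediction("expert_b", 0.78, "<correction from model B>"),
        ],
    )
    print(demo)
```

The resulting string would then be sent to the final LLM; that call is left out because the abstract does not name the model or API used.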

Reporter 2: Aziz Temirkhanov
Title: Digital twins for data storage systems

Abstract: High-precision systems modeling is one of the main areas of industrial data analysis. Models of systems, their digital twins, are used to predict their behavior under various conditions. In this study, we developed several models of a storage system using machine learning-based generative models to predict performance metrics such as IOPS and latency. The models achieve prediction errors of 4–10% for IOPS and 3–16% for latency and demonstrate high correlation (up to 0.99) with observed data. By leveraging Little’s law for validation, these models provide reliable performance estimates. Our results outperform conventional regression methods, offering a vendor-agnostic approach for simulating data storage system behavior. These findings have significant applications for predictive maintenance, performance optimization, and uncertainty estimation in storage system design.
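
Little's law states that the mean number of requests in a system equals the arrival rate times the mean time each request spends there (L = λW). For a storage system, this links predicted IOPS and latency to the number of outstanding requests. The sketch below shows one way such a consistency check could look. The abstract does not describe how the validation was actually done, so the function names, the 10% tolerance, and the sample numbers are illustrative assumptions.

```python
def implied_queue_depth(iops: float, latency_ms: float) -> float:
    """Mean number of in-flight requests implied by Little's law, L = lambda * W."""
    return iops * (latency_ms / 1000.0)  # convert latency to seconds


def is_consistent(iops: float, latency_ms: float, queue_depth: float,
                  tolerance: float = 0.1) -> bool:
    """Check predicted IOPS and latency against a known queue depth.

    Assumption: the workload's queue depth (outstanding I/O requests) is known,
    and a relative deviation up to `tolerance` is acceptable.
    """
    implied = implied_queue_depth(iops, latency_ms)
    return abs(implied - queue_depth) / queue_depth <= tolerance


if __name__ == "__main__":
    # Illustrative numbers only, not results from the study.
    print(implied_queue_depth(10_000, 3.2))
    print(is_consistent(10_000, 3.2, queue_depth=32))
```

A pair of predictions that fails this check is internally inconsistent with the workload, whatever its individual error against measurements.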