Students of the Data Science and Business Analytics double degree programme work on individual and group projects every year. We asked third-year students Kamil Alyakaev and Stas Ushakov to talk about their project. Last year, another project earned Kamil and Stas a 1C:Scholarship.
How did you choose this topic and your project partner?
What is your project about?
Stas: The main topic of the project is research into algorithms for document illumination recovery. Our task is to study existing algorithms for removing shadows, glare and highlights from photos of documents. In effect, we need to find a method that can bring a photo of a document taken on a phone up to an acceptable quality.
Kamil: This means that if there are shadows or highlights on documents, the method in question should be able to replace these hard-to-recognize areas with areas of clearly distinguishable text.
Stas: It may seem easy to do, but there are many pitfalls, and we had no experience with such tasks. It helped a lot that there were clear objectives at each stage, and Kamil deserves the credit for that. We divided responsibilities: practice for me, theory for Kamil. In the end, it worked out almost like that, but Kamil also collected test and training data for neural network-based methods. The result was not only a good analysis of the available open-source methods but also relatively large datasets for training and testing models.
How did you organize your work on the project?
Kamil: I cannot say that work on the project started smoothly. We did not even know exactly where to start: there was only a formal statement of the task. For most of the time until mid-winter, we were looking for theory, information, and methods, which turned out to be quite difficult to find. After that, some concrete tasks emerged. Around then, Stas and I agreed that I would be in charge of the theoretical part of the project and of setting goals, and he would be directly in charge of the practical part, that is, running the methods we found and reporting on how poorly they worked.
No methods had been created specifically for our purpose, but some were close enough to suit it. We used those and had to train the underlying models specifically for our task.
Specific training required an equally specific dataset, which did not exist before our project. For about a week we painstakingly photographed printed text on paper with different backgrounds and lighting, taking each shot twice: with and without a shadow. After collecting about 1200 pairs of such photographs, we processed the dataset to create a third type of image, shadow maps. These indicate the location of the lighting anomaly, which the model interprets in order to remove it later. In this way, we were able to train at least one neural network to work consistently with these kinds of photos.
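The interview does not say how the shadow maps were computed from the photo pairs. As a rough illustration only, one simple way to derive such a map is to mark every pixel that is noticeably darker in the shadowed photo than in the shadow-free one. The function name `shadow_map`, the `threshold` value, and the plain nested-list image representation below are all assumptions made for this sketch, not details from the project; it also assumes the two photos are already aligned pixel for pixel.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of pixel values 0..255.
Image = List[List[int]]


def shadow_map(shadow_free: Image, shadowed: Image, threshold: int = 25) -> Image:
    """Return a binary map: 255 where the shadowed photo is darker, else 0.

    Assumes both photos show the same scene and are aligned pixel for pixel.
    The threshold (an assumed value) ignores small brightness differences
    caused by sensor noise.
    """
    same_height = len(shadow_free) == len(shadowed)
    same_widths = all(len(a) == len(b) for a, b in zip(shadow_free, shadowed))
    if not (same_height and same_widths):
        raise ValueError("both photos must have the same dimensions")

    return [
        [255 if clean - dark > threshold else 0 for clean, dark in zip(row_clean, row_dark)]
        for row_clean, row_dark in zip(shadow_free, shadowed)
    ]


# Toy 3x3 "photo": uniform bright paper, with a shadow over the top-left corner.
clean_photo = [[200, 200, 200], [200, 200, 200], [200, 200, 200]]
shaded_photo = [[100, 100, 200], [100, 195, 200], [200, 200, 200]]
mask = shadow_map(clean_photo, shaded_photo)
```

Real photo pairs would first need alignment and would normally be handled with an array or image library; the nested lists here only keep the sketch self-contained.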
What is the result of the project?
Kamil: We consider the final result of our work to be the compiled evaluations of the methods and recommendations for working with them, as well as the dataset itself. We were supposed to implement our own method, but we did not have enough time or knowledge to create a user-friendly application. As for practical application, we left behind a dataset and a list of recommendations, which should make the work easier for anyone who sets similar or related goals.
Was it difficult to win the scholarship?
Kamil: Our supervisor felt that the project was worthy of attention and, at the very least, of participation in the scholarship competition. We didn't know what kind of work we were competing against or what criteria would be used to select the winners, so I can't say anything about the difficulty.
Stas: We just submitted the project to the scholarship competition without much hope. The most we hoped for was a plus in our portfolios for participating. But we did the project with full dedication and did not slack off anywhere. So the way I see it, this scholarship is our "good luck - the reward for courage", as the old song says.
1C:Scholarship is given for achievements in team projects, term papers, software projects, and theses, the topics of which are provided by 1C. Read more about 1C:Scholarship.