Lemons, Magic and Computer Vision
Alexey Tolkachev, a student of the Master of Data Science online programme, a leading data analyst, talked with us about his victory in the Agro Code hackathon, shared his impressions of the programme, and explained why he plans to pursue computer vision in the future.
Alexey Tolkachev
Master of Data Science student
— First of all, congratulations on winning the hackathon! Could you tell us a little about yourself and how you got started in data science?
Thank you! I've spent almost a year in Sochi, where I develop computer vision solutions for Russian Railways.
I was introduced to the wonders of machine learning in Innopolis, where I worked for three and a half years. It's hard to remember exactly how it happened, but I think I first heard about this field at a presentation by the laboratory of machine learning and data representation. At that time, I didn't know much about software development (the whole point of moving to Innopolis was to switch from system administration to development) or about machine learning. That's why the projects presented seemed like some kind of computer magic. And who wouldn't want to do magic?
— Why did you choose HSE University's Master of Data Science programme?
The answer is quite trivial. I wanted to get a master's degree from a first-rate university without being tied to a specific location, and that is hard to do anywhere outside Moscow or St. Petersburg.
— Why did you choose the data scientist track?
Honestly, that is a complicated question. I'm interested in research, so initially I considered the researcher track. As a rule, research results in a published article. However, after talking to the programme managers, I concluded that writing articles isn't feasible in the research track because, in my opinion, there is not enough time. That is why I decided to choose the track that would reinforce the skills I already have. In addition, as I understand it, a data scientist is engaged in applied research.
— Tell us more about your participation in the hackathon: how did you come up with the idea of participating in the Agro Code hackathon? Have you participated in events like this before?
My colleagues advertised the hackathon on Slack. I looked it up and decided to take part; I'm a student, after all. Before that, I had participated once in the GISHack hackathon, where one of the tasks was to segment buildings in satellite imagery. My team of three took third place. I also take part in Kaggle competitions. Agro Code was a solo competition.
— Tell us more about your project at this hackathon.
We were given a dataset of about 2,500 images of lemons, each 1056×1056 pixels. The task was to determine a lemon's flaws from its image. A single image can show several types of flaws at the same time. We had to write code that ran the full cycle of data reading, model training and subsequent prediction within twenty minutes on the organizer's platform. External data and internet access were forbidden while solving the case. The competition had a leaderboard based on the Macro Multiclass ROC-AUC computed on a subset of the test data. The published code with the best metric value on that subset was considered the final code and was run on all the test data. The result on the full test set determined the winner.
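For readers unfamiliar with the leaderboard metric, here is a minimal sketch of a macro-averaged ROC-AUC for a multi-label task: a ROC-AUC is computed separately for each flaw type and the values are averaged with equal weight. This only illustrates the general idea and is not the organizers' scoring code. The function names and the toy labels and scores are hypothetical.

```python
from typing import List, Sequence


def binary_roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC for one label: the probability that a random positive
    sample is scored higher than a random negative one (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs both positive and negative samples")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_roc_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of the per-label ROC-AUC values."""
    n_labels = len(y_true[0])
    per_label = [
        binary_roc_auc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical toy data: 4 images, 3 flaw types (one column per flaw).
y_true = [
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
# Hypothetical model scores, one per image and flaw type.
y_score = [
    [0.9, 0.2, 0.4],
    [0.3, 0.8, 0.7],
    [0.6, 0.4, 0.5],
    [0.1, 0.5, 0.6],
]
macro_auc = macro_roc_auc(y_true, y_score)
print(f"macro ROC-AUC: {macro_auc:.4f}")
```

The pairwise comparison is quadratic in the number of images, which is fine for a sketch. In practice a library routine such as scikit-learn's `roc_auc_score` with `average="macro"` would normally be used instead.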
— By the way, in terms of results and knowledge, which course in the programme was the most interesting to you?
The course on data scraping was the best implemented. It is perhaps the perfect example of what a practical course should look like. It has everything that is needed: theoretical knowledge, real-life problems, a large and interesting project that I had to implement step by step over three weeks (from basic functionality to the various details that improve how it works), useful webinars discussing additional materials, and a teacher with an excellent command of English. All this combined made the course very cool. Among other courses, I'd like to mention Ilya Shchurov's courses on probability theory and statistics and the advanced Python course by Yury Gorishny and Dmitry Borisov.
— What do you plan to do after graduation?
I'd like to find and enrol in a PhD programme related to computer vision (in particular, medical image analysis). There are several reasons why I chose this particular field:
First, my personal connection with the field. In 2018, I took part in learning projects in classical machine learning, computer vision, and natural language processing. I'm a curious person, and I was interested in all these fields. However, the subsequent projects that my future employers wanted to pursue were in computer vision.
Secondly, some time ago, when I could only program in Pascal, I wanted to create an application which, as it turned out later, could have used computer vision. I never created the application (it became obsolete), but I learned how the technologies I was going to use work. In effect, I satisfied the need to create the “magic” that I mentioned earlier by learning the secret of the trick.
And finally, classical computer vision, as it turns out, uses a lot of the mathematics that I enjoyed in my school and university years. Personally, I really enjoy seeing fundamental mathematical concepts made visible in something simple.