"I want my students to have a better understanding of the optimization algorithms after my course"
Darina Dvinskikh has been working at the Faculty of Computer Science since September. She was hired through the tenure track programme. Darina told us how the research topic she chose for her master's thesis determined her path in science and where Wasserstein barycenters can be useful.
I have liked mathematics and physics ever since school. I knew for certain that I was going to study engineering in Moscow. I participated in school competitions, first only in mathematics and, in high school, also in physics. As a result, I won prizes in both subjects, which is equivalent to passing the state exam with the highest grade.
When choosing a university, I considered HSE University, Moscow State University and Moscow Institute of Physics and Technology. Eventually I chose MIPT. There I completed my bachelor's degree in applied mathematics and physics, and for my master's I studied in a joint programme of MIPT and Skoltech. I did not initially think about post-graduate studies, as I planned to go into industry after finishing my master's.
During my master's studies I worked on developing numerical optimization methods for computing Wasserstein barycenters. A Wasserstein barycenter is a notion of an average object, defined through the optimal transport metric. To find a barycenter, one solves an optimization problem: minimize the sum of squared distances from each object to the barycenter being sought. In Euclidean space, the barycenter of a set of points is simply their average. For "complex" objects, such as probability measures or objects described by probability measures (e.g. pictures), the usual Euclidean mean is not suitable, because the geometry of such objects is far from Euclidean.
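The difference between a Euclidean mean and a Wasserstein barycenter can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions, not the distributed algorithm discussed in this interview: the measures are one-dimensional, each is given by the same number of equally weighted sample points, and the distance is the 2-Wasserstein distance. In that special case the barycenter is known to be obtained by averaging the sorted samples (that is, averaging quantiles). The function names are hypothetical and chosen for this example.

```python
import numpy as np


def euclidean_barycenter(points):
    """Average of the points: the minimizer of the sum of squared
    Euclidean distances to them."""
    return np.mean(np.asarray(points, dtype=float), axis=0)


def w2_distance_1d(x, y):
    """2-Wasserstein distance between two 1-D empirical measures with the
    same number of equally weighted points (assumption of this sketch).
    In 1-D the optimal transport plan matches sorted points in order."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    return float(np.sqrt(np.mean((x - y) ** 2)))


def wasserstein_barycenter_1d(samples):
    """2-Wasserstein barycenter of several 1-D empirical measures.

    `samples` is a list of equally long 1-D arrays, one per measure.
    Averaging the sorted samples averages the quantile functions,
    which gives the barycenter in this 1-D equal-weight setting."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float), axis=1)
    return sorted_samples.mean(axis=0)


if __name__ == "__main__":
    a = [0.0, 1.0, 2.0]      # a measure concentrated near 1
    b = [12.0, 10.0, 11.0]   # the same shape shifted to around 11
    print(euclidean_barycenter([[0.0, 0.0], [2.0, 4.0]]))
    print(w2_distance_1d(a, b))
    print(wasserstein_barycenter_1d([a, b]))
```

In this toy example the barycenter of the two measures is a single cluster halfway between them with the same shape, whereas simply mixing the two samples would give two separate clusters. This is the sense in which the transport geometry gives a more natural "average" for such objects.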
Such "complex" objects could be, for example, MRI scans of the human brain. We want to determine whether a particular brain scan shows signs of a disease. Of course, a doctor can do this, but we would like a computer to be able to do it too. Patients' scans may differ greatly from each other, so it makes sense to first calculate the "average" image and then compare each scan to this "average". To be more precise, it is better to calculate "average" images for each group of healthy people, e.g. grouped by sex and age. Then you can use transport metrics to measure how much the MRI image of a new patient differs from the "average" image of healthy people in the same group, and on that basis make assumptions about the presence of a disease.
The problem of calculating barycenters is an optimization problem, and in general it does not have an analytical solution, so numerical methods need to be developed. Barycenters are used in unsupervised learning to estimate the mean object, and they are a valid estimate in the pattern recognition problem under some additional conditions on the transformations of the pattern. For example, some canonical way of writing the digit 1 can be taken as a pattern, and the images of this digit in the MNIST dataset can be viewed as realizations of the pattern under a transformation associated with a particular person's handwriting. Then the "average" of all the ones from MNIST would act as an estimate of the original canonical one.
What I like about this problem is that it combines several areas of optimization at once. Sometimes you study some theory in optimization, e.g. duality, and wonder where it can be applied. The problem of finding a Wasserstein barycenter makes this clear.
In my final year, together with my supervisor and scientific adviser, I co-wrote a paper in which we proposed a distributed algorithm for calculating barycenters, and we submitted it to NeurIPS, an international machine learning conference. It was accepted into the Spotlight category, in which authors can not only present their poster but also give a five-minute talk. After this experience, I started thinking about a PhD. I also heard some feedback from friends who worked in industry: they complained about boring and monotonous tasks. This played a decisive role. I liked my research better.
At the end of my master's programme, I applied to Humboldt University in Berlin. I chose the university for a reason: I wanted to work with Professor Vladimir Spokoiny. It was he who posed the problem of barycenters that I studied in the master's programme. I met Prof. Spokoiny at a conference in Moscow. I did not apply anywhere else; I thought that if I did not get in, I would go into industry. But everything worked out well and I was accepted.
During my PhD, I worked in Professor Spokoiny's group at the Weierstrass Institute. Pavel Dvurechensky, who supervised my research while I was pursuing my master's, also works there. I dealt with numerical optimization methods and distributed computing, and also continued to work on the problem of calculating Wasserstein barycenters.
On the whole, I enjoyed living and working in Germany. I especially liked commuting by bicycle: in Berlin it is quite convenient, as there are bicycle lanes almost everywhere. But after I had completed my degree, I nevertheless decided to return to Russia.
HSE University has a special programme for those who have obtained their PhD outside Russia; it is called tenure track. Having returned to Russia, I applied to take part in this programme. I was accepted, and so I became an associate professor at HSE University.
Now I'm working on gradient-free methods, but I'm also looking for a new area that interests me. I also teach a course in modern numerical optimization. Optimization forms the theoretical basis for machine learning, which is currently enjoying great popularity. I want my students to have a better understanding of the optimization algorithms they use in machine learning tasks after my course.
For me, leisure means a change of activity: at work I spend a lot of time sitting at a desk, so I like to spend my free time actively. I like ice skating, cross-country and downhill skiing, playing volleyball and swimming.