"The belief that everything can be modelled gave me the incentive to model everything"
Alexander Tarakanov started out in science as a physicist, earned his PhD in the USA and worked both in industry and academia. Since September, he has been working at the Faculty of Computer Science via the Tenure Track programme. We talked to him about his research, favourite papers and plans.
I've had a long journey in science: I started with physics, then switched to numerical analysis, then there was machine learning in the oil industry, and then machine learning per se.
As a child, I liked solving problems, so I participated in school competitions in physics and mathematics. I liked both subjects, but in high school I had to choose which one to devote more time to. As there was quite a lot of competition in maths in our region, I chose physics. The competitions gave me a bonus when applying to university, and since I had studied physics, I naturally went to the Moscow Institute of Physics and Technology (MIPT).
At MIPT I started with particle physics, but then I realized that fundamental science is psychologically difficult, because decades may pass before you see the results of your work. So I switched to applied science, particularly applied mathematics.
It was then, in my fourth or fifth year, that I first became interested in programming. Computers make it possible to study a variety of physical systems through numerical modelling. I realized that studying the world does not always require a laboratory with expensive equipment; sometimes a laptop is enough. This was one of my main stimuli to study applied mathematics. Roughly speaking, the belief that everything can be modelled gave me the incentive to model everything.
What is numerical analysis? Many physical systems can be described by a system of partial differential equations. These equations are often non-linear, so you can't solve them on paper; you need a computer. This is where numerical methods come to the rescue. First we construct a model of the system as a set of equations, then we solve these equations numerically, using a computer. Numerical methods give an approximate solution, but with good, controllable accuracy. From the solution we can draw practical conclusions.
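The workflow described above (write down the equations, solve them numerically, check the accuracy) can be illustrated with a minimal sketch. It is an editorial illustration, not code from the interviewee's work, and it rests on assumptions: it uses the linear one-dimensional heat equation rather than a non-linear system, because that equation has an exact solution to compare against, and the names `solve_heat` and `max_error` are hypothetical.

```python
import math


def solve_heat(n_cells, t_end=0.1, r=0.25):
    """Explicit finite-difference solution of u_t = u_xx on [0, 1].

    Boundary values are u(0, t) = u(1, t) = 0 and the initial state is
    u(x, 0) = sin(pi * x). The ratio dt / dx**2 is kept at or below r,
    and r <= 0.5 keeps this explicit scheme stable.
    """
    dx = 1.0 / n_cells
    n_steps = math.ceil(t_end / (r * dx * dx))
    dt = t_end / n_steps
    lam = dt / (dx * dx)

    u = [math.sin(math.pi * i * dx) for i in range(n_cells + 1)]
    u[0] = u[-1] = 0.0  # enforce the boundary values exactly

    for _ in range(n_steps):
        new = u[:]
        for i in range(1, n_cells):
            new[i] = u[i] + lam * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u


def max_error(n_cells, t_end=0.1):
    """Largest difference between the numerical and the exact solution."""
    u = solve_heat(n_cells, t_end)
    dx = 1.0 / n_cells
    decay = math.exp(-math.pi ** 2 * t_end)
    return max(
        abs(u[i] - decay * math.sin(math.pi * i * dx))
        for i in range(n_cells + 1)
    )


if __name__ == "__main__":
    # Refining the grid should shrink the error: accuracy is controllable.
    for n in (10, 20, 40):
        print(n, max_error(n))
```

Running the script prints the error for three grids; each refinement should reduce it, which is the sense in which the accuracy of a numerical method is controllable.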
I became interested in the oil industry after completing my bachelor's degree. At that time the oil industry provided the ideal opportunity to satisfy both scientific and material interests: it was what machine learning is now. This determined the choice of the master's degree: petroleum engineering.
Perks of an American PhD
After my master's degree, I went to work for an oil company. At some point during my work I wanted to deepen my knowledge in numerical methods and in modelling the flow of liquids and gases. A second master's degree was not practical, so I chose PhD studies. I applied to several universities and ended up going to Texas A&M University, which is highly regarded in the oil industry.
In a way, doctoral studies are similar to a master's degree, only more difficult. In America it is generally possible to enter doctoral studies without a master's degree; a bachelor's is enough, but then the studies will last at least five years instead of three for those who already have a master's. There is compulsory coursework that has to be completed. The thesis defense process in America is almost the same as in Russia. The only difference, but a very significant one, is the lack of bureaucracy (as compared to Russia).
In America, in contrast to Russia in 2013, I was surprised by the number of computers in use and by the free access to the university campus: everything was open throughout the day, and one could come in at any time.
America is an interesting country, but not mine. Priority is given to drivers: it's very convenient to drive between the promenade areas and restaurants, stopping at places of interest. In a typical American city, you don't get as much pleasure from walking the streets as in Moscow or St. Petersburg (and in many other cities too), where you feel the city's life just by walking and being among people. In the US, you only see that in parks. Elsewhere there is a great risk of being the only pedestrian on the road, except perhaps for joggers.
The United Kingdom is much closer to my lifestyle: almost all cities have a rich history and lots of parks, and you can walk for hours. In a way, all European cities are similar (castle, town hall, cathedral, old houses), but that doesn't lessen the pleasure of exploring new cities. Still, I hadn't originally planned to stay abroad: I wanted to see the world and understand life in other countries from the inside, so I could return later.
How do you become a postdoc?
After finishing my PhD I started looking for a postdoc position in Europe, because I wanted to learn more about European culture. There were many attempts; it is not easy to find a postdoc position because of the high competition. I got in after some thirty attempts.
My first postdoc was at Heriot-Watt University in Edinburgh, Scotland, and the second was at the University of Manchester in England.
In Edinburgh, I liked the very idea of the project. There's the Kyoto Protocol, under which countries try to agree to emit less carbon dioxide into the atmosphere. The project I joined was based on a different principle: its authors decided to pump carbon dioxide from the atmosphere underground. I was struck by the fact that people did not wait for everyone to agree but took action. That interested me a lot and I decided to try to simulate those processes. That's how I came into the world of machine learning.
This is also where my co-author and I wrote my favourite article, on Bayesian experimental design. Bayesian experimental design is the problem of how to build a data collection system before an experiment is conducted. This problem usually boils down to Markov chains, so it is computationally complex. It took a lot of time and effort to work on this paper, as there were a number of technical difficulties. But at some point we understood how they could be bypassed. We came up with quite an original method, whose description we have published.
My second postdoc was purely about mathematical statistics and machine learning. There were two projects.
The first project was sponsored by Rolls-Royce. They wanted to develop a methodology for describing alloys based on photographs at the micron scale. The aim was to extract from the photo a set of quantitative features that would describe the material sufficiently well. There were two reasons for this.
The first was to reduce the role of subjective judgement when comparing two alloys. For example, several evaluators could give different evaluations regarding the suitability of an alloy for a certain part. The effect of subjective evaluation can be reduced by making the comparison based on objective quantitative characteristics.
The second was to reduce the memory cost of storing images of different alloys: instead of storing the image itself, a relatively small set of features can be stored.
The second project was a statistical analysis of the incidence of COVID-19.
It just so happened that the pandemic had started at that time, and some of the researchers switched to it. I also worked on this for a while, and we published a paper in which we used mathematical statistics to estimate what proportion of all the people who contracted COVID-19 appeared in the reported cases. We found that only 10% of the total number of cases were reported. Here we adapted a previously known statistical method to the new data.
The article on COVID-19 is noteworthy in that it solved the problem using classical, somewhat "boring" numerical methods. Machine learning is in vogue now, and everyone wants to train neural networks. There is probably not a single person in the fourth year at HSE University who does not want to do neural networks. And we managed without them.
Academia and industry
After two postdocs, I returned to Moscow and worked at Huawei for a year. There, I worked on the problem of high dynamic range. Phones and TVs are now being produced with more advanced displays, which can show a more detailed, vivid picture. The main advantage of such displays is the ability to display images with high contrast. On such a display it is easier to make the image look to the viewer like the real scene.
I worked on optimising algorithms so that they reproduce the sensation of brightness contrast between the darkest and brightest objects as accurately as possible while still maintaining a "natural" image.
There are advantages and disadvantages to working in industry and in academia. In industry, for example, it's much easier with equipment, but your choice of research topics is limited. In academia, you have the freedom to choose your topic. You can implement your ideas, publish your work with your name on it and see the result of your work. That is why I came to academia, to HSE University. And I want to share my experience.
In my research I am interested in solving practical problems from a scientific point of view, applying theory to practice.
In machine learning, obtaining data is a problem. Numerical methods, in turn, have good models and algorithms, but they are slow, so machine learning models are being applied to speed them up. Since I have experience in both areas, I plan to work at the intersection of the two. You can develop machine learning methods to solve partial differential equations, or you can accelerate existing solutions and apply them in practice.
For instance, there is a problem that Skoltech is solving now. They are conducting experiments with elementary particles. They do not set up all the experiments at once, but try to predict which cases will be interesting and set up experiments only in these cases. To do this, they solve the Gross-Pitaevskii equations. This is a complex task that can be accelerated using machine learning techniques.
In my spare time, I like to play sports: tennis, free diving, walking, moving around. This summer I went to the mountains for the first time. At work I work with my head, and afterwards I want it to rest.
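For reference, the Gross-Pitaevskii equation is usually written in the time-dependent textbook form below. This is an editorial note: the interview does not say which variant or which parameters are used in the Skoltech project, so the form and the symbols here are assumptions taken from the standard formulation.

```latex
i\hbar\,\frac{\partial \psi(\mathbf{r},t)}{\partial t}
  = \left( -\frac{\hbar^{2}}{2m}\nabla^{2} + V(\mathbf{r})
           + g\,\lvert\psi(\mathbf{r},t)\rvert^{2} \right)\psi(\mathbf{r},t)
```

Here \psi is the wave function, m the particle mass, V the external potential and g the interaction strength. The term g|\psi|^2 makes the equation non-linear, which is why it has to be solved numerically.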