Publications
Modern experiments in high-energy physics require an increasing amount of simulated data. Monte Carlo simulation of calorimeter responses is by far the most computationally expensive part of such simulations. Recent works have shown that the application of generative neural networks to this task can significantly speed up the simulations while maintaining an appropriate degree of accuracy. This paper explores different approaches to designing and training generative neural networks for simulation of the electromagnetic calorimeter response in the LHCb experiment.
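A minimal sketch of the conditional GAN setup such an approach typically relies on is shown below. The layer sizes, cell count, and training details are illustrative assumptions, not the models used in the paper.

```python
# Minimal sketch of a conditional GAN for calorimeter response simulation.
# All sizes are hypothetical: LATENT noise dims, COND particle parameters,
# CELLS calorimeter cells.
import torch
import torch.nn as nn

LATENT, COND, CELLS = 64, 4, 30 * 30

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT + COND, 256), nn.ReLU(),
            nn.Linear(256, CELLS), nn.ReLU(),  # cell energies are non-negative
        )
    def forward(self, z, cond):
        return self.net(torch.cat([z, cond], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(CELLS + COND, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),  # real/fake logit
        )
    def forward(self, x, cond):
        return self.net(torch.cat([x, cond], dim=1))

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_showers, cond):
    """One adversarial update: D learns real vs generated, then G fools D."""
    n = cond.size(0)
    fake = G(torch.randn(n, LATENT), cond)
    # Discriminator step
    opt_d.zero_grad()
    loss_d = bce(D(real_showers, cond), torch.ones(n, 1)) + \
             bce(D(fake.detach(), cond), torch.zeros(n, 1))
    loss_d.backward(); opt_d.step()
    # Generator step
    opt_g.zero_grad()
    loss_g = bce(D(fake, cond), torch.ones(n, 1))
    loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```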
In the present work, we introduce a machine learning-based approach to galaxy clustering. Clusters must be determined in order to estimate the masses of galaxy groups. Knowledge of the mass distribution is crucial for dark matter research and for the study of the large-scale structure of the Universe. State-of-the-art telescopes accumulate data over various spectroscopic ranges, which highlights the need for algorithms with strong generalization properties. The data we deal with combine more than twenty different catalogues, and clustering must be provided for all combined galaxies. We produce a regression on the redshifts with a coefficient of determination R² of 0.99992 on the validation dataset, with a training dataset of 3,154,894 galaxies (0.0016 < z < 7.0519).
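For reference, the quality metric quoted above, the coefficient of determination R², can be computed as in the following sketch. The model choice and synthetic features are placeholders, not the paper's catalogue data or pipeline.

```python
# Sketch: redshift regression quality via the coefficient of determination R^2.
# Features and targets here are synthetic stand-ins for photometric data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                      # stand-in features
z = np.abs(X[:, 0] + 0.1 * rng.normal(size=10_000))    # stand-in redshifts

X_tr, X_va, z_tr, z_va = train_test_split(X, z, test_size=0.2, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, z_tr)
print("validation R^2:", r2_score(z_va, model.predict(X_va)))
```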
Modern large-scale data farms consist of hundreds of thousands of storage devices that span a distributed infrastructure. Devices used in modern data centers (such as controllers, links, SSDs and HDDs) can fail due to hardware as well as software problems. Such failures or anomalies can be detected by monitoring the activity of components using machine learning techniques. To use these techniques, researchers need a large amount of historical data on devices in normal and failure modes for training the algorithms. In this work, we address two problems: 1) the lack of storage data for such methods, by creating a simulator, and 2) the application of existing online algorithms that can detect a failure in one of the components more quickly.
We created a Go-based (golang) package for simulating the behavior of modern storage infrastructure. The software is based on the discrete-event modeling paradigm and captures the structure and dynamics of high-level storage system building blocks. The package's flexible structure allows us to create a model of a real-world storage system with a configurable number of components. The primary area of interest is exploring the storage machine's behavior under stress testing or exploitation in the medium or long term, in order to observe failures of its components.
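The package itself is written in Go, but the discrete-event paradigm it builds on is easy to illustrate. Below is a minimal, language-agnostic sketch in Python: a priority queue of timestamped events, with invented component-failure events drawn from an assumed exponential lifetime distribution.

```python
# Sketch of the discrete-event paradigm: a min-heap of timestamped events,
# each handler possibly scheduling future events. Components are illustrative.
import heapq, random

events = []  # min-heap of (time, seq, handler, args)
_seq = 0

def schedule(t, handler, *args):
    global _seq
    heapq.heappush(events, (t, _seq, handler, args))
    _seq += 1  # tie-breaker so handlers are never compared

def disk_failure(t, disk_id):
    print(f"t={t:9.1f}h  disk {disk_id} failed")

def run(until):
    while events and events[0][0] <= until:
        t, _, handler, args = heapq.heappop(events)
        handler(t, *args)

# Draw failure times for a small pool of disks (exponential lifetimes, assumed).
for disk in range(5):
    schedule(random.expovariate(1 / 20_000), disk_failure, disk)
run(until=50_000)
```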
To discover failures in the time series distribution generated by the simulator, we modified a change-point detection algorithm that works in online mode. The goal of change-point detection is to discover differences in the distribution of a time series. This work describes an approach to failure detection in time series data based on direct density-ratio estimation via binary classifiers.
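The idea of direct density-ratio estimation via a binary classifier can be sketched as follows: label samples from a reference window 0 and from a test window 1, fit a probabilistic classifier, and convert its output into a density-ratio score. The window sizes, classifier, and score below are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch: change detection via direct density-ratio estimation with a
# binary classifier distinguishing reference-window from test-window samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

def change_score(reference, test):
    """KL-style score: large when the two windows differ in distribution."""
    X = np.vstack([reference, test])
    y = np.r_[np.zeros(len(reference)), np.ones(len(test))]
    clf = LogisticRegression().fit(X, y)
    p = np.clip(clf.predict_proba(test)[:, 1], 1e-6, 1 - 1e-6)
    ratio = p / (1 - p)            # ~ p_test(x) / p_ref(x) for balanced windows
    return np.mean(np.log(ratio))  # ~ KL(p_test || p_ref)

rng = np.random.default_rng(1)
before = rng.normal(0.0, 1.0, size=(200, 3))   # normal operation
after = rng.normal(1.5, 1.0, size=(200, 3))    # shifted: simulated failure
print("no change:", change_score(before, rng.normal(0, 1, (200, 3))))
print("change   :", change_score(before, after))
```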
The problem of community detection in a network with features at its nodes takes into account both the graph structure and the node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. We apply the so-called data recovery approach to the problem by combining the least-squares recovery criteria for both the graph structure and the node features. In this way, we obtain a new clustering criterion and a corresponding algorithm for finding clusters one by one, so that the process can indeed be interpreted as community detection. We show that our proposed method is effective on real-world data, as well as on synthetic data involving either only quantitative features, only categorical attributes, or both. In the cases in which attributes are categorical, state-of-the-art algorithms are available; our algorithm appears competitive against them.
The problem of community detection in a network with features at its nodes takes into account both the graph structure and the node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. Algorithms based on probabilistic community models require the node features to be categorical. We use a data-driven model combining the least-squares data recovery criteria for both the graph structure and the node features, which allows us to take into account both quantitative and categorical features. After deriving an equivalent complementary criterion to optimize, we apply a greedy algorithm for detecting communities in sequence. We experimentally show that our proposed method is effective on both real-world and synthetic data. In the cases in which attributes are categorical, we compare our approach with state-of-the-art algorithms; our algorithm appears competitive against them.
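Under assumed notation (adjacency matrix A = (a_ij), node-feature matrix Y = (y_iv), community membership vectors z_k with intensities λ_k and feature centers c_k, and a trade-off weight ρ), the combined least-squares data recovery criterion can be sketched as:

\[
\min_{\{z_k,\,\lambda_k,\,c_k\}} \;\sum_{i,j}\Bigl(a_{ij}-\sum_{k}\lambda_k\, z_{ik}\, z_{jk}\Bigr)^{2} \;+\; \rho\,\sum_{i,v}\Bigl(y_{iv}-\sum_{k} c_{kv}\, z_{ik}\Bigr)^{2}
\]

The first term recovers the graph structure from the memberships, the second recovers the feature table; ρ balances their contributions.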
The results of an amplitude analysis of the charmless three-body decay B+→π+π+π-, in which CP-violation effects are taken into account, are reported. The analysis is based on a data sample corresponding to an integrated luminosity of 3 fb-1 of pp collisions recorded with the LHCb detector. The most challenging aspect of the analysis is the description of the behavior of the π+π- S-wave contribution, which is achieved by using three complementary approaches based on the isobar model, the K-matrix formalism, and a quasi-model-independent procedure. Additional resonant contributions for all three methods are described using a common isobar model, and include the ρ(770)0, ω(782) and ρ(1450)0 resonances in the π+π- P-wave, the f2(1270) resonance in the π+π- D-wave, and the ρ3(1690)0 resonance in the π+π- F-wave. Significant CP-violation effects are observed in both S- and D-waves, as well as in the interference between the S- and P-waves. The results from all three approaches agree and provide new insight into the dynamics and the origin of CP-violation effects in B+→π+π+π- decays.
Recently, some specific classes of non-smooth and non-Lipschitz convex optimization problems were considered by Yu. Nesterov and H. Lu. We consider convex programming problems with similar smoothness conditions on the objective function and the functional constraints. We introduce a new concept of an inexact model and propose analogues of switching subgradient schemes for convex programming problems with a relatively Lipschitz-continuous objective function and functional constraints. A class of online convex optimization problems is also considered. The proposed methods are optimal in the class of optimization problems with relatively Lipschitz-continuous objectives and functional constraints.
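A plain Euclidean sketch of a switching subgradient scheme for min f(x) subject to g(x) ≤ 0 is given below: it takes a constraint step when g is violated beyond ε and an objective step otherwise. The toy problem and step rules are illustrative and do not capture the paper's relative Lipschitz continuity or inexact-model machinery.

```python
# Sketch of a Polyak-style switching subgradient scheme for
# min f(x) s.t. g(x) <= 0, with subgradient oracles f_sub and g_sub.
import numpy as np

def switching_subgradient(f, f_sub, g, g_sub, x0, eps=1e-3, steps=20000):
    x = x0.copy()
    best, best_val = x0.copy(), np.inf
    for _ in range(steps):
        if g(x) > eps:                        # non-productive: restore feasibility
            d = g_sub(x)
            x = x - (g(x) / np.dot(d, d)) * d
        else:                                 # productive: decrease the objective
            if f(x) < best_val:
                best, best_val = x.copy(), f(x)
            d = f_sub(x)
            x = x - (eps / np.dot(d, d)) * d
    return best

# Toy check: project c onto the unit ball, i.e. min ||x-c||^2 s.t. ||x||^2 <= 1.
c = np.array([2.0, 1.0])
x = switching_subgradient(
    f=lambda x: np.dot(x - c, x - c),
    f_sub=lambda x: 2 * (x - c),
    g=lambda x: np.dot(x, x) - 1.0,
    g_sub=lambda x: 2 * x,
    x0=np.zeros(2),
)
print(x, c / np.linalg.norm(c))  # the two should roughly agree
```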
The problem of community detection in a network with features at its nodes takes into account both the graph structure and the node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. We apply the so-called data recovery approach to the problem by combining the least-squares recovery criteria for both the graph structure and the node features. In this way, we obtain a new clustering criterion and a corresponding algorithm for finding clusters/communities one by one. We show that our proposed method is effective on real-world data, as well as on synthetic data involving either only quantitative features, only categorical attributes, or both. Our algorithm appears competitive against state-of-the-art algorithms.
We propose a way to simulate Cherenkov detector response using a generative adversarial neural network to bypass low-level details. This network is trained to reproduce high-level features of the simulated detector events based on input observables of incident particles, which allows a dramatic increase in simulation speed. We demonstrate that this approach provides simulation precision consistent with the baseline and discuss possible implications of these results.
The problem of community detection in a network with features at its nodes takes into account both the graph structure and the node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. Existing approaches require the number of communities to be pre-specified. We apply the so-called data recovery approach to allow a relaxation of the criterion for finding communities one by one. We show that our proposed method is effective on real-world data, as well as on synthetic data involving either only quantitative features, only categorical attributes, or both. In the cases in which attributes are categorical, state-of-the-art algorithms are available; our algorithm appears competitive against them.
In this work, we propose an approach to electromagnetic shower generation at the track level. Currently, Monte Carlo simulation occupies 50-70% of the total computing resources used by physics experiments worldwide; speeding up the simulation step therefore reduces simulation cost and accelerates synthetic experiments. In this paper, we suggest dividing the problem of shower generation into two separate issues: graph generation and track feature generation. Both problems can be efficiently solved with a cascade of a deep autoregressive generative network and a graph convolutional network. The novelty of the proposed approach lies in applying neural networks to the generation of a complex recursive physical process.
It has become a de facto standard to represent words as elements of a vector space (word2vec, GloVe). While this approach is convenient, it is unnatural for language: words form a graph with a latent hierarchical structure, and this structure has to be revealed and encoded by word embeddings. We introduce GraphGlove: unsupervised graph word representations which are learned end-to-end. In our setting, each word is a node in a weighted graph, and the distance between words is the shortest-path distance between the corresponding nodes. We adopt a recent method for learning a representation of data in the form of a differentiable weighted graph and use it to modify the GloVe training algorithm. We show that our graph-based representations substantially outperform vector-based methods on word similarity and analogy tasks. Our analysis reveals that the structure of the learned graphs is hierarchical and similar to that of WordNet, and that the geometry is highly non-trivial, containing subgraphs with different local topology.
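To make the distance concrete: each word is a graph node, and word distance is a shortest-path length, as in the toy sketch below. The vocabulary and edge weights are invented, and in GraphGlove the graph is learned end-to-end rather than fixed by hand.

```python
# Sketch of the distance underlying graph word representations: words are
# nodes in a weighted graph; word distance is shortest-path distance.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("animal", "dog", 1.0),
    ("animal", "cat", 1.0),
    ("dog", "puppy", 0.5),
    ("cat", "kitten", 0.5),
])

def word_distance(w1, w2):
    return nx.shortest_path_length(G, w1, w2, weight="weight")

print(word_distance("puppy", "kitten"))  # 3.0: puppy-dog-animal-cat-kitten
```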
The increasing luminosities of future Large Hadron Collider runs and the next generation of collider experiments will require an unprecedented amount of simulated events to be produced. Such large-scale productions are extremely demanding in terms of computing resources. Thus, new approaches to event generation and simulation of detector responses are needed. In LHCb, the accurate simulation of Cherenkov detectors takes a sizeable fraction of CPU time. An alternative approach is described here, in which high-level reconstructed observables are generated using a generative neural network to bypass low-level details. This network is trained to reproduce the particle species likelihood function values based on the track kinematic parameters and detector occupancy. The fast simulation is trained using real data samples collected by LHCb during Run 2. We demonstrate that this approach provides high-fidelity results.
The Ξc0 baryon is unstable and usually decays into charmless final states by the c→sud̄ transition. It can, however, also disintegrate into a π- meson and a Λc+ baryon via s quark decay or via cs→dc weak scattering. The interplay between the latter two processes governs the size of the branching fraction B(Ξc0→π-Λc+), first measured here to be (0.55±0.02±0.18)%, where the first uncertainty is statistical and the second systematic. This result is compatible with the larger of the theoretical predictions that connect models of hyperon decays using partially conserved axial currents and SU(3) symmetry with those involving the heavy-quark expansion and heavy-quark symmetry. In addition, the branching fraction of the normalization channel, B(Ξc+→pK-π+)=(1.135±0.002±0.387)%, is measured.
We report four narrow peaks in the Ξb0K- mass spectrum obtained using pp collisions at center-of-mass energies of 7, 8, and 13 TeV, corresponding to a total integrated luminosity of 9 fb-1 recorded by the LHCb experiment. Referring to these states by their mass, the mass values are m[Ωb(6316)-]=6315.64±0.31±0.07±0.50 MeV, m[Ωb(6330)-]=6330.30±0.28±0.07±0.50 MeV, m[Ωb(6340)-]=6339.71±0.26±0.05±0.50 MeV, m[Ωb(6350)-]=6349.88±0.35±0.05±0.50 MeV, where the uncertainties are statistical, systematic, and the last is due to the knowledge of the Ξb0 mass. The natural widths of the three lower mass states are consistent with zero, and the 90% confidence-level upper limits are determined to be Γ[Ωb(6316)-]<2.8 MeV, Γ[Ωb(6330)-]<3.1 MeV and Γ[Ωb(6340)-]<1.5 MeV. The natural width of the Ωb(6350)- peak is 1.4-0.8+1.0±0.1 MeV, which is 2.5σ from zero and corresponds to an upper limit of 2.8 MeV. The peaks have local significances ranging from 3.6σ to 7.2σ. After accounting for the look-elsewhere effect, the significances of the Ωb(6316)- and Ωb(6330)- peaks are reduced to 2.1σ and 2.6σ, respectively, while the two higher mass peaks exceed 5σ. The observed peaks are consistent with expectations for excited Ωb- resonances.
The first observation of the decay B0→D0D̄0K+π− is reported using proton-proton collision data corresponding to an integrated luminosity of 4.7 fb−1 collected by the LHCb experiment in 2011, 2012 and 2016. The measurement is performed in the full kinematically allowed range of the decay outside the D*− region. The ratio of the branching fraction relative to that of the control channel B0→D*−D0K+ is measured to be R=(14.2±1.1±1.0)%, where the first uncertainty is statistical and the second is systematic. The absolute branching fraction of B0→D0D̄0K+π− decays is thus determined to be B(B0→D0D̄0K+π−)=(3.50±0.27±0.26±0.30)×10−4, where the third uncertainty is due to the branching fraction of the control channel. This decay mode is expected to provide insights into spectroscopy and the charm-loop contributions in rare semileptonic decays.
Ratios of isospin amplitudes in hadron decays are a useful probe of the interplay between weak and strong interactions and allow searches for physics beyond the standard model. We present the first results on isospin amplitudes in b-baryon decays, using data corresponding to an integrated luminosity of 8.5 fb-1, collected with the LHCb detector in pp collisions at center-of-mass energies of 7, 8, and 13 TeV. The isospin amplitude ratio |A1(Λb0→J/ψΣ0)/A0(Λb0→J/ψΛ)|, where the subscript on A indicates the final-state isospin, is measured to be less than 1/21.8 at 95% confidence level. The Cabibbo-suppressed Ξb0→J/ψΛ decay is observed for the first time, allowing for the measurement |A0(Ξb0→J/ψΛ)/A1/2(Ξb0→J/ψΞ0)|=0.37±0.06±0.02, where the uncertainties are statistical and systematic, respectively.
We study measurable dependence of measures on a parameter in the following two classical problems: constructing conditional measures and the Kantorovich optimal transportation. For parametric families of measures and mappings we prove the existence of conditional measures measurably depending on the parameter. Particular emphasis is placed on Borel measurability (which cannot always be achieved). Our second main result gives sufficient conditions for the Borel measurability of optimal transports and transportation costs with respect to a parameter in the case where marginal measures and cost functions depend on a parameter. As a corollary we obtain the Borel measurability with respect to the parameter for disintegrations of optimal plans. Finally, we show that the Skorohod parametrization of measures by mappings can also be made measurable with respect to a parameter.
The cross-sections of ψ(2S) meson production in proton-proton collisions at √s = 13 TeV are measured with a data sample collected by the LHCb detector corresponding to an integrated luminosity of 275 pb−1. The production cross-sections for prompt ψ(2S) mesons and those for ψ(2S) mesons from b-hadron decays (ψ(2S)-from-b) are determined as functions of the transverse momentum, pT, and the rapidity, y, of the ψ(2S) meson in the kinematic range 2 < pT < 20 GeV/c and 2.0 < y < 4.5. The production cross-sections integrated over this kinematic region are

σ(prompt ψ(2S), 13 TeV) = 1.430 ± 0.005 (stat) ± 0.099 (syst) μb,
σ(ψ(2S)-from-b, 13 TeV) = 0.426 ± 0.002 (stat) ± 0.030 (syst) μb.

A new measurement of ψ(2S) production cross-sections in pp collisions at √s = 7 TeV is also performed using data collected in 2011, corresponding to an integrated luminosity of 614 pb−1. The integrated production cross-sections in the kinematic range 3.5 < pT < 14 GeV/c and 2.0 < y < 4.5 are

σ(prompt ψ(2S), 7 TeV) = 0.471 ± 0.001 (stat) ± 0.025 (syst) μb,
σ(ψ(2S)-from-b, 7 TeV) = 0.126 ± 0.001 (stat) ± 0.008 (syst) μb.

All results show reasonable agreement with theoretical calculations.
An angular analysis of the B0→K∗0(→K+π−)μ+μ− decay is presented using a data set corresponding to an integrated luminosity of 4.7 fb−1 of pp collision data collected with the LHCb experiment. The full set of CP-averaged observables is determined in bins of the invariant mass squared of the dimuon system. Contamination from decays with the K+π− system in an S-wave configuration is taken into account. The tension seen between the previous LHCb results and the Standard Model predictions persists with the new data. The precise value of the significance of this tension depends on the choice of theory nuisance parameters.