# Seminar of the laboratory "Strongly invertible function and their use for fast data compression", Bruno Bauwens

On September 18, 2019 Bruno Bauwens will give a lecture "Strongly invertible function and their use for fast data compression".

Abstract:

We say that a probabilistic function F: X -> Y is (K,epsilon)-invertible if with probability 1-epsilon, the value F(x) identifies x inside any subset of K elements in X. More precisely, there exists a deterministic function g that takes a K-element subset of X and an element y as input, such that for all K-element sets S and x in S: Pr [g(S,F(x)) = x] > 1-epsilon This property is stronger than what is typically obtained from hashing, because the recovery of x does not require the knowledge of the randomly chosen hash function. We show that there exist polynomial time constructions of (2^k,epsilon)-invertible functions F: {0,1}^n -> {0,1}^m for which m-k < log^2 (n/epsilon). The functions are obtained from extractors constructed by Guruswamy, Umans, Vadhan, JACM, 2009. Application 1. We consider compression of a string x towards a target size m, (m is smaller than the length of x). In other words, we consider compression functions that have x and m as input and must generate an m-bit description of x. We show that we can speed up any such compressor to a polynomial time one, at the cost of slower decompression and a polylogarithmic increase in the compression length. Moreover, we obtain polynomial time compression down to the Kolmogorov complexity of the string (up to an additive polylog term), provided a good upper bound of this complexity is given. Unfortunately, the result is theoretical, since the decompression time increases by a factor 2^m. Application 2. In distributed compression, several sources need to compress their own strings in isolation. The compressed strings are sent to a decompressor, which needs to reconstruct all strings. If these strings contain correlated information, not all sources need to send all of the information. The minimal sizes of the messages are given by the Slepian-Wolf theorem for sequences generated by a stationary ergodic process. Our second result generalizes this theorem by removing any assumption on the generating mechanism. Moreover, the invertible functions above provide polynomial time compression to this optimal length.

We present simplified constructions with improved parameters compared to the papers:

- B.Bauwens and M.Zimand, Linear list-approximation for short programs (or the power of a few random bits), CCC 2014, http://arxiv.org/abs/1311.7278

- M.Zimand, Kolmogorov complexity version of Slepian-Wolf coding, STOC 2017, Montreal, June 19-23, 2017 http://arxiv.org/abs/1511.03602

- B.Bauwens, Optimal probabilistic polynomial time compression and the Slepian-Wolf theorem: tighter versions and simple proofs, 2018, https://arxiv.org/abs/1802.00750

Address:11 Pokrovsky Boulevard, building D

Language: English

Time: 18:10-19:30

Room: D109

If you need a pass to the building, please contact Dina Chernyshova: dchernyshova@hse.ru.