Dear all,
On Wednesday, February 5th, at 14h00 in Aula Dal Passo of the Tor Vergata Math Department, RoMaDS (https://www.mat.uniroma2.it/~rds/events.php) will host
Stefano Favaro (Università di Torino) with the seminar
“A smoothed-Bayesian approach to frequency recovery from sketched data”
Abstract: We introduce a novel statistical perspective on a classical problem at the intersection of computer science and information theory: recovering the empirical frequency of a symbol in a large discrete dataset using only a compressed representation, or sketch, obtained via random hashing. Departing from traditional algorithmic approaches, recent works have proposed Bayesian nonparametric (BNP) methods that can provide more informative frequency estimates by leveraging modeling assumptions on the distribution of the sketched data. In this paper, we propose a smoothed-Bayesian method, inspired by existing BNP approaches but designed in a frequentist framework to overcome the computational limitations of the BNP approaches when dealing with large-scale data from realistic distributions, including those with power-law tail behaviors. For sketches obtained with a single hash function, our approach is supported by frequentist guarantees, including unbiasedness and optimality under a squared error loss function within a class of linear estimators. For sketches with multiple hash functions, we introduce an approach based on multi-view learning to construct computationally efficient frequency estimators. We validate our method on synthetic and real data, comparing its performance to that of existing alternatives.
Joint work with Mario Beraha (Politecnico di Milano) and Matteo Sesia (University of Southern California).
We encourage in-person participation. Should you be unable to come, here is the link to the Teams streaming:
https://teams.microsoft.com/l/meetup-join/19%3arfsL73KX-fw86y1YnXq2nk5VnZFwPU-iIPEmqet8NCg1%40thread.tacv2/1738141133192?context={%22Tid%22%3a%2224c5be2a-d764-40c5-9975-82d08ae47d0e%22%2c%22Oid%22%3a%22650fc4a8-4cec-4bd2-87bc-90d134074fe6%22}
The seminar is part of the Excellence Project MatMod@TOV.