SEMINARS IN STATISTICS @ COLLEGIO CARLO ALBERTO https://www.carloalberto.org/events/category/seminars/seminars-in-statistics/?tribe-bar-date=2019-09-01
On Friday, 17 December 2021, at 12:00, the following seminar will be held at the Collegio Carlo Alberto, Piazza Arbarello 8, Torino:
------------------------------------------------
Speaker: *Lorenzo Masoero* (Amazon)
Title: *Improved prediction and optimal sequencing strategies for genomic variant discovery via Bayesian nonparametrics*
Abstract: Despite the advent of Big Data, data-gathering in many domains can still be an expensive process that necessitates careful planning when operating under a fixed, limited budget. For instance, sequencing new genomic data is a complex procedure that requires careful tuning: researchers can spend resources to sequence a greater number of genomes (quantity), or spend resources to sequence genomes with increased accuracy (quality). In this talk, I consider the common setting in which scientists have already conducted a pilot study to reveal variants in a genome and are contemplating a follow-up study. Spending additional resources has the potential to reveal new variations in the genome, and thereby new genetic insights. Therefore, practitioners are interested in (i) predicting how many new discoveries they will make under different experimental design choices. In turn, they can leverage these predictions to optimally allocate available resources in the design of a future experiment, e.g. (ii) to maximize the number of future discoveries or (iii) to optimize the usefulness of a future experiment for the task at hand, e.g. the power of an associated statistical test. I discuss novel methodologies to solve the problems mentioned above. Our approach relies on a Bayesian nonparametric formulation that facilitates (i) prediction for the number of new variants in the follow-up study based on the pilot study. We show empirically that, when experimental conditions are kept constant between the pilot and follow-up, our method's prediction is competitive with the best existing methods. Unlike current methods, though, our new method allows practitioners to change experimental conditions between the pilot and the follow-up. We demonstrate how this distinction allows our method to be used for more realistic predictions and for optimal allocation of a fixed budget between quality and quantity.
In particular, we first show how, under a fixed budget, our predictions can be used to maximize (ii) the number of new genomic variants discovered in a follow-up study. Last, we show how our framework can guide practitioners in other experimental design problems, and specifically how to achieve (iii) the highest possible power in statistical tests in the context of rare variants association studies.
------------------------------------------------
In compliance with anti-Covid regulations, in-person attendance requires booking through the following online form: https://forms.gle/jVPLoDCQ1ANSB5dg8
The seminar can also be followed via streaming: Join Zoom Meeting https://us02web.zoom.us/j/88971670320?pwd=VXFseHNlYm5EKzFmZ25oTGdlRWw4QT09 Meeting ID: 889 7167 0320 Passcode: 362208
The seminar is organized by the "de Castro" Statistics Initiative
www.carloalberto.org/stats
Best regards, Pierpaolo De Blasi --- University of Torino & Collegio Carlo Alberto
carloalberto.org/pdeblasi https://sites.google.com/a/carloalberto.org/pdeblasi/