data science sur un plateau - github pages · a seulement 20 km au sud de paris, l’ensae...

2
Data science sur un plateau Nous avons le plaisir de vous inviter pour cette troisi` eme ´ edition du s´ eminaire ` a ENSAE ParisTech le lundi 22 janvier 2018 de 14:00 ` a 17:30 dans l’amphi 200. Lors de cette session nous pourrons ´ ecouter : Learning from MOM’s principles (Joint with M. Lerasle) Slides Guillaume Lecu´ e (ENSAE-CREST, CNRS) We obtain estimation error rates for estimators obtained by aggregation of regularized median- of-means tests, following a construction of Le Cam. The results hold with exponentially large probability – as in the Gaussian framework with independent noise–under only weak moments assumptions on data and without assuming independence between noise and design. Any norm may be used for regularization. When it has some sparsity inducing power we recover sparse rates of convergence. The procedure is robust since a large part of data may be corrupted, these outliers have nothing to do with the oracle we want to reconstruct. Our general risk bound is of order max minimax rate in the i.i.d. setup, number of outliers number of observations . In particular, the number of outliers may be as large as (number of data) ×(minimax rate) with- out affecting this rate. The other data do not have to be identically distributed but should only have equivalent L 1 and L 2 moments. For example, the minimax rate s log(ed/s)/N of recovery of a s-sparse vector in R d is achieved with exponentially large probability by a median-of-means version of the LASSO when the noise has q0 moments for some q0 > 2, the entries of the design ma- trix should have C0 log(ed) moments and the dataset can be corrupted up to C1s log(ed/s) outliers. Computational Optimal Transport for Imaging and Learning Gabriel Peyr´ e (CNRS and Ecole Normale Sup´ erieure) Optimal transport (OT) has become a fundamental mathematical tool at the interface between calculus of variations, partial differential equations and probability. It took however much more time for this notion to become mainstream in numerical applications. This situation is in large part due to the high computational cost of the underlying optimization problems. There is a recent wave of activity on the use of OT-related methods in fields as diverse as computer vision, computer graphics, statistical inference, machine learning and image processing. In this talk, I will review an emerging class of numerical approaches for the approximate resolution of OT- based optimization problems. This offers a new perspective for the application of OT in imaging sciences (to perform color transfer or shape and texture morphing) and machine learning (to perform clustering, classification and generative models in deep learning). More information and references can be found on the website of our book “Computational Optimal Transport” https://optimaltransport.github.io/ Structures ´ el´ ementaires de la G´ eom´ etrie de l’Information : fonc- tion caract´ eristique de Koszul-Souriau, m´ etrique de Fisher-Souriau et densit´ e` a maximum d’entropie d’ordre sup´ erieur Slides Fr´ ed´ eric Barbaresco (Representative of Key Technology Domain PCC (Pro- cessing, Control & Cognition) Board, T L & A S) 1

Upload: others

Post on 11-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data science sur un plateau - GitHub Pages · A seulement 20 km au sud de Paris, l’ENSAE ParisTech se situe au cœur du campus de l’X, accessible depuis le Boulevard des Marechaux,

Data science sur un plateau

Nous avons le plaisir de vous inviter pour cette troisieme edition du seminaire a ENSAEParisTech le lundi 22 janvier 2018 de 14:00 a 17:30 dans l’amphi 200. Lors de cettesession nous pourrons ecouter :

Learning from MOM’s principles (Joint with M. Lerasle) SlidesGuillaume Lecue (ENSAE-CREST, CNRS)We obtain estimation error rates for estimators obtained by aggregation of regularized median-of-means tests, following a construction of Le Cam. The results hold with exponentially largeprobability – as in the Gaussian framework with independent noise–under only weak momentsassumptions on data and without assuming independence between noise and design. Any normmay be used for regularization. When it has some sparsity inducing power we recover sparse ratesof convergence. The procedure is robust since a large part of data may be corrupted, these outliershave nothing to do with the oracle we want to reconstruct. Our general risk bound is of order

max

(minimax rate in the i.i.d. setup, number of outliers

number of observations

).

In particular, the number of outliers may be as large as (number of data) ×(minimax rate) with-out affecting this rate. The other data do not have to be identically distributed but should onlyhave equivalent L1 and L2 moments. For example, the minimax rate s log(ed/s)/N of recoveryof a s-sparse vector in Rd is achieved with exponentially large probability by a median-of-meansversion of the LASSO when the noise has q0 moments for some q0 > 2, the entries of the design ma-trix should have C0 log(ed) moments and the dataset can be corrupted up to C1s log(ed/s) outliers.

Computational Optimal Transport for Imaging and LearningGabriel Peyre (CNRS and Ecole Normale Superieure)Optimal transport (OT) has become a fundamental mathematical tool at the interface betweencalculus of variations, partial differential equations and probability. It took however much moretime for this notion to become mainstream in numerical applications. This situation is in largepart due to the high computational cost of the underlying optimization problems. There is arecent wave of activity on the use of OT-related methods in fields as diverse as computer vision,computer graphics, statistical inference, machine learning and image processing. In this talk,I will review an emerging class of numerical approaches for the approximate resolution of OT-based optimization problems. This offers a new perspective for the application of OT in imagingsciences (to perform color transfer or shape and texture morphing) and machine learning (toperform clustering, classification and generative models in deep learning).More information and references can be found on the website of our book “Computational OptimalTransport” https://optimaltransport.github.io/

Structures elementaires de la Geometrie de l’Information : fonc-tion caracteristique de Koszul-Souriau, metrique de Fisher-Souriauet densite a maximum d’entropie d’ordre superieur SlidesFrederic Barbaresco (Representative of Key Technology Domain PCC (Pro-cessing, Control & Cognition) Board, Thales Land & Air Systems)

1

Page 2: Data science sur un plateau - GitHub Pages · A seulement 20 km au sud de Paris, l’ENSAE ParisTech se situe au cœur du campus de l’X, accessible depuis le Boulevard des Marechaux,

ACCES ET TRANSPORTS

Adresse et plan d’accesA seulement 20 km au sud de Paris, l’ENSAE ParisTech se situe au cœur du campus del’X, accessible depuis le Boulevard des Marechaux, qui en fait le tour.

ENSAE ParisTech

5 avenue Henry Le Chatelier

TSA 26644

91764 Palaiseau Cedex

Coordonnees GPS : 48◦42’40.5"N 2◦12’27.3"E

http://www.ensae.fr/ensae/images/ENSAEpic/acces/Plan-Campus-Scientifique.png

2