
Bayesian Quadrature: Model-based Approximate Integration

David Duvenaud, University of Cambridge

December 8, 2012

The Quadrature Problem

- We want to estimate an integral

  $$Z = \int f(x)\, p(x)\, dx$$

- Most computational problems in inference correspond to integrals:
  - Expectations
  - Marginal distributions
  - Integrating out nuisance parameters
  - Normalization constants
  - Model comparison

[Figure: an example function f(x) and input density p(x).]
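As a concrete instance of the last two items (not spelled out on the slide), the normalization constant used for model comparison, the model evidence, has exactly this form, with the likelihood playing the role of f and the prior playing the role of p:

$$Z = p(\mathcal{D}) = \int p(\mathcal{D} \mid \theta)\, p(\theta)\, d\theta, \qquad f(\theta) = p(\mathcal{D} \mid \theta).$$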

Sampling Methods

- Monte Carlo methods: sample from p(x) and take the empirical mean:

  $$\hat{Z} = \frac{1}{N} \sum_{i=1}^{N} f(x_i)$$

- Possibly sub-optimal for two reasons:
  - Random bunching up of samples
  - Often, nearby function values will be similar
- Model-based and quasi-Monte Carlo methods spread out samples to achieve faster convergence.

[Figure: f(x), the input density p(x), and sampled locations.]
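To make the estimator concrete, here is a minimal plain Monte Carlo sketch. The integrand f and the standard-normal choice of p(x) are illustrative assumptions, not taken from the talk; this particular f has a known expectation to compare against.

```python
# Minimal plain Monte Carlo sketch: Z_hat = (1/N) * sum_i f(x_i), x_i ~ p(x).
# The integrand and density below are illustrative choices, not from the talk.
import numpy as np

def f(x):
    return np.sin(x) ** 2          # E[sin^2(X)] = (1 - e^{-2}) / 2 for X ~ N(0, 1)

rng = np.random.default_rng(0)
N = 10_000
xs = rng.standard_normal(N)        # samples from p(x) = N(0, 1)
Z_hat = f(xs).mean()               # empirical-mean estimator

Z_true = 0.5 * (1.0 - np.exp(-2.0))
print(f"Z_hat = {Z_hat:.4f}  (true Z = {Z_true:.4f})")
```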

Model-based Integration

- Place a prior on f, for example a GP.
- The posterior over f implies a posterior over Z.
- We'll call the approach that uses a GP prior Bayesian Quadrature.

[Figure: f(x) and p(x) with the GP mean, GP mean ± 2 SD, the sampled evaluations, and the induced distribution p(Z).]
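The step from a posterior over f to a posterior over Z is standard GP algebra, though the slide leaves it implicit: because integration against p(x) is a linear operation, the Gaussian-process posterior on f pushes forward to a Gaussian posterior on Z, whose mean and variance come from integrating the GP posterior mean m(x) and covariance c(x, x'):

$$Z \mid f(\mathbf{x}_s) \;\sim\; \mathcal{N}\!\left( \int m(x)\, p(x)\, dx,\;\; \iint c(x, x')\, p(x)\, p(x')\, dx\, dx' \right)$$

Substituting the usual GP posterior expressions for m and c gives the estimator on the next slide.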

Bayesian Quadrature Estimator

- The posterior over Z has a mean that is linear in the observed function values f(x_s):

  $$\mathbb{E}_{\mathrm{GP}}[Z \mid f(\mathbf{x}_s)] = \mathbf{z}^{\top} K^{-1} f(\mathbf{x}_s), \qquad z_n = \int k(x, x_n)\, p(x)\, dx$$

- It is natural to choose samples to minimize the posterior variance of Z:

  $$\mathbb{V}[Z \mid f(\mathbf{x}_s)] = \iint k(x, x')\, p(x)\, p(x')\, dx\, dx' \;-\; \mathbf{z}^{\top} K^{-1} \mathbf{z}$$

- The variance doesn't depend on the function values at all!
- Choosing samples sequentially to minimize this variance is Sequential Bayesian Quadrature (SBQ).
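A minimal sketch of the estimator, assuming a squared-exponential kernel and a standard-normal p(x), a pairing for which z_n has a closed form. The integrand, lengthscale, and evaluation locations are illustrative and not taken from the talk; in particular the points are fixed here rather than chosen sequentially.

```python
# Bayesian Quadrature sketch with an SE kernel and p(x) = N(0, sp2),
# for which z_n = sf2 * sqrt(ell^2/(ell^2+sp2)) * exp(-x_n^2/(2*(ell^2+sp2))).
import numpy as np

ell, sf2, sp2 = 0.7, 1.0, 1.0            # lengthscale, signal variance, var of p(x)

def k(a, b):
    return sf2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def z(xs):
    # Closed-form kernel integrals z_n = \int k(x, x_n) p(x) dx.
    return sf2 * np.sqrt(ell**2 / (ell**2 + sp2)) * np.exp(-0.5 * xs**2 / (ell**2 + sp2))

def f(x):
    return np.sin(x) ** 2                # illustrative integrand, as before

xs = np.linspace(-3.0, 3.0, 8)           # evaluation locations (fixed, not SBQ-chosen)
K = k(xs, xs) + 1e-9 * np.eye(len(xs))   # jitter for numerical stability
w = np.linalg.solve(K, z(xs))            # BQ weights K^{-1} z

Z_mean = w @ f(xs)                                   # z^T K^{-1} f(x_s)
prior_var = sf2 * np.sqrt(ell**2 / (ell**2 + 2 * sp2))   # \iint k p p dx dx'
Z_var = prior_var - z(xs) @ w                        # minus z^T K^{-1} z
print(f"Z ~ {Z_mean:.4f} +/- {np.sqrt(max(Z_var, 0.0)):.4f}")
```

SBQ would instead pick each new location to minimize the variance expression above before evaluating f there.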

Things you can do with Bayesian Quadrature

- Can incorporate knowledge of the function, such as symmetries:

  $$f(x, y) = f(y, x) \;\Leftrightarrow\; k_s(x, y, x', y') = k(x, y, x', y') + k(x, y', x', y) + k(x', y, x, y') + k(x', y', x, y)$$

- Can condition on gradient observations.
- The posterior variance is a natural convergence diagnostic.

[Figure: the estimate of Z with posterior error bars as a function of the number of samples (100 to 200).]
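A small sketch of one standard way to build a kernel encoding the symmetry f(x, y) = f(y, x): sum a base kernel over the argument swaps, so that the GP prior forces f(x, y) and f(y, x) to agree exactly. The RBF base kernel and the exact indexing of the swaps are assumptions here; the slide's four-term formula follows the same pattern but may use a different argument convention.

```python
# Symmetrized kernel sketch: k_s sums an assumed RBF base kernel over
# swaps of each input pair, so (a, b) and (b, a) become perfectly correlated.
import numpy as np

def k_base(p, q, ell=1.0):
    # RBF kernel on 2-D inputs p = (x, y), q = (x', y').
    return np.exp(-0.5 * np.sum((np.asarray(p) - np.asarray(q)) ** 2) / ell**2)

def k_sym(p, q):
    (x, y), (xp, yp) = p, q
    return (k_base((x, y), (xp, yp)) + k_base((x, y), (yp, xp))
            + k_base((y, x), (xp, yp)) + k_base((y, x), (yp, xp)))

# Under k_sym, the correlation between f(a, b) and f(b, a) is exactly 1,
# so observing one of them pins down the other.
a, b = 0.3, -1.2
print(k_sym((a, b), (b, a)) / k_sym((a, b), (a, b)))   # -> 1.0 (up to float error)
```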

More things you can do with Bayesian Quadrature

- Can compute the likelihood of the GP and learn the kernel.
- Can compute marginals with error bars, in two ways:
  - Simply from the GP posterior, or
  - By recomputing $f_\theta(x)$ for different $\theta$ with the same $x$.
- Much nicer than histograms!

[Figure: Z as a function of the hyperparameter σ_x (0.2 to 1.2), with error bars, the true σ_x, and the marginal likelihood overlaid.]

Rates of Convergence

What is the rate of convergence of SBQ when its assumptions are true?

[Figures: expected variance / MMD, empirical rates inside the RKHS, empirical rates outside the RKHS, and a bound on the Bayesian error.]

GPs vs Log-GPs for Inference

[Figure: a likelihood ℓ(x) (left) and log ℓ(x) (right), each showing the true function, the evaluations, and the GP posterior; the left panel also shows the Log-GP posterior obtained by modelling log ℓ(x) with a GP.]

GPs vs Log-GPs

[Figure: the same comparison on a second likelihood: ℓ(x) (left) and log ℓ(x) (right), with the true function, evaluations, GP posterior, and Log-GP posterior.]

Integrating under Log-GPs

[Figure: ℓ(x) and log ℓ(x) with the true function, evaluations, GP posterior, and Log-GP posterior.]

[Figure: approximating the Log-GP for integration: the mean of the GP, an approximate Log-GP, and a set of inducing points, shown on both ℓ(x) and log ℓ(x).]
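A small sketch (not the talk's method) of the modelling difference the figures illustrate: fitting a zero-mean GP to ℓ(x) directly versus fitting it to log ℓ(x) and exponentiating the posterior mean. The likelihood, kernel, and observation points are illustrative assumptions; because the exponentiated GP no longer integrates against p(x) in closed form, the sketch falls back to a dense grid, whereas the talk's figures instead build an approximate Log-GP from the GP mean and a set of inducing points.

```python
# Sketch only: compare a GP fit to l(x) with exp(GP fit to log l(x)).
# The likelihood, density, kernel, and grid integration are illustrative
# assumptions, not the talk's inducing-point scheme.
import numpy as np

def rbf(a, b, ell=0.8):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def gp_posterior_mean(x_star, x, y):
    # Zero-mean, noiseless GP regression posterior mean.
    K = rbf(x, x) + 1e-9 * np.eye(len(x))
    return rbf(x_star, x) @ np.linalg.solve(K, y)

lik = lambda x: 100.0 * np.exp(-2.0 * x**2)                  # positive "likelihood"
dens = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)  # N(0, 1) input density

x_obs = np.linspace(-2.5, 2.5, 7)
grid = np.linspace(-4.0, 4.0, 2001)

mean_gp = gp_posterior_mean(grid, x_obs, lik(x_obs))                      # may dip below zero
mean_log_gp = np.exp(gp_posterior_mean(grid, x_obs, np.log(lik(x_obs)))) # always positive

dx = grid[1] - grid[0]
print("Z from GP on l(x):    ", np.sum(mean_gp * dens(grid)) * dx)
print("Z from GP on log l(x):", np.sum(mean_log_gp * dens(grid)) * dx)
```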

Conclusions

- Model-based integration allows active learning about integrals, can require fewer samples than MCMC, and allows us to check our assumptions.
- BQ has nice convergence properties if its assumptions are correct.
- For inference, a GP is not an especially appropriate model, but other models are intractable.

Limitations and Future Directions

- Right now, BQ really only works in low dimensions (< 10), when the function is fairly smooth, and is only worth using when computing f(x) is expensive.
- How to extend to high dimensions? Gradient observations are helpful, but a D-dimensional gradient is D separate observations.
- It seems unlikely that we'll find another tractable nonparametric distribution like GPs. Should we accept that we'll need a second round of approximate integration on a surrogate model?
- How much overhead is worthwhile? Bounded-rationality work seems relevant.

Thanks!
