# Research in Applied Econometrics, Chapter 2

TRANSCRIPT

Research in Applied Econometrics Chapter 2. Contingent Valuation Econometrics


Pr. Philippe Polomé, Université Lumière Lyon 2

M1 APE Analyse des Politiques Économiques
M1 RISE Gouvernance des Risques Environnementaux

2019 – 2020

Outline

- Principle of Economic Value
- Principles of Contingent Valuation
- Maximum Likelihood Estimation
- Application of ML: Logit & Probit
- Single-bounded CV Estimation (Application; Computing Welfare Measures)
- Double-bounded CV Estimation (Application: Exxon-Valdez)
- Questionnaire Issues and Biases

The allegory of the bridge

Jules Dupuit (1804-1866), French civil/bridge engineer and economist, had to choose among many demands for building new bridges. If a fee (péage) made it possible to pay for the bridge's operation, the investment would (in the long run) be profitable and reasonable. However, Dupuit argued that some of the individuals who cross the bridge would be willing to pay more than the fee. Thus, from the viewpoint of a public operator, one should consider the maximum amount that individuals are willing to pay. The difference between this amount and the (paid) fee is the consumer surplus.


Economic Impact of the Bridge

- 1st approximation: fee × number of passages (price × quantities)
- Dupuit: among those who cross, some are willing to pay more than the fee; they have a surplus
- We should therefore add these surpluses to the fee revenue to find the economic value of the bridge
- Whether the fee is actually paid or not does not change the economic value of the bridge
  - It changes only the division of the value between the bridge manager and the users
  - However, the absence of a fee means the price actually becomes zero, and thus the bridge is (likely) used by more people

Graphical Representation of the Bridge Example (figure)


Economic Value

- Willingness to pay
  - Based on individual utility
  - NOT a price or a cost; non-economists often confuse them
  - NOT financial or accounting; may be purely immaterial
- No actual (or future) transaction is required to define a value
- Willingness to accept compensation is also valid in case of loss
- These are the standard notion of value in applied economics

Issues in Non-market Values

- Commensurability: is the environment really amenable to a single quality-index measure?
  - e.g. what is air "quality"? Which pollutants? How do you combine them?
- Measurement: actually getting proper data
- Substitutability: can people actually compare coffee and whales?
- Sustainability: is natural capital actually convertible to financial capital?

Nonmarket Values

- WTP is limited by the individual budget; in this sense, it represents a capacity to pay
- There is an interpretation in terms of public finance: the budget that a collectivity could levy to finance the environmental change corresponds to the WTP
- Other things equal, with the same utility function, a rich person's WTP will be higher than a poor person's
  - So the rich person's "opinion" will weigh more in the collectivity budget
- WTP and compensations are expressed in money
  - They are thus comparable between individuals and can be added
  - That is usually not the case with non-economic notions of value

Understanding the Sources of Economic Value: a Typology

| Source | Example in a forest context |
| --- | --- |
| Direct consumptive use (private goods) | Hunting and gathering products; wooden products / cultivation |
| Direct recreational use (public goods) | Hunting and gathering practices; hiking / nature watching |
| Indirect / functional use | Water: filter / flood protection; Air: filter / carbon fixing; Soil: erosion / desertification; Landscape |
| Option | Use: preserve future / third-party use; Quasi-use: value of information |
| Non-use | "Patrimonial": existence & heritage; "Moral": role of humanity wrt nature, non-human rights |

French Guidelines (valeurs tutélaires) for Transport¹

- Context of road infrastructure, mainly
- Value of a Statistical Life, VSL (VVS): €3M (2010)
- Value of a Year of Life, VYL (VAV): €115,000 (2010)
- Value of a seriously injured person: 15% of VSL, €450,000 (2010)
- Value of a lightly injured person: 2% of VSL, i.e. €60,000 (2010)
- Value of carbon
  - Value 2013: €32 (2010) / tCO2
  - Value 2030: €100 (2010) / tCO2
- Value of time, which depends on
  - Motive (professional, holiday...)
  - Distance (urban, <20 km, 20-80 km, ...)
  - Mode
- Multiple values in the transport sector: environment, noise...

¹ Commissariat général à la stratégie et à la prospective, L'évaluation socioéconomique des investissements publics, www.strategie.gouv.fr, Sept. 2013

Database of valuation studies: www.evri.ca (4000+ records). Benefit transfer.

Principles of Contingent Valuation

Purpose

- Operationalize the theoretical notions of value
- In practice, it is impossible to measure every individual's benefit, so we resort to statistical techniques
  - Representative samples with control variables to enable inference to the population
  - Individual benefits are never identified
- Econometric techniques are thus essential
- We'll need the packages
  - DCchoice
  - Ecdat, stats (should be there already)


Classification of Valuation Techniques

- Based on stated preferences
  - Contingent Valuation ("Évaluation Contingente")
  - Choice experiments / contingent choices
- Based on revealed preferences
  - Travel cost, estimating demand for transport; principle: one must travel to enjoy a site
  - Hedonic prices, estimating demand for housing; principle: house prices depend on their environment
- Based on (inferred) prices; principle: for the marginal buyer, the price is the WTP
- Others, not based on preferences
  - Land compensation

Stated Preferences Techniques

- A sample of people is surveyed directly on their preferences about a public project
  - To infer a measure of individual statistical value at the population level
- Interviews can be anything: telephone, postal mail, e-mail, website, or combinations
  - Preferably face-to-face, but it's more expensive
- The sample depends on the objective

Contingent Valuation

- One potential environmental change z0 → z1 is described, together with its stated cost
  - The context of such a cost is important: taxes, fees, prices...
- A single question is asked: for or against said change
- The question is sometimes repeated with another cost; at this stage, model only the first question

Questionnaire Structure: wide, precise, wide

- Opening questions
  - Possible filters to select certain respondents
  - General questions on the environment to lead to the particular case of interest, while making the respondent think about it; we want informed, thought-about answers
- Evaluation question
- Debriefing
  - Why did the respondent answer what s/he did? Did s/he not believe the scenario?
- Collect data on specific potential explanatory variables
  - e.g. if the survey is on a lake's quality, what use does the respondent make of the lake? Tourism, recreation (boating, fishing...)
- Socio-economic data, primarily for inference to the population

The Dichotomous Format

- This is the most popular, least controversial "elicitation" format
  - Assumed least prone to untruthful answers, while not too demanding for the respondent
- Consider an environmental change from z0 to z1; to simplify, consider only an improvement
- Respondents are proposed a "bid" b
  - They answer yes or no, but may also state that they don't know or refuse to answer
- This is similar to a "posted price" market context
  - There is a good (the environmental improvement), and the situation is a bit like asking whether to buy it
  - Respondents are routinely in such situations
  - Except, not in a public good context; further, "buying" cannot really be that, so we need a context, but we discuss that later

The Dichotomous Format

- Formalizing, let the indirect utility be v(z, y) + ε
- It represents the individual's preferences from the point of view of the researcher
  - The error term ε reflects multiple influences the researcher does not know about
  - These influences are modeled as a random variable, but that does not mean people act randomly
  - This is called a Random Utility Model (RUM)
- If the answer is "Yes", then it must be that

v(z1, y − b) + ε1 ≥ v(z0, y) + ε0

- and thus, that WTP > b

WTP Distribution

- 4 to 6 (different) bids are proposed to different respondents
  - Each respondent only ever sees one bid
- Consider the proportion of "Yes" for each of these bids
- Assume that
  - for a bid of 0, the proportion of Yes is 100%
  - for some high bid, the proportion would be zero
  - respondents have a single form of their utility function, but differ according to observable data X and unobservables ε
- "Connect the dots" as an estimate of the WTP distribution

WTP Distribution

- Going back to the indirect utility function: if the answer is "Yes", then

v(z1, y − b) + ε1 ≥ v(z0, y) + ε0

- In other words,

Pr{Yes|b} = Pr{v(z1, y − b) + ε1 ≥ v(z0, y) + ε0}
          = Pr{ε0 − ε1 ≤ v(z1, y − b) − v(z0, y)}
          = Pr{ε0 − ε1 ≤ g(b, y, ...)}

- g() has some properties because of V(.), see later
- If we make a hypothesis on the distribution of ε0 − ε1, we have a model that can be estimated by Maximum Likelihood
  - Logistic: logit
  - Normal: probit
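As an aside not in the slides, the "connect-the-dots" idea can be simulated. Below is a minimal sketch in Python (the course code itself is in R), with made-up values for α and β in the utility difference g(b) = α − βb and logistic errors: the simulated share of "Yes" answers falls as the bid rises.

```python
import random
import math

random.seed(1)
alpha, beta = 2.0, 0.1          # assumed parameters, purely illustrative

def answers_yes(bid):
    # draw eps = e0 - e1 from a standard logistic via the inverse CDF
    u = random.random()
    eps = math.log(u / (1 - u))
    return eps <= alpha - beta * bid   # "Yes" iff eps <= g(b)

bids = [5, 10, 20, 40]           # different bids, one per respondent
shares = []
for b in bids:
    n = 5000
    yes = sum(answers_yes(b) for _ in range(n))
    shares.append(yes / n)

print(shares)  # proportions of "Yes" decrease with the bid
```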

Maximum Likelihood Estimation

Density

- f(y|θ): probability density function (pdf) of a random variable y, conditioned on a set of parameters θ
- It represents mathematically the data generating process of each observation of a sample of data
- The joint density of n independent and identically distributed (iid) observations is the product of the individual densities:

f(y1, ..., yn|θ) = ∏_{i=1}^{n} f(yi|θ) = L(θ|y)

- This joint density is the likelihood function
  - A function of the unknown parameter vector θ
  - y is used to indicate the collection of sample data


Likelihood Function

- Intuitively, this is much the same as a joint probability
- Consider two (six-sided) dice: what is the probability of rolling a 3 and a 6?
- The likelihood function is the idea of the probability of the sample
  - Except that, for continuous variables, points have probability mass zero
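The dice intuition can be written out: for independent events, the joint probability is the product of the individual probabilities, exactly the product form of the likelihood of an iid sample. A tiny sketch (exact fractions, stdlib only):

```python
from fractions import Fraction

# Two independent six-sided dice: joint probability = product of marginals,
# the same product form as the likelihood of an iid sample.
p_three = Fraction(1, 6)
p_six = Fraction(1, 6)

p_ordered = p_three * p_six     # the ordered outcome (3, 6)
p_unordered = 2 * p_ordered     # (3, 6) or (6, 3)

print(p_ordered, p_unordered)   # 1/36 1/18
```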

Conditional Likelihood

- Generalize the likelihood function to allow the density to depend on conditioning variables: f(yi|xi, θ)
- Take the classical LRM yi = xiβ + εi
  - Suppose ε is normally distributed with mean 0 and variance σ²: ε ~ N(0, σ²)
  - Then yi ~ N(xiβ, σ²)
  - Thus the yi are not iid: they have different means
  - But they are independent, so that (yi − xiβ)/σ ~ N(0, 1)
- Thus

L(θ|y, X) = ∏_i f(yi|xi, θ) = ∏_i (2πσ²)^(−1/2) exp{−(yi − xiβ)² / (2σ²)}

Conditional log-Likelihood

Usually it is simpler to work with the log:

ln L(θ|y) = ∑_{i=1}^{n} ln f(yi|θ)

thus

ln L(θ|y, X) = ∑_i ln f(yi|xi, θ) = −(1/2) ∑_{i=1}^{n} [ln σ² + ln(2π) + (yi − xiβ)²/σ²]

where X is the n × K matrix of data with ith row equal to xi.
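As a check on the formulas above, here is a sketch in Python (the data, β and σ² are made up): the sum of log densities equals the closed-form log-likelihood expression.

```python
import math

# Tiny made-up LRM dataset
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
beta, sigma2 = 1.0, 0.04

def density(yi, xi):
    # normal density of yi with mean xi*beta and variance sigma2
    return (2 * math.pi * sigma2) ** -0.5 * math.exp(-(yi - xi * beta) ** 2 / (2 * sigma2))

# log-likelihood as a sum of log densities
loglik_sum = sum(math.log(density(yi, xi)) for yi, xi in zip(y, x))

# log-likelihood from the closed-form expression above
loglik_formula = -0.5 * sum(math.log(sigma2) + math.log(2 * math.pi)
                            + (yi - xi * beta) ** 2 / sigma2 for yi, xi in zip(y, x))

print(loglik_sum, loglik_formula)  # identical up to rounding
```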

Maximum Likelihood Estimation Principle

- With a discrete rv, f(yi|θ) is the probability of observing yi conditionally on θ
  - The likelihood function is then the probability of observing the sample y conditionally on θ
- Assume that the sample we have observed is the most likely one
  - What value of θ makes the observed sample most likely?
  - Answer: the value of θ that maximizes the likelihood function, since then the observed sample has maximum probability
- When y is a continuous rv, instead of a discrete one, we can no longer say that f(yi|θ) is the probability of observing yi conditionally on θ, but we retain the same principle

Maximum Likelihood Estimation Principle

I The value of the parameter vector that maximizes L (θ|data)is the maximum likelihood estimates θI The value vector that maximizes L (θ|data) is the same as the

one that maximizes ln L (θ|data)I The necessary condition for maximizing ln L (θ|data) is

∂ ln L (θ|data) /∂θ = 0

I This is called the likelihood equations
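The principle can be illustrated on the simplest case, not in the slides: a Bernoulli sample. Scanning ln L(p) over a grid, the maximizing p is the sample proportion, which is the textbook ML estimator (a sketch with made-up data).

```python
import math

y = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # made-up Bernoulli sample
n, s = len(y), sum(y)

def loglik(p):
    # ln L(p) = s*ln(p) + (n-s)*ln(1-p)
    return s * math.log(p) + (n - s) * math.log(1 - p)

grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=loglik)          # value of p maximizing ln L

print(p_hat, s / n)                    # both equal the sample proportion
```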

ML Properties 1

- Conditionally on correct distributional assumptions, and under regularity conditions, ML has very good properties
  - In a sense, because the information supplied to the estimator is very good: not only the sample but also the full distribution
- Consistent
- Asymptotically normal
- Asymptotically efficient
  - Achieves the Cramér-Rao lower bound for consistent estimators

Application of ML: Logit & Probit

Specification (“how the probability is written”)

- Consider a class of non-linear models with dichotomous responses:

Pr(y = 1|X) = G(Xβ)

where G takes values between zero and one: 0 ≤ G(Xβ) ≤ 1
- This guarantees that estimated response probabilities will be between zero and one
  - That is not the case with the LRM
- Therefore, there is a non-linear relation between y and X
- Many functions could do this job; two are popular: the logistic and the normal

Logit & Probit

- In the logit model, G is the distribution function (cumulative density) of the standard logistic rv:

G(Xβ) = exp(Xβ) / [1 + exp(Xβ)] = Λ(Xβ)

- In the probit model, G is the distribution function of the standard normal rv, whose density is noted φ(.):

G(Xβ) = ∫_{−∞}^{Xβ} φ(t) dt = Φ(Xβ)

with φ(Xβ) = (2π)^(−1/2) exp(−(Xβ)²/2)

Logit vs. Probit

- The logistic and normal distributions are similar
- The logistic makes computations and analysis easier
  - And allows for simplifications in more advanced models
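Both link functions can be written with the standard library alone; a quick sketch (Φ via the error function), also checking the symmetry 1 − G(−v) = G(v) used later:

```python
import math

def logistic_cdf(v):          # Lambda(v) = exp(v) / (1 + exp(v))
    return math.exp(v) / (1 + math.exp(v))

def normal_cdf(v):            # Phi(v), via the error function
    return 0.5 * (1 + math.erf(v / math.sqrt(2)))

for v in (-1.0, 0.0, 1.0):
    print(v, logistic_cdf(v), normal_cdf(v))
```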

Latent Variable Model

- Let y* be a latent variable (that is, not directly observed) such that y* = Xβ + ε
  - e.g. y* is the utility from an environmental improvement ∆z
- Logit & probit may be derived from a latent variable model satisfying all the classical LRM hypotheses
- Utility is not observed; only the consequence of the individual decision is:
  - y*_i < 0 ⟹ yi = 0
  - y*_i ≥ 0 ⟹ yi = 1
- We observe that the person is (y = 1) or is not (y = 0) willing to pay an amount b to "buy" an environmental improvement z1 − z0 = ∆z

Response Probability

- Hypotheses on ε
  - Independent of X
  - Standard logistic or standard normal
- Response probability for y:

Pr{y = 1|X} = Pr{y* ≥ 0|X} = Pr{ε > −Xβ|X} = 1 − G(−Xβ) = G(Xβ)

- Actually, we could multiply both sides of the inequality by a positive constant without changing the probability
  - This imposes normalizing the variance to 1
- Since ε is normal or logistic, it is symmetric around zero
  - Thus 1 − G(−Xβ) = G(Xβ)

Maximum Likelihood Estimation

- The likelihood function for the dichotomous case is

∏_i [Pr{willing|β, σ, Xi}]^{yi} [1 − Pr{willing|β, σ, Xi}]^{(1−yi)}

- ML seeks the maximum of (the log of) this function
- It does not have an explicit solution, but yields numerical estimates β̂_ML
  - Consistent but biased
  - Asymptotically efficient & normal
  - As long as the model hypotheses are true
- So, if you used probit: is ε really normal?
  - If the distributional hypothesis is not true, we may sometimes retain the properties
- Endogeneity of X is as serious as usual

Marginal Effect of a Continuous Regressor xj

- The effect of a marginal change in xj on the response probability Pr{y = 1|X} = p(X) is given by the partial derivative

∂p(X)/∂xj = ∂G(Xβ)/∂xj = g(Xβ) βj

- This is the marginal effect of xj
  - It depends on the values taken by all the regressors (not just xj)
  - Compare to the LRM: ∂y/∂xj = βj
  - It cannot bring the probability below zero or above one

Marginal Effect of a Continuous Regressor xj

- Thus, the marginal effect is a non-linear combination of the regressors
- It can be calculated at "interesting" points of X
  - e.g. at the sample average point X̄: ∂p/∂xj (X̄)
  - However, that does not mean much for discrete regressors, e.g. gender
- Or it can be calculated for each i in the sample, ∂p/∂xj (Xi), and then we can compute an average of the "individual" marginal effects
- In general, these two are not the same: the average of the individual effects differs from the effect at the average point
- Which one do we choose? Often, that is too complicated for presentation
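The two averaging choices can be made concrete with a sketch (coefficients and data are made up; the logistic density g is used, as in the logit):

```python
import math

beta0, beta1 = -1.0, 0.8              # made-up logit coefficients

def g(v):                              # logistic density g(v) = Lambda(v)(1 - Lambda(v))
    e = math.exp(v)
    return e / (1 + e) ** 2

x = [0.1, 0.5, 1.2, 2.5, 3.0, 4.1]     # one continuous regressor
x_bar = sum(x) / len(x)

# marginal effect at the sample average point
me_at_mean = g(beta0 + beta1 * x_bar) * beta1
# average of the individual marginal effects
avg_me = sum(g(beta0 + beta1 * xi) * beta1 for xi in x) / len(x)

print(me_at_mean, avg_me)              # in general, not equal
```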

Marginal Effect of a Discrete Regressor xj

- Effect of a change in a discrete xj, from a to b (often, from 0 to 1), on the response probability Pr{y = 1|X} = p(X)
- Write X−j for the set of all the regressors but xj, and similarly β−j
- The discrete change in p(Xi) is

∆p(Xi) = G(X−j,i β−j + bβj) − G(X−j,i β−j + aβj)

- Such a discrete effect differs from individual to individual
  - Even the sign could in principle differ
- In R, such effects are not calculated automatically
  - The above formula must be calculated explicitly

Compare Logit & Probit

- The sizes of the coefficients are not comparable between these models
  - Approximately, multiplying a probit coefficient by 1.5 yields the logit coefficient (rule of thumb!)
- The marginal effects should be approximately the same

Measures of the Quality of Fit

- The correctly predicted percentage may be appealing
  - ∀i, compute the fitted probability that yi takes value 1: G(Xi β̂)
  - If ≥ .5, "predict" yi = 1, and zero otherwise
  - Compute the % of correct predictions
- Problem: it is possible to see a high % correctly predicted while the model is not very useful
  - e.g. in a sample of 200: 180 observations with yi = 0, of which 150 are predicted zero, and 20 observations with yi = 1, all predicted zero
  - The model is clearly poor, but we still have 75% correct predictions
  - A flat prediction of 0 would have 90% correct predictions
- A better measure is a 2 × 2 table, as in the next slide

Goodness of Fit: Predictive Table

| | Observed yi = 1 | Observed yi = 0 | Total |
| --- | --- | --- | --- |
| Predicted yi = 1 | 350 | 122 | 472 |
| Predicted yi = 0 | 78 | 203 | 281 |
| Total | 428 | 325 | 753 |

We'll see the R command later.
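The pitfall in the 200-observation example above is easy to verify in a couple of lines:

```python
# 200 observations: 180 zeros (150 predicted zero) and 20 ones (all predicted zero)
n = 200
correct = 150 + 0              # correct zeros + correct ones
pct_model = correct / n        # the "clearly poor" model
pct_flat_zero = 180 / n        # always predicting 0 does even better
print(pct_model, pct_flat_zero)  # 0.75 0.9
```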

Goodness of Fit: Pseudo R-square

- Pseudo-R² = 1 − ln L_UR / ln L_0
  - ln L_UR: log-likelihood of the estimated model
  - ln L_0: log-likelihood of a model with only the intercept (i.e. forcing all β = 0 except the intercept)
- Similar to an R² for OLS regression, since R² = 1 − SSR_UR/SSR_0
- There exist other measures of the quality of fit
  - But fit is not what maximum likelihood seeks, contrarily to LS
  - I put more stress on the statistical significance of the regressors

Single-bounded CV Estimation

Utility

- Come back to economic value: the compensating variation for a change from z0 to z1 satisfies

V(z1, y − CV) = V(z0, y)

- CV is interpreted as the WTP to secure an improvement ∆z = z1 − z0
- We observe Yes answers to bid b when b < WTP
- Assume a RUM; then

Pr{Yes|b} = Pr{ε0 − ε1 ≤ v(z1, y − b) − v(z0, y)}


Linear Utility

- Suppose V(zj, y) = αj + βy, j = 0, 1
  - β is the marginal utility of income
  - In principle, we would like it to decrease with income, but simplify
- The WTP satisfies α0 + βy + ε0 = α1 + β(y − WTP) + ε1
- Solving: WTP = (α + ε)/β, with α = α1 − α0 and ε = ε1 − ε0
- So that, e.g., E(WTP) = α/β
- The probability of Yes to bid b is then

Pr{Yes|b} = Pr{v(z1, y − b) + ε1 − v(z0, y) − ε0 > 0} = Pr{ε ≥ βb − α}

- This is a simple probit/logit context
- Utility V is not identified, only the difference

Probit/Logit

- If ε ~ N(0, 1), standard normal, then

Pr{Yes|b} = 1 − Φ(βb − α) = Φ(α − βb)

- If ε is standard logistic, then

Pr{Yes|b} = 1 / [1 + exp(βb − α)]

- If we assume that WTP ≥ 0, then we can derive similarly:
  - with ε log-normal: Pr{Yes|b} = 1 − Φ(β ln(b) − α) = Φ(α − β ln(b))
  - with ε log-logistic: Pr{Yes|b} = 1 / [1 + exp(β ln(b) − α)]
- These last two are still the probit and logit models, respectively
  - But the ln of the bid is used instead of the bid itself
  - And that is still compatible with RUM
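The four response probabilities can be sketched side by side (α and β are made-up values; with β > 0, all four must decrease with the bid):

```python
import math

alpha, beta = 1.0, 0.5            # illustrative parameters only

def phi_cdf(v):                    # standard normal CDF via erf
    return 0.5 * (1 + math.erf(v / math.sqrt(2)))

def pr_probit(b):        return phi_cdf(alpha - beta * b)
def pr_logit(b):         return 1 / (1 + math.exp(beta * b - alpha))
def pr_log_normal(b):    return phi_cdf(alpha - beta * math.log(b))
def pr_log_logistic(b):  return 1 / (1 + math.exp(beta * math.log(b) - alpha))

for b in (1.0, 2.0, 5.0, 10.0):
    print(b, pr_probit(b), pr_logit(b), pr_log_normal(b), pr_log_logistic(b))
```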

Estimation

- Using these expressions, it is easy to derive the log-likelihood function, as in the previous section
- Usually, we want to account for individual characteristics Xi
  - Those that are collected in the survey
  - This is done through αi = ∑_{k=0}^{K} γk xki
  - x0i = 1 ∀i, for an intercept
- Parametric estimation
  - glm (core distribution)
  - The DCchoice package; check for it or library() it
- Example with the NaturalPark data of the Ecdat package (Croissant 2014)

Application

NaturalPark data

- From the Ecdat package (Croissant 2014)
- Consists of WTP and other socio-demographic variables for 312 individuals, regarding the preservation of the Alentejo natural park in Portugal
- In the help, type NaturalPark; execute summary(NaturalPark)
- 7 variables
  - bid1 is the first bid (min 6, max 48)
  - bidh & bidl are second bids that we do not look into for the moment
  - answers is a factor {yy, yn, ny, nn} of the answers to the 2 bids; we'll use only the first letter, y for Yes
  - Socio-demographics are Age, Sex, Income

Data Transformation

- Rename NP <- NaturalPark for simplicity
- Extract the answer to the first bid from answers:
  - NP$R1 <- ifelse(NP$answers == "yy" | NP$answers == "yn", 1, 0)
- What does that do?
  - A call to the logical function ifelse() takes the form ifelse(condition, true, false)
  - It returns true where condition is true, and false otherwise
  - The vertical bar | is an "or" operator in a logical expression
  - The prefix NP$ makes it possible to access each variable contained in NP directly
- Convert bid1 to logs (log in English is ln in French): NP$Lbid1 <- log(NP$bid1)
- summary(NP) reveals that things have gone smoothly

Estimation using glm

- glm is a classical function for many models of the type G(Xβ)
- Its use is much like lm, but you have to specify the link function G using the option family =
  - This is fairly flexible, but a bit complicated; see RAE2017 for the specifications
- summary works on glm, as do several usual commands
- The output is in part similar to lm: coefficient values next to variable names, with their significance
  - Interpreted in a way similar to lm: a negative (significant) coefficient implies a negative impact on Pr{Yes|b}
  - Note: "sexfemale" indicates the effect for women
- Also gives the ln L value at the optimum

Iterations and Convergence

- Non-linear models do not have explicit solutions; they are solved numerically by algorithms
- Newton-type, as in the plot
- The algorithm iterates until a convergence condition is met
- Risks: a local maximum, a poor starting point

Estimation using DCchoice

- DCchoice is designed for such data, but not for other contexts
- The format for single-bounded is sbchoice(formula, data, dist = "log-logistic")
  - The default dist is log-logistic
  - This is in fact logistic, but the bid variable is interpreted in logs
- formula follows Response ~ X | bid
  - The | bid part is mandatory
- The output is much more directly relevant for valuation purposes
  - The bid is always shown last; log(bid) if log-logistic or log-normal was selected
  - Measures of mean WTP; we will see them later

Goodness of Fit

- table(predict(fitted model, type = "probability") > .5, NP$R1)
- This is a contingency table that counts the number of predicted Yes
  - Predicted probability > .5 (returns TRUE or FALSE)
  - Against the actual Yes/No
  - Per individual, so with each individual's Xi (individual predictions)

| SB.NP.DC.logit | 0 | 1 |
| --- | --- | --- |
| FALSE (predicted 0) | 85 | 38 |
| TRUE (predicted 1) | 56 | 133 |

Plotting

- The sbchoice command produces an object that can be plotted directly
  - A direct plot of the object is the fitted probability of Yes over the bid range
  - Probably conditional on average age, income, and sex; the package isn't explicit
- Using a predict method helps
  - Observe what lies outside the bid range
  - Compare the logit & probit fitted curves (see RAE2017.R)
  - In particular, logit has slightly fatter tails, inducing a higher WTP
- To use predict
  - Create a matrix of new data
  - Choose the proper type; here we want a probability, "response"

Logit vs. Probit predict (figure)

As can be seen, for the same data, logit has slightly fatter tails than probit.

Computing Welfare Measures

Expected WTP

- Recall that when we assumed V(zj, y) = αj + βy, j = 0, 1, then the (max) WTP satisfies

α0 + βy + ε0 = α1 + β(y − WTP) + ε1

- So WTP = (α + ε)/β, with α = α1 − α0 and ε = ε1 − ε0
- So WTP is a rv
  - We can compute, e.g., E(WTP) = α/β
- When αi = ∑_{k=0}^{K} γk xki, then E(WTP) becomes individual
  - So we have to think about how we aggregate individual expected WTP

Other Measures of Welfare

- E(WTP) is the most obvious measure
- However, the expectation is strongly influenced by the tail of the distribution G(.)
  - While we do not actually have data to fit it, since there are not many bids
  - And it does not feel very serious to ask a very high bid

Truncated Expectations

- E(WTP) = ∫_0^∞ Pr{Yes|b} db
- Historically, the highest bid has been used to truncate E(WTP):

∫_0^{b_max} Pr{Yes|b} db

- However, that is not a proper expectation, since the support of Pr{Yes|b} does not stop at b_max
- An alternative uses the truncated distribution:

∫_0^{b_max} Pr{Yes|b} / (1 − Pr{Yes|b_max}) db
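The integrals above can be evaluated numerically. A sketch for the logistic case with made-up α and β: for that case the untruncated integral also has the closed form ln(1 + exp(α))/β, which checks the numerical result; truncating at b_max necessarily gives a smaller number.

```python
import math

alpha, beta = 3.0, 0.1            # illustrative parameters
b_max = 50.0

def pr_yes(b):                     # logistic survival of WTP
    return 1 / (1 + math.exp(beta * b - alpha))

def integrate(f, lo, hi, n=200_000):   # simple midpoint rule
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

e_untruncated = integrate(pr_yes, 0, 2000)   # upper limit "large enough"
e_closed = math.log(1 + math.exp(alpha)) / beta
e_truncated_at_bmax = integrate(pr_yes, 0, b_max)

print(e_untruncated, e_closed, e_truncated_at_bmax)
```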

Median WTP

- Finally, the median has been suggested as a more robust measure: b_median such that Pr{Yes|b_median} = 1/2
  - i.e. the bid such that 50% would be favorable

Shape of the WTP Function

- Clearly, the form of the WTP function depends on the form of V(.)
- For some forms, some values of β lead to impossibilities

| Distribution | Expected | Median |
| --- | --- | --- |
| Normal | α/β | α/β |
| Logistic | α/β | α/β |
| Log-normal | exp(α/β) exp(1/(2β²)) | exp(α/β) |
| Log-logistic | exp(α/β) Γ(1 − 1/β) Γ(1 + 1/β) | exp(α/β) |

- Again, if αi = ∑_{k=0}^{K} γk xki, each of these forms is individual
  - So the question arises whether to compute a sample mean or a sample median
- DCchoice appears to compute a sample mean, but is not explicit
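The formulas in the table above are easy to evaluate directly; a sketch with made-up estimates (β > 1 so that the log-logistic mean exists). Note that for both log distributions the mean exceeds the median, which is the tail sensitivity discussed earlier.

```python
import math

alpha, beta = 3.0, 1.5            # illustrative estimates only

normal_mean = normal_median = alpha / beta
logistic_mean = logistic_median = alpha / beta
lognormal_mean = math.exp(alpha / beta) * math.exp(1 / (2 * beta ** 2))
lognormal_median = math.exp(alpha / beta)
loglogistic_mean = (math.exp(alpha / beta)
                    * math.gamma(1 - 1 / beta) * math.gamma(1 + 1 / beta))
loglogistic_median = math.exp(alpha / beta)

print(normal_mean, lognormal_mean, lognormal_median, loglogistic_mean)
```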

How Do We Choose a Welfare Measure?

- That might be the time for a debate?
- 3 welfare measures: untruncated, properly truncated, median
  - Of course, we would not select an infinite one
  - The smallest estimate, to avoid criticism?
  - Do these measures differ significantly from each other? We will see that later
- 4 well-known distributions: (log-)normal, (log-)logistic; they do not differ substantially
  - There are others; there are also other estimators, e.g. non-parametric
  - The estimate of the best-fit model?
- 2 aggregation rules: sample mean or sample median
  - Researchers usually take the first, but what is the meaning of a sample mean of individual medians?
- DCchoice does not provide any guidance

Computing the Welfare Measure

- DCchoice computes it automatically
- With glm, use the above formulas; see the code in RAE.R
- Much as I like DCchoice, I must note that for the data we use (NaturalPark), the mean WTP does not coincide with the median for the symmetric distributions (normal & logistic)
  - That is a problem; I should write to the authors

Confidence Intervals

- In the end, WTP, under any of its forms, is an estimate
- As such, it has a confidence interval
- Much as for β, you should always report the CI
  - At a minimum, to give an idea of the variance
  - And to show whether the estimate is significantly different from zero
- There are 2 main methods
  - Krinsky & Robb
  - Bootstrap
- These methods are much broader than valuation: they are useful in all types of research in applied econometrics

Constructing Confidence Intervals: Krinsky and Robb

- Start with the estimated vector of coefficients
  - By ML, it is distributed (multivariate) normally
  - Its variance-covariance matrix has been estimated in the ML process
- Draw D times from a multivariate normal distribution with
  - mean = the vector of estimated coefficients
  - the estimated variance-covariance matrix of these estimated coefficients
- So, we have D vectors of estimated coefficients
  - If D is large, the average of these D vectors is just our original coefficient vector

Constructing Confidence Intervals: Krinsky and Robb

- Compute the welfare measure for each such replicated coefficient vector
  - Thus we have D estimated welfare measures, some large, some small: an empirical distribution
- Order them from smallest to largest
  - The 5% most extreme are deemed random
  - The 95% most central are deemed reasonable and constitute the 95% confidence interval
- For example, the lower and upper bounds of the 95% confidence interval correspond to the 0.025 and 0.975 percentiles of the measures, respectively
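The procedure fits in a few lines of pure Python for the linear-utility measure E(WTP) = α/β (the coefficient estimates and their variance-covariance matrix are made up; the bivariate normal draws use a Cholesky factor of that matrix):

```python
import random
import math

random.seed(42)
alpha_hat, beta_hat = 3.0, 0.1
vcov = [[0.04, 0.0005],        # Var(alpha), Cov(alpha, beta)
        [0.0005, 0.0001]]      # Cov(alpha, beta), Var(beta)

# Cholesky factor of the 2x2 variance-covariance matrix
l11 = math.sqrt(vcov[0][0])
l21 = vcov[1][0] / l11
l22 = math.sqrt(vcov[1][1] - l21 ** 2)

D = 10_000
measures = []
for _ in range(D):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    a = alpha_hat + l11 * z1                 # replicated alpha
    b = beta_hat + l21 * z1 + l22 * z2       # replicated beta
    measures.append(a / b)                   # welfare measure for this draw

measures.sort()
ci_low, ci_high = measures[int(0.025 * D)], measures[int(0.975 * D)]
print(ci_low, alpha_hat / beta_hat, ci_high)
```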

Krinsky and Robb: Implementation in DCchoice

- The function krCI(.) constructs CIs for the 4 different WTPs estimated by the functions sbchoice(.) or dbchoice(.)
- Call it as krCI(obj, nsim = 1000, CI = 0.95)
  - obj is an object of either the "sbchoice" or "dbchoice" class
  - nsim is the number of draws from the multivariate normal; it influences machine time
  - CI is the level of the confidence intervals to be estimated
- Returns a "krCI" object
  - A table of the simulated confidence intervals
  - Vectors containing the simulated WTPs
- Is there a package that does Krinsky & Robb for glm objects?

Constructing Confidence Intervals: Bootstrap

- Similar to Krinsky & Robb, except in the way the new estimated coefficients are obtained
- Essentially, instead of simulating new estimates
  - We simulate new data
  - And then calculate new estimates

Mediocrity Principle

- Consider that our sample is mediocre within the population
  - We mean: it does not have anything exceptional
- Then, if we could draw a new sample from that population
  - We would surely obtain a fairly mediocre sample
  - That is, fairly similar to the original one


Bootstrap principle

I It’s not possible to draw a new sample
I But imagine that, using the original sample, we draw one obs.
I and we “put it back” in the sample (“replace”)
I then we draw again
I repeat until we have the same number of obs. as in the original sample
I call this a bootstrap sample
I Each original obs. appears 0 to n times
I By the mediocrity principle
I the bootstrap sample is fairly close to a real new sample
I Estimate a new vector of coefficients
I Repeat D times
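These steps can be sketched in Python; the data, the hand-rolled logit fit, and the welfare measure below are illustrative stand-ins (the course works with DCchoice in R).

```python
# Bootstrap sketch: resample observations with replacement, refit the
# model, recompute the welfare measure. Data and model are simulated.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 500
bid = rng.choice([5.0, 10.0, 30.0, 60.0], size=n)
true_a, true_b = 1.5, 0.04                       # hypothetical "truth"
p_yes = 1 / (1 + np.exp(-(true_a - true_b * bid)))
yes = (rng.uniform(size=n) < p_yes).astype(float)

def neg_loglik(theta, bid, yes):
    a, b = theta
    p = np.clip(1 / (1 + np.exp(-(a - b * bid))), 1e-10, 1 - 1e-10)
    return -np.sum(yes * np.log(p) + (1 - yes) * np.log(1 - p))

def fit_wtp(bid, yes):
    res = minimize(neg_loglik, x0=[1.0, 0.01], args=(bid, yes))
    a, b = res.x
    return a / b                                  # mean WTP for the logit model

wtps = []
for _ in range(200):                              # D = 200 bootstrap samples
    idx = rng.integers(0, n, size=n)              # draw n obs. with replacement
    wtps.append(fit_wtp(bid[idx], yes[idx]))

lower, upper = np.percentile(wtps, [2.5, 97.5])   # bootstrap 95% CI
```

Each bootstrap sample requires a full re-estimation, which is why this is slower than Krinsky & Robb.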


Bootstrap : Implementation in DCchoice

I Function bootCI() carries out the bootstrapping
I and returns the bootstrap confidence intervals
I call bootCI(obj, nboot = 1000, CI = 0.95)
I Longer than K&R since each sample must be generated
I and new estimates computed
I while K&R only simulates new estimates
I In the end, the results are similar
I See RAE.R
I Note : another means would be the resample cmd
I Applicable to glm
I But I don’t develop it here


Differences of welfare measures

I Sometimes we want to know whether a welfare measure is significantly different from another
I In other words : is their difference significantly different from zero ?
I In terms of CI : does the CI of their difference include zero ?
I To compute that : bootstrap similarly
I Krinsky and Robb is also possible, but
I if the welfare measures are independent, their difference has a variance-covariance that is the sum of each variance-covariance
I if they are not independent, then it’s difficult

Research in Applied Econometrics Chapter 2. Contingent Valuation Econometrics: Double-bounded CV Estimation


Double-bounded CV Estimation

I To increase the amount of information collected by the survey,
I the valuation Q is asked a 2nd time
I If the answer is Yes (No), then ask again with a higher (lower) bid
I Called double-bounded dichotomous choice
I or dichotomous choice with follow-up
I More precisely, the phrasing could be
I If ∆z cost you b €, would you be WTP it ?
I If answered Yes : would you be WTP bU € ? bU > b
I If answered No : would you be WTP bL € ? bL < b


Double-bounded CV Estimation

I There are 4 outcomes per respondent
I YY, YN, NY, NN
I YY indicates that WTP > bU
I YN that b < WTP < bU
I NY that bL < WTP < b
I NN that WTP < bL
I Thus the answers are intervals
I Probit & logit do not suffice
I Many use ML
I but the likelihood function is different

Estimation with DBDC data
I To develop the likelihood function
I it is necessary to express the probabilities first
I Write PYY as the probability of answering Yes, Yes

I PYY = Pr{bU < WTP} = 1 − G(bU)
I PYN = Pr{b < WTP < bU} = G(bU) − G(b)
I PNY = Pr{bL < WTP < b} = G(b) − G(bL)
I PNN = Pr{WTP < bL} = G(bL)

I For a sample of N obs.

ln L = Σ_{n=1}^{N} [ dYY_n ln PYY_n + dYN_n ln PYN_n + dNY_n ln PNY_n + dNN_n ln PNN_n ]

where n indexes individuals and dXX_n indicates whether n answered XX (dichotomous variable)
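This log-likelihood is simple to code directly; below is a sketch in Python with a logistic G (one possible distribution choice, assumed here), where G(b) = Pr{WTP < b} = 1/(1 + exp(α − βb)).

```python
# Double-bounded log-likelihood sketch (logistic G assumed).
# G(b) = Pr{WTP < b}; the four probabilities follow the slides.
import numpy as np

def G(b, alpha, beta):
    return 1 / (1 + np.exp(alpha - beta * b))   # logistic cdf of WTP at bid b

def dbdc_loglik(theta, b, bU, bL, dYY, dYN, dNY, dNN):
    alpha, beta = theta
    pYY = 1 - G(bU, alpha, beta)                 # Pr{bU < WTP}
    pYN = G(bU, alpha, beta) - G(b, alpha, beta) # Pr{b < WTP < bU}
    pNY = G(b, alpha, beta) - G(bL, alpha, beta) # Pr{bL < WTP < b}
    pNN = G(bL, alpha, beta)                     # Pr{WTP < bL}
    # each obs. contributes the log of the probability of its outcome
    p = np.clip(dYY * pYY + dYN * pYN + dNY * pNY + dNN * pNN, 1e-12, None)
    return np.sum(np.log(p))
```

Maximizing this over (α, β) is, in spirit, what the FIML estimation pre-programmed in DCchoice does (with several choices of G).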

Estimation with DBDC data
I There is no direct command corresponding to such a likelihood
I It must be programmed
I This is called “Full Information Maximum Likelihood” (FIML)
I We don’t do this
I It is pre-programmed in DCchoice
I for the same basic choices of distribution as for SBDC data
I Endogeneity issue
I The 2nd bid is not exogenous
I since it depends on the previous answer
I Thus it contains unobserved characteristics of the individuals
I Such unobservables also determine the 2nd choice
I This is in principle addressed by FIML
I A more general model is the bivariate probit
I allowing the 2 answers to have less than perfect correlation
I not covered by DCchoice

Estimation with DBDC data : Std normal cdf
I PYY = Pr{bU < WTP} = 1 − Φ(−α + βbU)
I PYN = Pr{b < WTP < bU} = Φ(−α + βbU) − Φ(−α + βb)
I PNY = Pr{bL < WTP < b} = Φ(−α + βb) − Φ(−α + βbL)
I PNN = Pr{WTP < bL} = Φ(−α + βbL)
I So : we estimate the same coefficients α and β as in SBDC
I but with more data, so it is more efficient
I assuming people answer both valuation questions in the same way
I So : the computation of the welfare measures is the same

Application : Exxon-Valdez


Context

I 1989, about 35 000 tons of crude oil spilled at sea
I Ended up spreading over about 26 000 km² at sea
I and soiling 1 600 km of coastline
I Most of the damage was in Prince William Sound and the Gulf of Alaska up to the Kodiak Islands
I Several types of damages
I Professional fishing (minimal)
I Tourism (possibly a benefit)
I Environmental heritage loss
I Punitive damages (supposed to be an incentive)
I Valuation survey

Exxon-Valdez Questionnaire



Exxon-Valdez Questionnaire
I Avoid willingness-to-accept Q :
I for assumed strategic behaviour
I The basic Q is
I Compensation for the loss of an environmental heritage during 10 years ?
I Scenario : after 10 years environmental damage will be fully recovered
I Convert such a “WTA” into a WTP to avoid the catastrophe happening again for 10 years
I Scenario : in 10 years’ time, a similar catastrophe will be impossible due to double hulls
I The scenario need not be true
I since we are investigating human preferences for things that may never happen
I However, it must appear credible to respondents


Exxon-Valdez Questionnaire : valuation scenario

Exxon-Valdez Questionnaire : first valuation Q

Exxon-Valdez Questionnaire : follow-up valuation Q



Reading the Exxon-Valdez Data
I data(“CarsonDB”)
I it is only a frequency table for the DBDC survey
I thus w/o the X data

   T1  TU  TL  yy  yn  ny  nn
1  10  30   5 119  59   8  78
2  30  60  10  69  69  31  98
3  60 120  30  54  75  25 101
4 120 250  60  35  53  30 139

I So there are 6 distinct bids
I 5, 10, 30, 60, 120, 250
I There is always a large proportion of nn : partly protest answers
I The proportion of yy decreases w/ b
I The proportions of yn & ny are roughly constant w/ b
I with about 2 to 3 times more yn than ny


Converting the Exxon-Valdez Data

I We need individual-level data
I 119+59+...+53+30 = 1043 (cf. nobs in RAE.R)
I Observations are created from the frequency table in 3 steps
1. create a new data frame db.data, filled w/ 0
I to save the reconstructed individual observations and to prepare indexes
2. then organize the 3 columns containing the first bids (bid1), and the increased and decreased second bids (bid2U and bid2L)
3. then fill in the answers to each bid corresponding to the numbers in the frequency table
I Follow the detailed code in RAE2017.R
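The three steps can be sketched in Python with pandas (illustrative only; the course code is in R, and the column names R1, R2, bid1, bid2 follow the dbchoice formula).

```python
# Sketch: expand the CarsonDB-style frequency table into individual
# observations, one row per respondent.
import pandas as pd

freq = pd.DataFrame({
    "T1": [10, 30, 60, 120], "TU": [30, 60, 120, 250], "TL": [5, 10, 30, 60],
    "yy": [119, 69, 54, 35], "yn": [59, 69, 75, 53],
    "ny": [8, 31, 25, 30],   "nn": [78, 98, 101, 139],
})

rows = []
for _, r in freq.iterrows():
    for outcome in ["yy", "yn", "ny", "nn"]:
        R1 = 1 if outcome[0] == "y" else 0      # answer to the first bid
        R2 = 1 if outcome[1] == "y" else 0      # answer to the second bid
        bid2 = r["TU"] if R1 == 1 else r["TL"]  # follow-up bid depends on R1
        rows.extend([{"R1": R1, "R2": R2,
                      "bid1": r["T1"], "bid2": bid2}] * int(r[outcome]))

db_data = pd.DataFrame(rows)
print(len(db_data))   # 1043 individual observations
```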


Estimating WTP with DBDC data

I dbchoice(formula, data, dist = "log-logistic", par = NULL)
I Usage similar to sbchoice
I except for formula & par
I formula : R1 + R2 ~ var | bid1 + bid2
I R1 + R2 : the 4 response outcomes in 2 dichotomous variables
I bid1 + bid2 : the 2 bids
I var : any number of covariates
I Unfortunately, we do not have any
I par : starting values, may be NULL
I There is no guarantee that the likelihood is unimodal
I Optimization may not converge
I Different starting values may lead to different optima
I Take the one w/ the higher likelihood


Estimating WTP with DBDC data

I See the code in RAE.R
I Estimation goes smoothly
I See “convergence TRUE”
I When this is not the case : the estimated values have no meaning
I You need to specify par = a vector
I Cf. the slide “Iterations and Convergence”
I The coef. of bid is negative & highly significant in all 4 models
I The same 4 measures of welfare are given
I E(WTP) → ∞ in the log-log model when |β| < 1
I The measure is for a one-time payment for 10 years
I As it turns out, the median is always more conservative (for probit-type models)


CI

I Use exactly the same cmds as for SB
I e.g. for the log-logistic model
I The CI for the (infinite) mean WTP is not defined
I Otherwise we see similar results for Krinsky & Robb as for Bootstrap
I approx ±5 around the central value
I e.g. for the 30.4 median : [26.5 - 35.5]


Exxon-Valdez : Non-use value
I The value presented to the Courts was for a median WTP of about 3.2 $/y per household for 10 years
I to avoid such accidents in the next 10 years
I That is 32 $ in total
I Since the ∆z here is the natural heritage of the whole US
I this value refers to the 90.9 million US households
I that is an aggregated median of $2 800 million ($2.8 billion)
I Interpretation : it is the amount that would obtain exactly 50% approval if there were a referendum
I It is not related to the cost of the hypothetical escort-ship program
I But it can be taken as the minimal compensation for the loss of natural heritage due to the spill
I In the end, Exxon and the governor of Alaska settled out of court for $1 billion


Exxon-Valdez : Remarks

I In US law, the ultimate responsibility for goods lies with the owner of these goods
I Otherwise, if it lay with the shipping company, it would be easy for goods owners to contract with insolvent firms
I and effectively escape responsibility
I Any tanker that calls in US territorial waters must have subscribed to an insurance
I with a $1 billion fund that can be seized by the authorities to pay any damages


Exxon-Valdez : Remarks

I This is part of the Oil Pollution Act of 1990
I following the Exxon Valdez spill
I "A company cannot ship oil into the United States until it presents a plan to prevent spills that may occur. It must also have a detailed containment and cleanup plan in case of an oil spill emergency."
I In Europe, similar ideas advance slowly
I following the Erika (1999) and Prestige (2002) wrecks

Research in Applied Econometrics Chapter 2. Contingent Valuation Econometrics: Questionnaire Issues and Biases


Context

I It is generally not easy to answer CV questions
I It is not usual to think of public/collective goods in terms of price or value
I How will my answer be used ? Could it commit me ?
I Often, a realistic context is described in the questionnaire
I A referendum (binding or consultative) on a local tax change
I A contribution to an association
I An entry fee on a site
I The payment vehicle is often associated with such a context
I If there is a payment, how would it be carried out ?
I Changes of prices, tax raises or similar fees, voluntary contributions...


Other Formats

I Open-ended : how much are you WTP ?
I Payment cards, showing several amounts
I “Bidding game”
I i.e. like an auction
I Psychologists are very critical of any valuation question
I They say that any amount stated by the researcher strongly anchors the respondents’ answers
I within the valuation Q or anywhere else in the questionnaire
I Yet it is an empirical regularity that the proportion of Yes decreases with the bid
I Not always very smoothly, but still

Answers that are not Preference-Revealing
I Strategic : by lying, can I get more than by telling the truth ?
I Open-ended Q and the average of individual WTP
I This is less of a concern with the dichotomous format
I Avoid willingness-to-accept Q as in Exxon-Valdez
I Symbolic
I The respondent does not correctly identify the ∆z of interest
I Some people always answer Yes for environmental causes, or No when the word tax appears
I Some respondents answer what they think everyone (or the government) should do
I Debriefing : a series of questions after the valuation question(s) to eliminate such answers

Don’t know / refuse to answer

I Strategic or symbolic answers are not obvious
I Don’t know / don’t answer are visible
I So actually, even a Yes/No Q always has 3 or 4 answers in practice
I Whether to distinguish “don’t know” from “refuse” or not
I One option to treat these answers is to remove them
I ok if they are not associated w/ a specific profile
I Preliminary dichotomous choice model
I Answer “Yes or No” vs. something else
I Or directly a multinomial model

Elements of the Questionnaire
I Trade-off between detailed description versus interview length & complexity
I Control the “size” (scope) effect
I “size” of the ∆z
I e.g. it must be that the WTP to save 300 birds is less than that for 3 000 birds
I Can be done by subsampling
I Control embedding
I Is ∆z valued for itself or taken as symbolic of a larger set ?
I e.g. protecting against another Exxon-Valdez at the same site or for the whole US
I This is often addressed by careful structure : broad Q first, then more and more precise
I Socio-demographic Q
I to acquire regressors for the valuation Q
I to allow inference to the population
I Interviewer effect

Bid Design

I Why should we use 4 or 6 bids ?
I Trade-off : the more bids,
I the more we know about the WTP curve
I but the less precise this knowledge,
I given the total size of the sample
I d-optimality seeks to minimize the asymptotic variances of the estimators
I c-optimality minimizes the confidence interval of the estimated WTP
I In the end, the literature settled for “sequential design”
1. start w/ a focus group and ask open Q
2. use these first guesses for a first round of (100 ?) questionnaires
3. estimate the model : can you identify the WTP ?
4. if not, adjust : higher (lower) bids if not enough “No” (“Yes”)