
Faculty Working Paper 92-0100


A Theory of Optimal Bank Size


Stefan Krasa

Department of Economics University of Illinois

Anne P. Villamil

Department of Economics

University of Illinois

Bureau of Economic and Business Research

College of Commerce and Business Administration

University of Illinois at Urbana-Champaign


January 1992


A Theory of Optimal Bank Size

Stefan Krasa Anne P. Villamil*

First Draft: May 1991 This Draft: December 1991

Abstract

This paper provides a theory of optimal bank size determination with implications for the size distribution of banks in a model with asymmetric information and costly state verification. Production is subject to minimum scale requirements and two types of risk: diversifiable idiosyncratic project risk, and imperfectly diversifiable aggregate "macroeconomic" risk. We first show that delegated monitoring with two-sided simple debt contracts dominates direct investment if the cost of monitoring the intermediary is bounded and if the variance of the non-diversifiable macroeconomic risk is sufficiently small. We next show that: (i) banks are of finite size; (ii) bank size is inversely related to the bank's exposure to macroeconomic risk; and (iii) multiple banks co-exist with the same size within a locale but with (possibly) different sizes across locales.

*Address of the authors: Department of Economics, University of Illinois, 1206 South Sixth Street, Champaign, IL 61820.

We gratefully acknowledge useful comments from Anthony Courakis and financial support from the National Science Foundation (SES 89-09242).

1 Introduction

Recent research has studied how the structure of the financial system affects the transmission of business cycle shocks in an economy. See Gertler (1988) for an excellent survey of this emerging literature. An equally important but quite different problem is the following: How do business cycle fluctuations affect the structure of the financial system? This problem is important because all developed countries are subject to regular and recurrent business cycle fluctuations. Further, banks in most countries operate under restrictive regulations which limit their ability to insure fully against macroeconomic risk. For example, in the U.S. there are branching restrictions which limit the geographic operation of banks and portfolio restrictions which limit the types of assets that banks may hold (e.g., Savings and Loan Institutions have been subject to such restrictions). It is obvious that these restrictions can lead to non-diversifiable portfolio risk, but little is known about the implications for financial structure. This problem is of particular interest at the present time. The European Community is currently involved in a transition toward an economic and monetary union (EMU). To date, most discussions of the EMU have focused on the creation of a European central bank and its supervision of a single currency. However, once an administrative structure is in place, it will undoubtedly impose "pan European" restrictions. What are the likely implications of such restrictions for the future European banking system? This paper proposes a theoretical model that can be used to analyze this important policy question.

We propose a theory of optimal bank size determination which has implications for the size distribution of banks. We consider a costly state verification model with finitely many borrowers and lenders where production is subject to: (i) minimum project scale requirements, (ii) diversifiable idiosyncratic project risk, and (iii) a non-diversifiable aggregate (i.e., macroeconomic) risk. Agents can undertake production in two ways. Borrowers and lenders can write direct bilateral investment contracts, or they can engage in intermediated investment by contracting with a bank that accepts deposits from lenders and grants loans to borrowers. Because there is non-trivial default risk in the economy, the lenders must monitor either the borrowers (in the direct investment problem) or the bank (in the intermediated investment problem) in default states which occur with strictly positive probability.

The minimum project scale requirement implies that it takes multiple lenders to finance the project of a single borrower. It is well known (cf., Diamond (1984) or Williamson (1986)) that "delegated monitoring" may be optimal under this requirement because it allows lenders to economize on monitoring costs. However, in an economy with non-trivial default risk lenders must "monitor the monitor" (i.e., bank) because the bank may misreport the state to minimize its payments to lenders. Thus, the essential problem that lenders face is to provide the bank with an incentive to report truthfully to them. Krasa and Villamil (1991a) study this problem in a finite economy that is subject to diversifiable default risk. They show that a particular type of contract commonly issued by banks (i.e., two-sided simple debt) solves this monitoring problem optimally. In contrast, in this paper we are concerned with the optimal investment arrangement (given two-sided simple debt contracts) when there is both diversifiable project risk and non-diversifiable aggregate or "macroeconomic" risk. When a bank is subject to non-diversifiable project (and hence default) risk, costly monitoring will necessarily occur in some states, regardless of the bank's size.

In choosing an optimal portfolio size (i.e., scale of operation) a bank faces the following tradeoff for most monitoring cost structures. Increasing the size of the bank's portfolio (i.e., contracting with additional borrowers) given some initial bank size generally decreases the bank's default probability, but increases the lenders' cost of monitoring the bank. Thus, the crucial question that the bank faces in choosing an optimal portfolio size is: Under what circumstances do the gains from decreased default risk dominate the losses from increased monitoring costs when the bank compares its current scale of operation with an increased scale of operation? We begin our analysis of this question with Theorem 1 which shows that delegated monitoring with two-sided simple debt contracts (i.e., intermediated investment) dominates direct investment if the lenders' cost of monitoring the intermediary is bounded and the variance of the non-diversifiable macroeconomic risk is sufficiently small. We interpret the variance of the macroeconomic risk as the magnitude of business cycle fluctuations. For the U.S., Prescott (1986) reports that the standard deviation of output for the period 1872 to 1985 is only 1.8 percent.

Theorem 2 provides the main result of the paper: A theory of optimal bank size with implications for the size distribution of banks. To our knowledge, this has been a neglected branch of the literature on financial intermediation. Of course, the problem of firm size distribution has been studied extensively in industrial organization theory. Panzar (1989, p. 33) summarizes the findings from this literature by noting that firm size "is determined in large part by the . . . cost function," while industry structure (i.e., limits on the number and size distribution of firms that are present in equilibrium) "is determined by the market demand curve." Our Theorem 1 implies that this traditional industrial organization analysis is inadequate for a theory of financial structure. In Theorem 2 we develop a measure of the rate of portfolio diversification accruing from increases in bank size, and then show that both risk and cost considerations are essential determinants of bank size. The theory has three predictions that in principle are empirically testable. First, banks will be of finite size with the precise scale dependent upon the structure of monitoring costs and the degree of portfolio diversification that the bank can attain. Second, banks that are better able to diversify risk (e.g., because they are subject to less stringent portfolio restrictions) will be larger in size than banks which are less able to diversify risk. Third, multiple banks with similar risk and cost characteristics may co-exist. The first prediction pertains to firm size, while the latter two are industry structure predictions. Clearly, firm size and industry structure are related; however, we shall return to a more precise discussion of this relationship in Section 6.

2 The Model

Consider an economy with finite numbers of two types of risk-neutral agents, borrowers and lenders. Each borrower $i = 1, \dots, \bar n$ is endowed with a risky investment project which transforms one unit of a single input at time zero into $x_i$ units of output at time one, where $x_i$ is the realization of a random variable $X_i$ on the probability space $(\Omega, \mathcal{A}, P)$.1 For simplicity assume that borrowers have zero endowment of the input. Every lender $j = 1, \dots, \bar m$ is endowed with $\alpha < 1$ units of a homogeneous input,2 but has no direct access to a productive technology. Thus, the project of a borrower cannot be financed by a single lender, which implies that in the absence of intermediation more than one lender would have to verify a single borrower. The total available supply of investment is larger than the input required by all borrowers, so not all $\bar m$ lenders can be accommodated by the $\bar n$ borrowers (i.e., $\bar m \alpha > \bar n$).

1 We will implicitly refer to this probability space when writing $P$ for probability and $E$ for expected value.

2 For technical purposes assume that $1/\alpha$ is an integer.

Also, assume there is a riskless alternative investment project available to all lenders that yields return r with probability one.

All borrowers and lenders are fully informed about the distribution of $X_i$ at time zero, but asymmetric information exists about the state of the project's actual realization ex-post: Only borrower $i$ costlessly observes the realization $x_i$ of his/her project at time one. Let $F_i(x)$ denote the distribution of borrower $i$'s project and assume that the $F_i$ are identical (i.e., all $X_i$ have the same distribution).3 We now depart from the standard intermediation framework by considering an economy with non-trivial correlation among the $X_i$.4 Assume that each $X_i$ can be decomposed into independent random variables $Y_i$, $i = 1, \dots, \bar n$, and $Z$, where $Y_i$ is an idiosyncratic risk associated with borrower $i$'s project and $Z$ is a non-diversifiable "macroeconomic" risk common to all borrowers.5 Thus,

$$X_i = Y_i + Z. \tag{1}$$

Assume that the distributions of $Y_i$ and $Z$ have continuous density functions, $X_i \ge 0$ for every $i$ (because borrowers can never produce "negative output" no matter how bad the macroeconomic shock), and that each $X_i$ is bounded from above.
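The decomposition in equation (1) can be illustrated numerically. The following is only a sketch under assumed distributions that the paper does not specify: the idiosyncratic risk is taken to be uniform on [1, 2] and the macroeconomic risk uniform on [-z_spread, z_spread], so that output is non-negative and bounded. The function names and parameter values are hypothetical. With these choices the population correlation between two projects is var(Z) / (var(Y) + var(Z)).

```python
import random

def simulate_outputs(n_borrowers, n_draws, z_spread, seed=0):
    """Draw outputs X_i = Y_i + Z for n_borrowers in each of n_draws states.

    Assumed (not from the paper): Y_i ~ Uniform(1, 2) and
    Z ~ Uniform(-z_spread, z_spread), independent of each other.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = rng.uniform(-z_spread, z_spread)  # common macroeconomic shock Z
        draws.append([rng.uniform(1.0, 2.0) + z for _ in range(n_borrowers)])
    return draws

def sample_correlation(a, b):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov / (var_a * var_b) ** 0.5

def pairwise_correlation(z_spread, n_draws=20000):
    """Correlation between the outputs of borrowers 1 and 2."""
    draws = simulate_outputs(2, n_draws, z_spread)
    return sample_correlation([d[0] for d in draws], [d[1] for d in draws])

if __name__ == "__main__":
    print("with macro risk   :", round(pairwise_correlation(0.5), 3))
    print("without macro risk:", round(pairwise_correlation(0.0), 3))
```

When z_spread is set to zero the projects are independent, which corresponds to the standard intermediation framework without aggregate risk.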

Let a technology exist which can be used by agents other than borrower $i$ to verify at time one the realization $x_i$ of project $X_i$. Assume that this state verification technology is costly to use, and that when verification occurs, $x_i$ is privately revealed only to the individual who requests (deterministic) verification. Assume that the verification cost is comprised of both a pecuniary component and an indirect "pecuniary equivalent" of a non-pecuniary cost.6 These costs may be thought of as the money paid to an attorney to file a claim (a pecuniary cost) and the monetary value of time lost when visiting the attorney (a pecuniary equivalent). Because agents have asymmetric information, a key problem is to ensure that the borrower reports the realization truthfully. We, like Williamson (1986), use the costly state verification framework to solve this problem. This model was introduced by Townsend (1979). However, unlike in Townsend's model where $x_i$ is publicly announced after verification occurs, in our model $x_i$ is privately revealed only to the agent who requests verification.7 This assumption is essential for our analysis since if all information could be made public ex-post, there would be no need to verify the bank in default states. However, it also appears to accurately describe the privacy and institutional features which characterize most lending arrangements. For example, Diamond (1984, p. 395) observes: "Financial intermediaries in the world monitor much information about their borrowers in enforcing loan covenants, but typically do not directly announce this information or serve an auditor's function."

3 This assumption simplifies the analysis but is not essential for the results.

4 See Dowd (1991) for an excellent survey of the literature on financial intermediation.

5 To simplify the analysis we make the following technical assumption: Let $\Omega = \Omega_1 \times \Omega_2$ and let $P = P_1 \times P_2$, where $P_i$ is a probability on $\Omega_i$ for $i = 1, 2$. Assume that the random variables $Y_i$ are independent of $\Omega_2$, i.e., for every $\omega_1 \in \Omega_1$ the mapping $\omega_2 \mapsto Y_i(\omega_1, \omega_2)$ is constant on $\Omega_2$. Similarly, assume that $\omega_1 \mapsto Z(\omega_1, \omega_2)$ is constant on $\Omega_1$. This condition implies independence of $Z$ and $Y_i$ for every $i \in \mathbb{N}$, but is stronger than independence. In our analysis, standard independence would require conditions on $\Omega$ which imply the existence of a regular conditional probability $P(\cdot \mid Z)$ (cf., Parthasarathy (1977, Proposition 46.5)).

6 The non-pecuniary costs permit negative utility but rule out negative consumption, while pecuniary equivalents of non-pecuniary costs ensure that the costs can be shared by the contracting parties.

7 See Krasa and Villamil (1991b) for an analysis of an economy with multiple heterogeneous agents, costly state verification, public announcement, and deterministic or stochastic verification.

3 Contract Arrangements

There are two types of basic investment arrangements, notably those involving direct contracts between "primary" borrowers and lenders, and those involving an intermediary. In Section 3.1 we consider the direct investment problem. In Section 3.2 we consider intermediated investment where lenders and borrowers write contracts with a bank, and the bank is subject to non-trivial default risk.

3.1 Direct Investment

Let all direct, bilateral interactions between lenders and borrowers be regulated by a contract whose general form is defined as follows.

Definition 1. A one-sided contract between lenders and borrowers is a pair $(R(\cdot), S)$, where $R(\cdot)$ is an integrable, positive payment function on $\mathbb{R}_+$, such that $R(x) \le x$ for every $x \in \mathbb{R}_+$ and $S$ is an open subset of $\mathbb{R}_+$ which determines the states where monitoring occurs.

The contract $(R(\cdot), S)$ describes the total claims against the borrower by all lenders. If a lender invests $b$ units of capital in a borrower's project, then his/her claim against the borrower is given by $bR(x)$, where $x$ is the borrower's announced wealth realization. Following standard practice in this literature, we restrict the universe of contracts to the set of incentive-compatible contracts and denote this set by $C = (R(\cdot), S)$. Consequently, the realization announced by each borrower is the true realization. The following condition ensures that all contracts under consideration satisfy this restriction: There exists $\bar R \in \mathbb{R}_+$ such that $S = \{x : R(x) < \bar R\}$. The imposition of this restriction is without loss of generality because the Revelation Principle establishes that any arbitrary contract can be replaced by an incentive-compatible contract with the same actual payoff (cf., Townsend (1988, p. 416)). Therefore, the set of all incentive-compatible contracts is fully specified by the tuple $(R(\cdot), \bar R)$.

We study a particular type of contract, called a simple debt contract, which is defined as follows:

Definition 2. $(R(\cdot), \bar R)$ is a simple debt contract if: $R(x) = x$ for $x \in S = \{x < \bar R\}$ and $R(x) = \bar R$ if $x \in S^c = \{x \ge \bar R\}$.

The payment schedules in Definition 2 resemble simple debt because:

(i) When verification occurs the payment to the lender is state contingent (i.e., the borrower pays the entire realization for all outcomes below a cutoff level), where the verification set $S$ is viewed as the set of bankruptcy states.

(ii) When verification does not occur the payment to the lender is constant (i.e., the borrower pays a fixed amount $\bar R$ for all realizations of the state above the cutoff), where $S^c$ is the set of all realizations where verification does not occur.
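The payment rule of Definition 2 can be written out as a short sketch. The function names and the sample face value below are hypothetical and chosen only for illustration; the rule itself (pay the whole realization below the face value, pay the face value otherwise) is the one stated in the definition.

```python
def simple_debt_payment(x, face_value):
    """Payment R(x): the full realization in default, the face value otherwise."""
    return x if x < face_value else face_value

def is_verified(x, face_value):
    """True when x lies in the verification (bankruptcy) set S = {x < face_value}."""
    return x < face_value

if __name__ == "__main__":
    face_value = 1.0  # illustrative face value of the debt
    for x in (0.4, 1.0, 1.7):
        print(x, simple_debt_payment(x, face_value), is_verified(x, face_value))
```

Note that the payment never exceeds the realization, so the restriction that R(x) be no larger than x in Definition 1 is satisfied by construction.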

Townsend (1979) proved that debt contracts are optimal responses to asymmetric information problems in economies with deterministic costly state verification technologies because such contracts minimize verification costs. Agents verify only low realizations of $X_i$ and accept fixed payments (which do not require monitoring) in all other states. Gale and Hellwig (1985) and Williamson (1986) showed that simple debt is the optimal contract among all one-sided investment schemes. In contrast to debt contracts, where the borrower is the residual claimant in the default state (and hence may receive a non-zero payment if the bankruptcy is not "too severe"), a simple debt contract requires the borrower's entire project realization to be transferred to the lender in default states (i.e., see (i) above). This result will be useful in the analysis that follows, thus we state it formally.

Theorem GHW. Simple debt is the optimal contract among all one-sided investment schemes.

The strategy of the proof of Theorem GHW is as follows.8 Consider two optimal contracts. Let $(R(\cdot), \bar R)$ be a simple debt contract and $(A(\cdot), \bar A)$ be some alternative contract. Since both contracts are optimal, both must provide borrowers with the same expected payoff. Under the simple debt contract lenders request costly state verification if $x < \bar R$, and under the alternative contract lenders request verification if $x < \bar A$. Clearly $\bar A > \bar R$ (otherwise the contracts cannot have the same expected return to borrowers), thus the expected verification costs must be less for the simple debt contract.

In our economy with correlation among project realizations, the direct investment problem is identical to the investment problem in an economy without correlation among projects. This follows from the fact that when borrowers and lenders write direct bilateral contracts, every lender must verify the borrower with whom he/she contracts in default states. Hence, non-trivial correlation among projects is irrelevant (and simple debt contracts remain optimal). Note that even though lenders have the opportunity to invest in more than one project (i.e., contract with more than one borrower), it is not optimal for them to do so because they reap no gain from diversifying idiosyncratic risk while they incur higher expected monitoring costs. For example, suppose that an agent invests in two projects. Assume that the total outstanding debt of borrowers $i = 1, 2$ is given by the contract $(R_i(\cdot), \bar R_i)$, and let $\alpha_i$ be the capital the lender invests in each project. If neither project fails, then the payoff is simply given by $\alpha_1 \bar R_1 + \alpha_2 \bar R_2$. Thus, there is no gain in the good state from investing in multiple projects. However, the probability that at least one of the two projects fails is strictly higher than the probability that only a single project fails.9
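The last claim can be checked with a one-line calculation. The sketch below assumes, purely for illustration, that project defaults are independent and share a common default probability; the function name is hypothetical. Under that assumption the probability that a lender must monitor at least once rises with the number of projects held.

```python
def prob_at_least_one_default(p_default, n_projects):
    """P(at least one of n independent projects defaults), each with probability p_default."""
    return 1.0 - (1.0 - p_default) ** n_projects

if __name__ == "__main__":
    for n_projects in (1, 2, 3):
        print(n_projects, prob_at_least_one_default(0.5, n_projects))
```

With a default probability of one half, one project gives 0.5 and two projects give 0.75.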

8See Gale and Hellwig (1985) or Williamson (1986) for a formal proof.

9 For example, consider the random event to be the toss of a fair coin, where "head" is non-default, and "tail" is a default. Clearly, for a single coin toss the probability of a default is 0.5. If the agent "invests" in two coin tosses, the probability that at least one of the two projects fails is 0.75. Thus expected monitoring costs will be higher. This is true even if there is some correlation among projects, if idiosyncratic risk is non-trivial. If the idiosyncratic risk is trivial (i.e., $Y_i = 0$ so only macroeconomic risk matters), then every agent could monitor only a single project (since the realization of all projects can be determined by the outcome of any one project) and the expected payoff from investing in one project or many projects is the same.

The direct investment problem between a borrower and lenders can now be stated. Let $c$ denote the lenders' cost of monitoring a borrower.

Problem 3.1. Choose an incentive-compatible contract $(R(\cdot), S)$ to:

$$\max \int_0^T [x - R(x)]\,dF(x)$$

subject to:

$$\alpha \int_0^T R(x)\,dF(x) - \int_S c\,dF(x) \ge r\alpha. \tag{2}$$

In Problem 3.1 (the direct investment problem), the expected utility of a representative borrower is maximized subject to a constraint that the lenders' expected return, net of monitoring costs ($c$), be at least as great as some reservation level ($r$). The first term in the lenders' constraint is multiplied by $\alpha$ in order to account for each individual lender's capital investment $\alpha$. Without loss of generality we assume that each lender invests all of his/her endowment in a single project. Finally, Problem 3.1 reflects the assumption that credit markets are competitive. There are more lenders who wish to invest than investment opportunities. Thus, the supply of loans is inelastic, and the level of return necessary to attract lenders is driven down to the reservation level $r$, the return available on the alternative investment opportunity.

3.2 Intermediated Investment

Now consider an intermediated borrowing and lending problem. In the previous section (i.e., the one-sided problem), lenders and borrowers wrote direct bilateral contracts and correlation among projects (as long as it was not trivial) was irrelevant. However, duplicative monitoring is inherent in the direct investment problem because each lender must verify each borrower with whom he/she contracts in certain states of nature. Thus, there may exist gains from "delegated monitoring" (cf., Diamond (1984)), where lenders elect a monitor to perform the verification task and thereby eliminate some of the duplicative monitoring associated with direct investment. In contrast to previous delegated monitoring studies, our economy has an important feature which significantly complicates the "standard" delegated monitoring problem. The intermediary faces non-trivial default risk for two reasons:

(i) Since there are only finitely many borrowers, it is not clear that the intermediary can completely diversify idiosyncratic risk.

(ii) Even if the intermediary can eliminate idiosyncratic risk, its portfolio is still subject to non-diversifiable "macroeconomic" risk.

Thus, in our economy the probability that the bank may default is non-zero (because at least the macroeconomic risk is non-diversifiable), so lenders must verify the bank with strictly positive probability (i.e., in some states).

We begin our analysis of the delegated monitoring problem by considering how agents select an intermediary. Since the loan market is competitive, any lender who wishes to act as an intermediary must offer contracts which maximize the expected utility of the borrowers and assure the remaining lenders of at least the reservation level of utility ($r$), which is determined by the riskless rate of return on the alternative project. Otherwise, agents would trade directly or another intermediary would offer an alternative contract (i.e., there is free entry into intermediation) with terms that are preferable to the $n$ borrowers and/or the remaining $m - 1$ lenders. Let $(R(\cdot), S)$ denote aspects of the two-sided contract which pertain to the borrower-intermediary relationship and $(R^*(\cdot), S^*)$ denote aspects of the two-sided contract which pertain to the intermediary-lender relationship.10 The intermediary's problem clearly embodies optimization by all agents in the economy.

10 $(R(\cdot), S)$ is also used in the direct investment problem in Section 3.1. We do not introduce additional notation in this Section because the structure of the problem is the same regardless of whether borrowers report to the lenders or to the bank.

We next derive random variables which describe the income from the intermediary's portfolio. Recall that $R_i(x)$ denotes the payoff by borrower $i$ to the intermediary if output $x$ is realized, $X_i$ is the random variable which describes the output $x$ of a particular borrower $i$ in state $\omega$, and $X_i = Y_i + Z$ from equation (1), where the $Y_i$ are independent random variables but the $X_i$ are not independent for $Z \ne 0$. The intermediary's income from borrower $i$, given transfer $R(\cdot)$, can now be defined by

$$G_i(R(\cdot); \omega) = R(X_i(\omega)), \tag{3}$$

where $\omega \in \Omega$ denotes the state of nature. Because the $X_i$ are not independent, it follows that in general the random variables $G_i$ are not independent. If the intermediary contracts with $i = 1, 2, \dots, n$ borrowers,11 its average income per borrower under payment schedule $R(\cdot)$ is: $G_n(R(\cdot); \omega) = \frac{1}{n} \sum_{i=1}^{n} G_i(R(\cdot); \omega)$. Denote the distribution function of $G_n(\cdot)$ by $F_n(\cdot)$.
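The behavior of the average income per borrower can be illustrated by simulation. The sketch below is not taken from the paper: it assumes a simple debt payment schedule with a hypothetical face value of 1.5, idiosyncratic risk uniform on [1, 2], and macroeconomic risk uniform on [-z_spread, z_spread]. Under these assumptions, the dispersion of average income falls as the number of borrowers grows but does not vanish while the common shock is present.

```python
import random
import statistics

FACE_LOAN = 1.5  # hypothetical face value of each borrower's debt

def average_income_draws(n, z_spread, draws=2000, seed=0):
    """Simulate the bank's average income per borrower over many states.

    Assumed (not from the paper): Y_i ~ Uniform(1, 2),
    Z ~ Uniform(-z_spread, z_spread), payment schedule R(x) = min(x, FACE_LOAN).
    """
    rng = random.Random(seed)
    incomes = []
    for _ in range(draws):
        z = rng.uniform(-z_spread, z_spread)  # common macroeconomic shock
        total = 0.0
        for _ in range(n):
            x = rng.uniform(1.0, 2.0) + z     # output X_i = Y_i + Z
            total += min(x, FACE_LOAN)        # income G_i = R(X_i)
        incomes.append(total / n)
    return incomes

def income_spread(n, z_spread):
    """Standard deviation of the simulated average income per borrower."""
    return statistics.pstdev(average_income_draws(n, z_spread))

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(income_spread(n, 0.2), 4), round(income_spread(n, 0.0), 4))
```

The second printed column (no macroeconomic risk) shrinks toward zero with portfolio size, while the first levels off at the dispersion induced by the common shock alone.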

The two-sided contract between the intermediary and each borrower, and the intermediary and the lenders, can now be defined.

Definition 3. A two-sided contract is a four-tuple $((R(\cdot), S), (R^*(\cdot), S^*))$ with the following properties:

(i) $R(\cdot)$ is an integrable positive payment function from a borrower to the intermediary such that $R(x) \le x$ for every $x \in \mathbb{R}_+$, and $S$ is an open subset of $\mathbb{R}_+$ which determines the set of all realizations of a borrower's project where the intermediary must monitor;

(ii) $R^*(\cdot)$ is an integrable positive payment function from the intermediary to the lenders such that $R^*(x) \le x$. For every realization $x$ of $G_n(\cdot)$, the payment to an individual lender is given by $\frac{n}{m-1} R^*(x)$;12 and $S^*$ is an open subset of $\mathbb{R}_+$ which determines the set of all realizations of the intermediary's income from the borrowers the lenders must verify.

11 Note that $n$ need not equal $\bar n$. In fact, this paper shows that in general it will not be optimal for a bank to become as large as possible.

12 $R^*(\cdot)$ is the total payment by the intermediary to lenders per borrower. Since the intermediary has a positive initial endowment, $m - 1$ lenders are sufficient to finance the $n$ projects. Thus, to derive the payment to an individual lender, multiply this amount by $\frac{n}{m-1}$.

We now derive the set of all incentive-compatible two-sided contracts. Each borrower will announce an output which minimizes its payment obligations to the intermediary. Let $\bar x = \arg\min_{x \in S^c} R(x)$ be the output that minimizes this payoff over all non-monitoring states, and recall that $x$ is observed directly in the monitoring states $S$. Consequently, the announcement by a borrower is given by $\arg\min_{\tilde x \in \{x, \bar x\}} R(\tilde x)$. A similar condition holds for the intermediary-lender portion of the contract (i.e., $(R^*(\cdot), \bar R^*)$). As in the one-sided problem, the following condition ensures that all contracts are incentive-compatible. There exist $\bar R, \bar R^* \in \mathbb{R}_+$ such that $S = \{x : R(x) < \bar R\}$ and $S^* = \{x : R^*(x) < \bar R^*\}$. The set of all incentive-compatible two-sided contracts is fully specified by the four-tuple $((R(\cdot), \bar R), (R^*(\cdot), \bar R^*))$. A two-sided simple debt contract is then defined as follows:

Definition 4. A contract $((R(\cdot), \bar R), (R^*(\cdot), \bar R^*))$ is a two-sided simple debt contract if:

(i) $R(x) = x$ for $x \in S = \{x < \bar R\}$ and $R(x) = \bar R$ if $x \in S^c = \{x \ge \bar R\}$; and

(ii) $R^*(x) = x$ for $x \in S^* = \{x < \bar R^*\}$ and $R^*(x) = \bar R^*$ if $x \in S^{*c} = \{x \ge \bar R^*\}$.

We will often denote two-sided simple debt contracts by $(\bar R, \bar R^*)$.

The intermediary's two-sided optimization problem can now be stated. Let $c$ denote the intermediary's cost of monitoring the borrowers, and let $c^*$ denote the lenders' cost of monitoring the intermediary. In Section 4 these monitoring costs shall be discussed in more detail.

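Problem 3.2. Choose incentive-compatible contracts $(R(\cdot), \bar R), (R^*(\cdot), \bar R^*)$ to:

$$\max \int_0^T [x - R(x)]\,dF(x)$$

subject to:

$$\frac{n}{m-1} \int_0^T R^*(x)\,dF_n(R(\cdot), \bar R)(x) - \int_{S^*} c^*\,dF_n(R(\cdot), \bar R)(x) \ge r \tag{4}$$

$$\int_0^T [x - R^*(x)]\,dF_n(R(\cdot), \bar R)(x) - \int_S c\,dF(x) \ge r. \tag{5}$$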

Problem 3.2 states that the intermediary maximizes the expected utility of each ex-ante identical borrower subject to two constraints. (4) states that the expected payoff to the $m - 1$ remaining lenders (i.e., those who did not become intermediaries) must be at least $r$, the level of utility available from the alternative project. (5) states that the profit from intermediation (i.e., net payoffs from the borrowers less the payoff to the lenders) must also be at least $r$. Note that the bank's decision variables are the loan contract $R(\cdot)$, the deposit contract $R^*(\cdot)$, and the number of projects $n$. The number of lenders is determined by the choice of $n$.

4 Optimal Investment Arrangements

The structure of the optimal investment arrangement will depend crucially on the nature of the monitoring costs, $c$ and $c^*$, because default risk is non-trivial and monitoring will occur with positive probability. We now proceed to prove Theorem 1 which establishes that delegated monitoring is optimal when the lenders' costs of monitoring the intermediary are bounded and the variance of the non-diversifiable macroeconomic risk is sufficiently small. The proof of Theorem 1 depends on continuity of the constraints of Problem 3.2 in the face values $\bar R$ and $\bar R^*$ of the two-sided debt contract, which follows from Lemma 1 in Krasa and Villamil (1991a). The strategy of the proof of the Theorem is as follows. Let $\bar R$ be the simple debt contract which is optimal among all one-sided schemes described by Theorem GHW in Section 3.1. We show: (i) there exists an alternative two-sided debt contract $(\bar R, \bar R^*)$ such that (4) is satisfied and binding; (ii) (5) is fulfilled but does not bind under $(\bar R, \bar R^*)$; and (iii) by increasing the face value of the lenders' debt (say to $\bar B^* > \bar R^*$) the payoff to the lenders increases.13 Then by continuity of the constraints in the face value of the lenders' debt, a two-sided contract $(\bar R, \bar B^*)$ can be found such that both constraints are slack. Finally, by continuity of the constraints in $\bar R$, the face value of the borrowers' debt, $\bar R$, can be lowered, with both constraints still satisfied. Thus, the delegated monitor offers better contracts to agents than the best feasible direct investment contract, which proves the Theorem. The argument requires $n$, the number of borrowers, to be sufficiently large; a more precise characterization of $n$ shall be provided in Section 5.

Theorem 1. Assume that c* is bounded and that the variance of Z is sufficiently small. Then delegated monitoring with two-sided debt contracts dominates direct investment.

13 In general, the lenders' payoff does not increase monotonically with $\bar R^*$ because the probability that lenders must verify the intermediary is an increasing function of $\bar R^*$. This is also true for one-sided schemes (cf., Gale and Hellwig (1985, p. 662)).

Proof. Consider first the investors' cost of monitoring the bank. Recall that the bank faces two types of risk: a diversifiable, project-specific risk $Y_i$, and a non-diversifiable macroeconomic risk $Z$. Thus, the bank's default probability will in general not converge to zero (even if it contracts with a large number of borrowers). By Lemma 1 of the Appendix, the average payoff from borrowers to the intermediary converges to the expected conditional return $E[R(X_1) \mid Z]$:14

±JTR{Yi + Z)-*E[R{Y1 + Z)\Z}, (6)

as n — » oo. Further, by Lemma 2 in the Appendix, the probability that the return from borrowers is less than the lenders' fixed payment converges to the probability that the expected return E[R(X\) | Z] is less than the face value of the lenders' debt:

P ({^ £ R(yi + Z)< R'}) - P {{EiR(yi +Z)\Z]< R')}) ■ (7)

Now choose z{R') such that E[R{YX + z(Rm))] = R\ Note that z(R*) is independent of the distribution of Z. Furthermore,

E [R{Yx + z(R*)j\ = E [R{Yl + Z)\Z = z(BT)\ .15

Note that $z(\cdot)$ is a "cutoff value" in the distribution of $Z$ which separates solvency states from insolvency states in the limit. Since $R(\cdot)$ is monotonic, the right-hand side of (7) equals $P(\{Z < z(R^*)\})$, which we shall show is the bank's default probability in the limit.

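Next, consider the bank's payoff to a lender when its portfolio is large. From (6) and from continuity of $R^*(\cdot)$ it follows that

\[ R^*\Big(\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z)\Big) \to R^*\big(E[R(Y_1 + Z) \mid Z]\big) \]

as $n \to \infty$. Lebesgue's dominated convergence theorem therefore implies

\[ \lim_{n\to\infty} \int R^*\Big(\frac{1}{n}\sum_{i=1}^{n} R(Y_i(\omega) + Z(\omega))\Big)\,dP(\omega) = \int R^*\big(E[R(Y_1 + Z) \mid Z](\omega)\big)\,dP(\omega). \tag{8} \]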

14For all random variables $X$, $Y$ denote by $E[X \mid Y]$ the conditional expectation of $X$ with respect to $Y$ (which is a random variable, measurable with respect to the information contained in $Y$). In particular, let $\omega \in \Omega$ be an elementary event for which $Y(\omega) = y$. Then $E[X \mid Y = y] = E[X \mid Y](\omega)$. If $X$ and $Y$ have only a countable number of different values then this corresponds to the elementary definition of a conditional expectation.

15This follows from our strong independence assumption. See footnote 5 and the proof of Lemma 1.


Substituting the distribution of $\frac{1}{n}\sum_{i=1}^{n} R(Y_i(\omega) + Z(\omega))$ and the distribution of $Z$ for $P$ in (8) yields

\[ \lim_{n\to\infty} \int R^*(x)\,dF_n(R(\cdot))(x) = \int R^*\big(E[R(Y_1 + Z) \mid Z = z]\big)\,dH(z), \tag{9} \]

where $H$ denotes the distribution of $Z$. For example, if there is no macroeconomic risk (i.e., $Z = 0$), the right-hand side of (9) is given by $R^*(E[R(X_1)])$, so when $R^* < E[R(X_1)]$ the expected return in the limit is given by $R^*$ and the lenders receive the face value of the debt with certainty.

A two-sided contract which dominates the one-sided, direct investment contract can now be constructed. Let $\varepsilon > 0$ be some arbitrary constant. First, choose $B^*$ such that $r < B^* < E[R(X_1)]$. Then (7) implies that the bank's default probability is less than $\varepsilon$ for large $n$ if $P(\{Z < z(B^*)\}) < \varepsilon$, i.e., if the probability that the realization is in the "tail" of the distribution of $Z$ is sufficiently small. Note that $z(B^*) < 0$.16 Thus, there exists a $\delta > 0$ such that whenever $\mathrm{var}(Z) < \delta$ we get $P(\{Z < z(B^*)\}) < \varepsilon$.17 Thus, the lenders' expected costs of monitoring the bank are bounded above by $\varepsilon c^*$ for large $n$. For similar reasons, the lenders' payoff is bounded from below by $B^*(1 - \varepsilon)$ for sufficiently large $n$.
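The role of the cutoff $z(\cdot)$ and of the variance condition can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the paper: $Y$ is uniform on $[0,1]$, $Z$ is normal with mean zero, the borrowers' contract is $R(x) = \min(\max(x,0), \bar R)$, and the face values `borrower_face` and `lender_face` are purely illustrative. It solves $E[R(Y_1 + z)] = B^*$ for $z$ by bisection and then evaluates the limiting default probability $P(\{Z < z(B^*)\})$ for several standard deviations of $Z$.

```python
import math

def debt(x, face):
    """Simple debt contract (assumed form): pay the face value if possible, else everything."""
    return min(max(x, 0.0), face)

def expected_repayment(z, face, grid=2000):
    """E[R(Y + z)] for Y ~ Uniform(0, 1) (an assumption), by the midpoint rule."""
    return sum(debt((i + 0.5) / grid + z, face) for i in range(grid)) / grid

def cutoff(lender_face, borrower_face, lo=-1.0, hi=1.0, tol=1e-9):
    """Bisection for the z solving E[R(Y_1 + z)] = lender_face."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_repayment(mid, borrower_face) < lender_face:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_cdf(x, sigma):
    """P(Z < x) for Z ~ Normal(0, sigma^2) (an assumption)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

borrower_face = 0.8  # hypothetical face value of the borrowers' debt
lender_face = 0.4    # hypothetical face value of the lenders' debt
z_star = cutoff(lender_face, borrower_face)

# The cutoff does not depend on the distribution of Z, so shrinking the
# variance of Z shrinks the tail probability P(Z < z_star).
default_probs = {sigma: normal_cdf(z_star, sigma) for sigma in (0.2, 0.1, 0.05)}
```

In this sketch the cutoff is negative (a recession), and the limiting default probability falls as the standard deviation of $Z$ falls, which is the mechanism behind the choice of $\delta$ in the proof.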

If $\varepsilon$ is sufficiently small, (7) and (9) imply that constraint (4) is fulfilled, but does not bind, for the two-sided contract $(R, B^*)$. By continuity of (4) with respect to $B^*$ (see Krasa and Villamil (1991, Lemma 1)), there exists a face value $R^* < B^*$ such that (4) binds for the two-sided contract $(R, R^*)$. We next show that (5) is fulfilled under contract $(R, R^*)$, but does not bind. Recall that the bank's default probability is less than $\varepsilon$. Thus, $\int_{S^*} c^*\,dF_n < \varepsilon c^*$ for the contract with face value $B^*$. Since $R^* < B^*$, the bank's default probability is lower with $R^*$. Thus, by choosing $\varepsilon$ sufficiently small and since $c^*$ is bounded, we can ensure that $\int_S c\,dF > \int_{S^*} c^*\,dF_n$ for all sufficiently large $n$.18 This and the fact that (4) binds imply

\[ \int R^*(x)\,dF_n < (m-1)\Big(r + \int_S c\,dF\Big). \tag{10} \]

16We normalize the mean of the macroeconomic shock $Z$ to zero, so $z(B^*) < 0$ denotes a recession.

17This is possible since $z(B^*)$ is independent of the distribution of $Z$.

18The inequality indicates that the intermediary's expected cost of monitoring the bor- rowers is higher than the lenders' expected cost of monitoring the intermediary.


Consequently,

\[ \int [x - R^*(x)]\,dF_n - \int_S c\,dF > n E[R(X_1)] - m\int_S c\,dF - (m-1)r > mr - (m-1)r = r. \]

The first inequality follows from (10) and from the fact that $\int x\,dF_n = E[G_n] = E[R(X_1)]$, which is the expected value of an aggregate version of equation (3). The second inequality follows because $R$ must fulfill (2) by assumption. Now increase $R^*$ slightly. Then by the continuity of the constraint and by the construction of $R^*$ the lenders' payoff increases and thus both constraints can be made slack. This proves Theorem 1 because there exists some surplus that can be redistributed to borrowers by lowering the face value of their debt $R$.

Theorem 1 establishes optimality of delegated monitoring schemes if n is sufficiently large and if the variance of the macroeconomic shock is sufficiently small. However, it does not follow that it is optimal for the bank to be as large as possible. Indeed, Theorem 1 suggests that an optimal bank size may exist because as the bank increases its portfolio size there are gains from default risk reduction but losses from increased monitoring costs. In Theorem 2 we characterize these gains and losses more precisely. However, before doing so we first relate Theorem 1 to the previous literature on delegated monitoring. Specifically, we focus on the bank size and industry structure predictions implicit in previous models.

Diamond (1984) and Williamson (1986) use a law of large numbers argument to prove the optimality of delegated monitoring (i.e., financial intermediation) in an economy with bounded costs and no macroeconomic risk. Because the probability that a bank fails is zero in the limit in their models, the lenders' expected costs of monitoring the bank are zero. Clearly, a bank that operates in such an environment can always reduce the expected monitoring costs borne by lenders by increasing its size. Thus, "big banks are always better," and the model predicts banks of large but indeterminate size.19 This size prediction is implicit in the Diamond and Williamson models, and stems from the fact that increasing returns to scale are inherent in

19Because the set of borrowers is infinite, it is possible to get multiple banks that are indeterminately large in this framework. However, the argument requires that the infinite


the framework they consider. Specifically, in their models a bank can always both decrease the riskiness of its portfolio and reduce monitoring costs by contracting with additional borrowers. An obvious question, therefore, is: Does delegated monitoring in an economy with non-diversifiable portfolio risk also give rise to increasing returns to scale in intermediation, and hence indefinitely large banks? The answer to this question depends on the specification of monitoring costs.

Consider first a "best case" situation where lenders face a fixed cost of monitoring a bank in default states. Specifically, let $c^* = k$, where $k$ is a positive constant which is independent of the size of the bank. In this case, (7) from Theorem 1 implies that the bank's default probability converges to $P(\{Z < z\})$. This follows from the fact that a bank's default probability decreases (in general) as it contracts with more borrowers because idiosyncratic risk is diversified away. The non-diversifiable macroeconomic risk obviously remains.20 To make this argument more precise, let $p_{\bar n}$ denote the bank's default probability when it has a portfolio of size $\bar n$, where $\bar n < \infty$. The lenders' expected costs of monitoring the bank under this cost structure are $p_{\bar n} c^*_{\bar n} = p_{\bar n} k$. Since the bank's default probability when its portfolio size is $\bar n$ is at least as great as its default probability in the limit (i.e., $p_{\bar n} \ge \lim_{n\to\infty} p_n$), the lenders' expected costs of monitoring the intermediary are lower for larger banks. It follows from this observation that the delegated monitoring model with non-diversifiable portfolio risk and constant monitoring costs will generally also display increasing returns to scale in intermediation (because increasing the bank's portfolio size does not raise monitoring costs but it may lower the bank's default probability). Consequently, as in Diamond and Williamson, the optimal bank size under constant monitoring costs is indeterminately large.

Now consider a polar opposite "worst case" situation, where the lenders'

set of borrowers be partitioned into an infinite number of subsets where each infinite subset of borrowers contracts with a particular delegated monitor. This argument does not appear to be a plausible explanation of the observed co-existence of multiple banks. Of course, the models were not designed to explain this observation.

20Convergence in equation (7) need not be monotonic. Thus, there may exist points of non-monotonic convergence where even under constant monitoring costs the optimal bank size is finite if the macroeconomic risk is non-trivial. We therefore assume without loss of generality that the bank's probability of default is always bounded from below by $P(\{Z < z\})$, which is the default probability of a bank of infinite size with macroeconomic risk but no idiosyncratic risk.


monitoring costs are unbounded. Krasa and Villamil (1991, Theorem 1) analyze this problem when there is no macroeconomic risk. They show that even if costs are unbounded but do not increase at an exponential rate, delegated monitoring with two-sided simple debt contracts still dominates direct investment. However, the non-trivial macroeconomic risk which we consider in this paper complicates this problem considerably. Specifically, in the limit the lenders' expected monitoring costs are given by $\lim_{n\to\infty} p_n c^*_n$. Clearly, if $c^*_n$ is unbounded this product diverges to infinity when the bank's portfolio is subject to non-diversifiable macroeconomic risk (because $\lim_{n\to\infty} p_n > 0$). Thus, constraint (4) from Problem 3.2 is violated for sufficiently large $n$, and this implies that delegated monitoring is not feasible with unbounded costs, non-diversifiable macroeconomic risk, and a sufficiently large portfolio size. This argument suggests that an optimal portfolio (or bank) size may exist because the feasibility of delegated monitoring depends on $n$.

Consider now the problem of whether or not a bank of a given size (n) should contract with additional borrowers, thus increasing its scale. Suppose that monitoring costs are bounded but not constant. Theorem 1 establishes that delegated monitoring is optimal if the variance of the macroeconomic risk is sufficiently small, but this does not imply that the bank should be as large as possible. When there is non-trivial macroeconomic risk and the bank has a portfolio of size n, the bank must consider two factors when deciding whether or not to increase its scale.

(i) Adding additional projects to its portfolio decreases the bank's default risk. This decrease is given by the difference between the bank's probability of default in the limit (i.e., $\lim_{n\to\infty} p_n$) and its probability of default at portfolio size $\bar n$ (i.e., $p_{\bar n}$); but for $\bar n$ sufficiently large, the gains from reducing default risk by adding additional projects are essentially zero.

(ii) Adding additional projects to the bank's portfolio raises the lenders' costs of monitoring the bank.

Thus, the crucial question is: For what cost structures do the gains from reduced default risk dominate the losses from increased monitoring costs when additional projects are added?


5 Optimal Intermediary Size

We now obtain the main results of the paper: analytic predictions for both the optimal size of a financial intermediary and for the size distribution of banks in a Pareto efficient industry. In order to analyze the problem in detail (and answer the question posed above), we must provide a precise quantitative characterization of the gains from diversification as the size of the bank increases. To this end we construct a rate function which measures the speed at which the bank's idiosyncratic portfolio risk is eliminated when the size of its portfolio is increased. The Theory of Large Deviations (cf. Varadhan (1984)) provides the formal structure.

Consider first the bank's portfolio diversification problem for the case of a fixed realization $z$ of $Z$. The argument will be generalized to permit any $z$ by integrating over the distribution of all possible realizations of $Z$. Then the $X_i^z = Y_i + z$ are independent random variables (given that $z$ is fixed) since the $Y_i$ are independent. Since the random variables $X_i^z$ are independent, the law of large numbers holds. The large deviation principle gives a rate function (cf. Varadhan (1984), Theorem 3.1) which provides a measure of the speed of convergence in the law of large numbers. The rate function implies that for every $a < E_z[R(X_i^z)]$, the probability that a realization is in the tail of the distribution converges exponentially to zero. Formally,

\[ P\Big(\Big\{\frac{1}{n}\sum_{i=1}^{n} X_i^z < a\Big\}\Big) \le e^{-I_z(a)\,n}, \tag{11} \]

where $I_z(a) > 0$ is the rate function which gives the speed of convergence of the distribution (i.e., a measure of how rapidly contracting with additional borrowers reduces the bank's default risk).

The rate function $I_z(\cdot)$ is derived from the moment-generating function of a random variable. Let $M_z(\theta)$ denote the moment-generating function of the distribution of $R(X_i^z)$, where $\mu_z$ denotes the distribution. Then

\[ M_z(\theta) = \int e^{\theta x}\,d\mu_z(x).\,{}^{21} \]

The rate function is found by solving the following maximization problem:

\[ I_z(a) = \max_{\theta \in \mathbb{R}} \; \theta a - \log M_z(\theta). \tag{12} \]

21$M_z(\theta)$ is called the moment generating function since the $k$th derivative of $M_z(\theta)$ evaluated at $\theta = 0$ gives exactly the $k$th moment of $\mu_z$.


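We now show that $I_z(a) > 0$ for every $a < E(X_i^z)$. For fixed $a$ let $f(a,\theta) = \theta a - \log M_z(\theta)$. Since $f(a,0) = 0$, it is sufficient to show that $\frac{\partial}{\partial\theta} f(a,0) \ne 0$, which can be easily verified:

\[ \frac{\partial}{\partial\theta} f(a,\theta) = a - \frac{\int x e^{\theta x}\,d\mu_z(x)}{\int e^{\theta x}\,d\mu_z(x)}. \]

Consequently, $\frac{\partial}{\partial\theta} f(a,0) = a - E X_i^z \ne 0$.22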
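The maximization in (12) can be checked on a case with a known closed form. The sketch below is illustrative only and assumes a Bernoulli($p$) random variable, which is not the paper's $R(X_i^z)$: its moment-generating function is $1 - p + p e^{\theta}$, and its rate function is known to equal the relative entropy $a\log(a/p) + (1-a)\log((1-a)/(1-p))$. The code maximizes $\theta a - \log M(\theta)$ by ternary search (valid because the objective is concave in $\theta$) and compares the resulting bound $e^{-I(a)n}$ with the exact binomial tail.

```python
import math

def log_mgf(theta, p):
    """Log moment-generating function of a Bernoulli(p) variable (assumed example)."""
    return math.log(1.0 - p + p * math.exp(theta))

def rate(a, p, lo=-40.0, hi=40.0, iters=200):
    """Ternary search for max over theta of theta*a - log M(theta)."""
    f = lambda t: t * a - log_mgf(t, p)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def relative_entropy(a, p):
    """Closed-form rate function of the Bernoulli(p) sample mean."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

def binomial_lower_tail(n, p, a):
    """Exact P(S_n / n <= a) for S_n ~ Binomial(n, p)."""
    kmax = math.floor(n * a + 1e-12)
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(kmax + 1))

p, a, n = 0.5, 0.3, 50
numeric_rate = rate(a, p)
exact_tail = binomial_lower_tail(n, p, a)
chernoff_bound = math.exp(-numeric_rate * n)
```

The rate is strictly positive for $a$ below the mean and vanishes at the mean, so the exponential bound is informative only for thresholds strictly inside the tail.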

We have shown that, for fixed $z$ such that $z > z(R^*)$ (i.e., if the macroeconomic shock is not too severe), the probability $P\big(\{\frac{1}{n}\sum_{i=1}^{n} R(Y_i + z) < R^*\}\big)$ converges exponentially to zero, because $R^* < E[R(Y_i + z)]$. We now make the convergence argument independent of $z$. The primary technical problem is that the behaviour of the rate function must be analyzed as $z$ comes close to $z(R^*)$. Recall that for $z = z(R^*)$, the face value of the lenders' contract, $R^*$, is exactly the expected value of $R(Y_i + z)$. Clearly, the probability of observing realizations less than or equal to the expected value of independent random variables does not converge to zero. However, in Lemma 3 we show that for $z > z(R^*)$ and $z$ sufficiently close to $z(R^*)$, the rate function $I_z(R^*)$ is bounded from below by $k(z - z(R^*))^2$, where $k > 0$. In particular, from (A.15) and (A.16) in the Appendix it follows that we can choose for $k$ any number smaller than $\frac{1}{2\,\mathrm{var}\,R(Y_i + z)}$.23 Thus, the smaller the variance of the idiosyncratic risk, the faster the convergence. Moreover, the rate function converges to zero at the speed $(z - z(R^*))^2$ as $z \to z(R^*)$. Integrating over $z$, the probability of default by a bank of size $n$, given that $Z > z(R^*)$, is:

\[ \int_{z(R^*)}^{\infty} e^{-I_z(R^*)\,n}\,dH(z). \tag{13} \]

An important technical result, which is essential for proving that an optimal bank size exists (Theorem 2), can now be stated.

Proposition 1. There exist constants $k_i > 0$, $i = 1,2$, such that the probability of default by a bank of size $n$, given that $Z > z$, is bounded from above by $\frac{k_1}{\sqrt{n}} + e^{-k_2 n}$.

22This follows from the fact that $\int x e^{\theta x}\,d\mu_z(x)$ evaluated at $\theta = 0$ is the expected value of $X_i^z$, since $\mu_z$ is the distribution of $X_i^z$. Furthermore, $\int e^{\theta x}\,d\mu_z(x)$ evaluated at $\theta = 0$ is one.

23Apply Lemma 3 as in Proposition 1 and choose $X_a = R(Y_i + z)$.


The proof of Proposition 1, which provides a speed of portfolio convergence result, is in the Appendix. Note that in the bound in Proposition 1, the term $e^{-k_2 n}$ converges to zero much faster than $\frac{k_1}{\sqrt{n}}$. By the law of large numbers, a bank of infinite size will never default if $Z > z$. Thus, the bound implies that a bank of size $n$ can lower its default probability by only approximately $\frac{k_1}{\sqrt{n}}$ if it becomes infinitely large (given that $Z > z$) because the second term is approximately zero for large $n$. Consequently, $\frac{k_1}{\sqrt{n}}$ is our desired measure of the gain from default risk reduction arising from additional diversification.
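The relative size of the two terms in the bound of Proposition 1 can be tabulated directly. This is a minimal sketch; the constants `K1` and `K2` are hypothetical, since the paper establishes only that such positive constants exist.

```python
import math

K1 = 1.0   # hypothetical constant k_1 (existence only is proved in the paper)
K2 = 0.05  # hypothetical constant k_2

def bound_terms(n):
    """Return the two terms (k_1 / sqrt(n), exp(-k_2 n)) of the bound."""
    return K1 / math.sqrt(n), math.exp(-K2 * n)

def total_bound(n):
    """Upper bound on the default probability given Z above the cutoff."""
    return sum(bound_terms(n))

# For small n the exponential term can still matter; for large n the
# square-root term is all that remains of the bound.
table = {n: bound_terms(n) for n in (10, 100, 1000, 10000)}
```

With these illustrative constants the exponential term exceeds the square-root term at $n = 10$ but is negligible by $n = 1000$, which is why the remaining gain from diversification is measured by the square-root term alone.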

The main result of the paper can now be stated.

Theorem 2. Let $c^*_n$ denote the cost of monitoring a bank of size $n$. Let $c^*_\infty = \lim_{n\to\infty} c^*_n$,24 and assume that $c^*_\infty - c^*_n$ converges to zero at a slower rate than $\frac{1}{\sqrt{n}}$. Then it is never optimal for a bank to become infinitely large. Thus, there exists an optimal size for the bank.

Proof. Assume by way of contradiction that it is optimal for the bank to become infinitely large. By Proposition 1, the bank's probability of default is bounded above by

\[ p_n = \frac{k_1}{\sqrt{n}} + e^{-k_2 n} + P(\{Z < z\}).\,{}^{25} \tag{14} \]

By Lemma 2, the bank's default probability in the limit is given by $P(\{Z < z\})$. Let $n$ be arbitrary. Now compare the expected monitoring costs for a bank of size $n$ with those of a bank of infinite size. By (14), the lenders' expected costs of monitoring a bank of size $n$ (i.e., $p_n c^*_n$) are at least

\[ \Big[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\Big] c^*_n + P(\{Z < z\})\,c^*_n. \tag{15} \]

The expected costs of monitoring a bank of infinite size are at most

\[ P(\{Z < z\})\,c^*_\infty. \tag{16} \]

24If $c^*_n$ is unbounded then $c^*_\infty = \infty$, and clearly Theorem 2 holds.

25The first two terms are the bound given by Proposition 1 for all states where Z > z. Assuming that the probability of default in all states Z < z is one (which is clearly an upper bound for these states), we get the third term.


If it were optimal for the bank to become infinitely large, then at least for large $n$ the expected costs of monitoring the bank per lender must decrease if the size of the bank is increased from $n$ to infinity. By (15) and (16) the expected monitoring costs will decrease if26

\[ \Big[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\Big] c^*_n > P(\{Z < z\})\,(c^*_\infty - c^*_n). \tag{17} \]

By the assumption of the Theorem, $c^*_\infty - c^*_n$ converges to zero at a slower rate than $\frac{1}{\sqrt{n}}$. Thus, for every $M > 0$ there must exist an $\bar n$ such that $c^*_\infty - c^*_n > \frac{M}{\sqrt{n}}$ for all $n > \bar n$. This and equation (17) yield

\[ \Big[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\Big] c^*_n > P(\{Z < z\})\,\frac{M}{\sqrt{n}}. \tag{18} \]

Since M can be chosen arbitrarily, there exist values such that inequality (18) is violated.27 The bank cannot be infinitely large, and the Theorem is thus proved.
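The trade-off behind Theorem 2 can be seen in a small numerical experiment. The sketch below rests on several assumptions that are not in the paper: the bound (14) is treated as if it were the actual default probability (capped at one), the cost schedule is $c^*_n = c^*_\infty - n^{-1/4}$ (so the gap closes more slowly than $1/\sqrt{n}$, as the Theorem requires), and all constants are hypothetical. It searches for the portfolio size that minimizes the lenders' expected monitoring cost $p_n c^*_n$ and compares it with the limiting cost $P(\{Z < z\})\,c^*_\infty$.

```python
import math

K1, K2 = 0.5, 0.5   # hypothetical constants from Proposition 1
TAIL_PROB = 0.2     # hypothetical P(Z < z): macro risk that cannot be diversified
C_INF = 2.0         # hypothetical limiting monitoring cost
ALPHA = 0.25        # cost gap closes like n**(-ALPHA), slower than n**(-1/2)

def default_prob(n):
    """Bound (14), treated here as the default probability and capped at 1."""
    return min(1.0, K1 / math.sqrt(n) + math.exp(-K2 * n) + TAIL_PROB)

def monitoring_cost(n):
    """Assumed size-dependent cost schedule, increasing toward C_INF."""
    return C_INF - n ** (-ALPHA)

def expected_cost(n):
    """Lenders' expected cost of monitoring a bank of size n."""
    return default_prob(n) * monitoring_cost(n)

limit_cost = TAIL_PROB * C_INF
sizes = range(1, 100_001)
best_size = min(sizes, key=expected_cost)
best_cost = expected_cost(best_size)
```

Under these assumptions the minimizing size is finite and its expected cost lies strictly below the limiting cost, so growing without bound is not optimal. If `ALPHA` is raised above one half, the ordering reverses for large sizes, which mirrors the role of the rate condition in the Theorem.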

Theorem 2 establishes that when the lenders' monitoring costs depend on the bank's portfolio size, an optimal determinate bank size can be computed, under the assumption that the rate of increase of the lenders' monitoring costs converges to zero at a sufficiently slow rate. This assumption is fairly weak. In particular, it is fulfilled if the rate of increase of $c^*_n$ is of the order $n^{-3/2}$.28 Most economically plausible cost structures will satisfy this assumption. An alternative way to interpret Theorem 2 is that the case of constant lender monitoring costs (on which others have relied) is rather special. In particular, we argued in Section 4 that under constant monitoring costs a bank's size is indeterminate (because of increasing returns to scale in delegated monitoring). However, with a slightly changed (i.e., size dependent) cost structure an optimal bank size exists. Thus, the indeterminacy (and hence increasing returns to scale for all $n$) result does not seem to be very robust. Theorem 2

26This follows from (16) - (15) < 0.

27Multiply both sides of (18) by $\sqrt{n}$ to obtain $\big[k_1 + \sqrt{n}\,e^{-k_2 n}\big] c^*_n > P(\{Z < z\})\,M$. Let $n \to \infty$. Then $k_1 c^*_\infty \ge P(\{Z < z\})\,M$, which cannot hold for every $M$. Thus (18) must be violated for all sufficiently large $n$.

28Note that $c^*_\infty - c^*_n$ is approximately $\frac{1}{\sqrt{n}}$. Differentiating with respect to $n$ (ignoring that $n$ is an integer) of course yields $-\frac{1}{2} n^{-3/2}$.


also shows that even when the bank's portfolio is subject to non-diversifiable macroeconomic risk (as long as the variance of the risk is not too large so intermediation remains optimal), a bank of size $n$ can only improve upon its default probability by at most $\frac{k_1}{\sqrt{n}}$. Increasing bank size after some critical $\bar n$ is not optimal because it leads to increased monitoring costs, but there are very limited gains from default risk reduction. Thus, even very "small" banks (given $Z$) may improve upon direct investment when intermediation is optimal.

6 Testable Implications of the Theory

Theorem 2 has the following testable implications. First, bank size is determinate and inversely related to the bank's exposure to macroeconomic risk. In a large economy like the U.S. where aggregate macroeconomic shocks have different effects on different regions of the country, our model predicts that banks of different sizes will coexist across locales. In particular, if different regions of the country have different effective macroeconomic shocks (i.e., $Z$'s), the model predicts that across regions both large "money centre banks" that are very well diversified and smaller "local banks" that are less well diversified will co-exist. "Money centre banks" may have been able to lower their exposure to macroeconomic risk by evading portfolio restrictions via holding companies or because they operate in regions of the country with better diversified economic bases. Our model predicts that these better diversified (i.e., low $Z$) banks will be larger than "local banks" (with higher $Z$'s). This follows from (18) in the proof of Theorem 2 because $P(\{Z < z\})$ is lower for a better diversified bank, since the bank's exposure to macroeconomic shocks is effectively less severe. Therefore, the bank size ($n$) that violates (18) is necessarily higher. Williamson (1989) provides an interesting discussion of stylized facts regarding the structure of U.S. versus Canadian banks. Historically, Canadian banks have had fewer portfolio restrictions (e.g., branching restrictions) and hence lower $Z$'s than U.S. banks. As our theory predicts, there have been fewer banks in Canada of larger size (adjusted for population differences).

The second testable implication of our theory pertains to industry structure, i.e., limits on the number and size distribution of firms that are present in equilibrium. Theorem 1 establishes that intermediation (banking) improves social welfare. In other words, there is some level of intermediation


services in an economy that is socially optimal (given preferences, costs, probability distributions, and alternative opportunities). In contrast, Theorem 2 proves that there is a specific bank size that is optimal. Although the notions of bank size and industry structure are closely related, they need not be identical. For example, suppose the socially optimal "industry" (i.e., economy) level of welfare improving intermediation services is twenty units of input capital and the optimal size for all banks is to provide two units of capital. Clearly, the optimal industry structure is then ten banks. What causes some banks to be of similar size in our model? The result that within a region (or among banks with similar effective portfolio restrictions) banks with similar "local" idiosyncratic risk characteristics will have a common size (given their $Z$) follows immediately from equation (18), because $k_1$, $M$ and $P(\{Z < z\})$ will be similar for such banks. Why might many moderately sized local banks coexist in the same region? This again follows from Theorem 2 because the Theorem establishes that after some critical size, increasing a bank's scale of operation further is not profit maximizing. Finally, why might we expect to observe many small "local banks" and fewer large "money centre banks"? The high frequency of smaller local banks (relative to large money centre banks) stems from the fact that banks with higher $Z$'s cannot achieve sufficient portfolio diversification in their locale to justify the additional monitoring costs associated with increasing their scale of operation. Multiple banks operating at the efficient scale within a particular locale provide welfare improving intermediation services optimally (given the risk and cost structure in the economy). Differences in bank size and distribution across locales stem entirely from different effective $Z$'s in (18) (i.e., different exposure to macroeconomic risk in local and money centre banks).

7 Concluding Remarks

In this paper we develop a theory of bank size distribution with testable implications. We regard the theoretical model to be useful for two reasons. First, there is a longstanding debate in the banking literature about the market structure of the financial industry. Some have argued that the banking industry is inherently non-competitive and hence must be regulated to protect consumers from the evils of banks' market power. In contrast, others have argued that the banking system is fragile and hence must be protected


from the destabilizing forces of competition. Because we derive optimal financial contracts from Pareto problems, the allocations that we obtain are necessarily Pareto efficient. An interesting problem for future research is to compare the structure of a banking industry where firms have market power with the Pareto efficient structure implied by Theorem 2 (given parametric specifications for preferences, costs, probability distributions, and alternative assets).

Second, the model may prove useful in understanding changes in the European banking system that will undoubtedly result from the EMU. In the previous Section we discussed bank size and industry structure predictions for the U.S. economy. For example, our model predicts that largely agricultural sections of the U.S. like the Mid-west will have many moderate or small sized banks because this region is subject to macroeconomic shocks that are difficult to diversify (especially given portfolio restrictions imposed by bank regulators). However, the model predicts that money centre banks that operate in more economically diversified regions of the country (and that have been better able to evade portfolio restrictions) will be larger. In 1999 when the EMU begins, twelve countries will form a "federal Europe." What will be the nature of the EMU banking regulations imposed by the central bank or by the governments of the individual countries? Are the banks of some countries (specifically, those with better diversified economies) destined to become "money centre" banks while the banks of other (less well diversified) economies are destined to become small "local" banks? This question is important because, although our model permits the existence of multiple banks of the same size and, perhaps more interestingly, the co-existence of banks of different sizes, larger (better diversified) banks have a lower default probability than smaller banks. Are members of the EMU with less well diversified economies, that might wish to encourage their banks to lend domestically for development reasons, destined to become problematic members of the Union from the outset?

8 Appendix

We first derive a "law of large numbers" for random variables with correlation. The argument works essentially as follows: For any fixed realization $z$ of $Z$ we can apply the law of large numbers to the random variables $Y_i + z$


and show that $\frac{1}{n}\sum_{i=1}^{n} Y_i + z$ converges to $z$ except for a set $N_z$ which has measure zero with respect to the probability $P_1$ (recall that $P$ is the product of the probabilities $P_1$ and $P_2$). We then use Fubini's Theorem (i.e., integrate with respect to $P_2$) to show that $\bigcup_{z \in \mathbb{R}} N_z$ has measure zero with respect to the original probability $P$. We now state Lemma 1.

Lemma 1. Let $Y_i$, $i \in \mathbb{N}$, and $Z$ be independent random variables. Assume that the $Y_i$ are identically distributed. Let $X_i = Y_i + Z$ for every $i$ and let $R(\cdot)$ be a simple debt contract. Then $\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i)(\omega) = E[R(X_1) \mid Z](\omega)$ for almost every $\omega \in \Omega$.

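Proof. Let $z$ be a realization of $Z$. Let $E[R(X_1) \mid Z]$ be the conditional expectation of $R(X_1)$ with respect to $Z$. Let $\omega_2 \in \Omega_2$. Define the probability space $(\Omega_{\omega_2}, \mathcal{A}_{\omega_2}, P_{\omega_2})$ as follows. Recall that $\Omega = \Omega_1 \times \Omega_2$. Then let $\Omega_{\omega_2} = \Omega_1 \times \{\omega_2\}$. Let $\mathcal{A}_{\omega_2}$ be the set of all events $A \times \{\omega_2\}$, where $A$ is an event in $\Omega_1$. Finally, let $P_{\omega_2}(A \times \{\omega_2\}) = P_1(A)$. Let $z = Z(\omega_2)$. Then the $R(X_i) = R(Y_i + z)$ are independent random variables on $(\Omega_{\omega_2}, \mathcal{A}_{\omega_2}, P_{\omega_2})$. We can therefore apply the law of large numbers to the $R(X_i)$, $i \in \mathbb{N}$. Thus,

\[ \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i(\omega)) = \int_{\Omega_{\omega_2}} R(Y_1(\omega) + z)\,dP_{\omega_2}(\omega) \tag{A.1} \]

for all $\omega \in \Omega_{\omega_2}$ except for a set $N_{\omega_2} \in \mathcal{A}_{\omega_2}$ with $P_{\omega_2}(N_{\omega_2}) = 0$. Further,

\[ \int_{\Omega_{\omega_2}} R(X_1(\omega))\,dP_{\omega_2}(\omega) = E[R(X_1) \mid Z](\omega), \tag{A.2} \]

for almost every $\omega \in \Omega_{\omega_2}$, because $Y_i$ and $Z$ are independent.29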

29Note that $\omega \mapsto \int_{\Omega_{\omega_2}} R(X_1(\omega))\,dP_{\omega_2}(\omega)$ is measurable with respect to $Z$. Thus, to show that we have a version of the conditional expectation it is sufficient to prove that

\[ \int_A \int_{\Omega_{\omega_2}} R(X_1(\omega))\,dP_{\omega_2}(\omega) = \int_A R(X_1(\omega))\,dP(\omega), \]

for every event $A$ which is measurable with respect to $Z$ (i.e., which is of the form $Z^{-1}(B)$ where $B$ is a measurable subset of $\mathbb{R}$). However, since $Z$ is constant on $\Omega_1$, all such sets are of the form $\Omega_1 \times C$, where $C$ is a measurable subset of $\Omega_2$. The result then follows from Fubini's Theorem since integrating over $\Omega_1$ and then over $\Omega_{\omega_2}$ (which is essentially $\Omega_1$) is the same as integrating over $\Omega_1$ once.


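Let $H$ be the distribution of $Z$. It remains to show that $N = \bigcup_{\omega_2 \in \Omega_2} N_{\omega_2}$ has measure zero. This follows from Fubini's Theorem. The set $D$ of all $\omega$ where $\frac{1}{n}\sum_{i=1}^{n} R(X_i)$ does not converge to $f$ is given by

\[ D = \Big\{\omega : \liminf_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i) \ne f \ \text{ or } \ \limsup_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i) \ne f\Big\}, \]

where $f = E[R(X_1) \mid Z]$. Since the limsup and the liminf of a sequence of measurable functions are measurable, it follows that $D$ is measurable. Further, $D \cap \Omega_{\omega_2} = N_{\omega_2}$ since $N_{\omega_2}$ is exactly the set of all $\omega \in \Omega_{\omega_2}$ where the sequence does not converge. Thus $D = N$ and Fubini's Theorem implies

\[ P(N) = \int_{\Omega_2} P_1(N_{\omega_2})\,dP_2(\omega_2) = 0. \]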

This concludes the proof of the Lemma.
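The conditional law of large numbers in Lemma 1 can be illustrated by simulation for one fixed realization of the macroeconomic shock. The sketch is only an illustration under assumptions not made in the paper: $Y_i$ is uniform on $[0,1]$, the contract is $R(x) = \min(\max(x,0), \bar R)$, and the values of `z` and `face` are hypothetical. It compares the sample average of $R(Y_i + z)$ with the conditional expectation $E[R(Y_1 + z)]$ computed by numerical integration.

```python
import random

def debt(x, face):
    """Simple debt contract (assumed form)."""
    return min(max(x, 0.0), face)

def sample_average(n, z, face, rng):
    """Average repayment of n borrowers who share the same macro shock z."""
    return sum(debt(rng.random() + z, face) for _ in range(n)) / n

def conditional_mean(z, face, grid=10_000):
    """E[R(Y + z)] for Y ~ Uniform(0, 1) (an assumption), by the midpoint rule."""
    return sum(debt((i + 0.5) / grid + z, face) for i in range(grid)) / grid

rng = random.Random(0)   # fixed seed so the run is reproducible
z, face = -0.1, 0.8      # hypothetical shock realization and face value
small_portfolio = sample_average(100, z, face, rng)
large_portfolio = sample_average(50_000, z, face, rng)
target = conditional_mean(z, face)
```

The average over the large portfolio settles near the conditional mean given the shock, not near the unconditional mean: the idiosyncratic risk averages out while the common shock stays in the limit.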

We next prove a technical result necessary for the proof of Theorem 1 (i.e., a convergence result for the bank's probability of default).

Lemma 2. Let $X_i$, $Y_i$ and $Z$ be as in Lemma 1. Assume that the distribution of $Z$ is non-atomic, i.e., $P(\{Z = z\}) = 0$ for every $z \in \mathbb{R}$. Let $R^*$ be the face value of the lenders' simple debt contract. Then

\[ P\Big(\Big\{\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z) < R^*\Big\}\Big) \to P\big(\{E[R(Y_1 + Z) \mid Z] < R^*\}\big). \]

Proof. Let $f_n = \frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z)$, and let $f = E[R(Y_1 + Z) \mid Z]$. By (6) we get $f_n(\omega) \to f(\omega)$ for almost every $\omega$. Thus, $f_n$ converges to $f$ in distribution30 by Proposition 24.12 of Parthasarathy (1977). Let $h$ be the indicator function of the interval $(-\infty, R^*)$, i.e., $h(x) = 1$ for every $x \in (-\infty, R^*)$ and $h(x) = 0$ otherwise. Note that $h$ is only discontinuous at $R^*$. Since the distribution of $Z$ is continuous, the point $\{R^*\}$ has probability zero. Corollary 1 of Billingsley (1968, p. 31) therefore implies that $h(f_n)$ converges

30Convergence in distribution means the following: Let $F_n$ denote the distribution function of $f_n$ for every $n \in \mathbb{N}$, and let $F$ be the distribution function of $f$. Then $f_n$ converges to $f$ in distribution if $\lim_{n\to\infty} \int g(x)\,dF_n(x) = \int g(x)\,dF(x)$ for every bounded and continuous function $g$ on $\mathbb{R}$.


in distribution to $h(f)$. Let $H(F_n)$ denote the distribution of $h(f_n)$, let $H(F)$ denote the distribution of $h(f)$, let $F_n$ denote the distribution of $f_n$, and let $F$ be the distribution of $f$. Then

\[ \lim_{n\to\infty} \int h(x)\,dF_n(x) = \lim_{n\to\infty} \int x\,dH(F_n)(x) = \int x\,dH(F)(x) = \int h(x)\,dF(x), \]

where the second equality follows from the fact that $h(f_n)$ converges to $h(f)$ in distribution. This proves Lemma 2.

We next prove a technical result on the convergence of the rate function.

Lemma 3. Let $X_a$, $a \ge \bar a > 0$, be a collection of random variables such that $a$ is the expected value of $X_a$ for every $a \ge \bar a$. Assume that the moment generating function $M_a(\theta)$ is thrice continuously differentiable in a neighborhood $[\bar a, \tilde a)$ of $\bar a$. Let $\mu_a$ be the distribution of $X_a$. Assume that the support of $\mu_a$ is contained in the compact interval $[T_1, T_2]$ for every $a \ge \bar a$. Then there exists a constant $k > 0$ such that $I_a(\bar a) \ge k(a - \bar a)^2$ for all $a$ which are sufficiently close to $\bar a$.

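Proof. Define

\[ f(a,\theta) = \theta \bar a - \log M_a(\theta), \tag{A.3} \]

where $M_a(\theta)$ is the moment generating function $M_a(\theta) = \int e^{\theta x}\,d\mu_a(x)$. Note that $f(a,0) = 0$. Furthermore,

Let $F_a$ denote the distribution function of $\mu_a$ (i.e., $F_a(t) = \mu_a((-\infty, t])$). Partial integration of $M_a(\theta)$ yields

\[ M_a(\theta) = \int_{T_1}^{T_2} e^{\theta x}\,d\mu_a(x) = e^{\theta x}F_a(x)\Big|_{T_1}^{T_2} - \theta\int_{T_1}^{T_2} e^{\theta x}F_a(x)\,dx = e^{\theta T_2} - \theta\int_{T_1}^{T_2} e^{\theta x}F_a(x)\,dx. \tag{A.5} \]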


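Furthermore, by partial integration we also get

\[ \int_{T_1}^{T_2} F_a(x)\,dx = xF_a(x)\Big|_{T_1}^{T_2} - \int_{T_1}^{T_2} x\,dF_a(x) = T_2 - a, \tag{A.6} \]

and

\[ \int_{T_1}^{T_2} xF_a(x)\,dx = \frac{1}{2}x^2F_a(x)\Big|_{T_1}^{T_2} - \frac{1}{2}\int_{T_1}^{T_2} x^2\,dF_a(x) = \frac{1}{2}\big(T_2^2 - E(X_a^2)\big). \tag{A.7} \]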

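Let $\theta^*(a)$ denote the value of $\theta$ which maximizes (A.3) for fixed $a$. Thus, from (A.4) and the Implicit Function Theorem we get

$$\frac{d}{da}\theta^*(a) = -\,\frac{\bar a\,\frac{\partial}{\partial a} M_a(\theta) - \frac{\partial^2}{\partial\theta\,\partial a} M_a(\theta)}{\bar a\,\frac{\partial}{\partial\theta} M_a(\theta) - \frac{\partial^2}{\partial\theta^2} M_a(\theta)}. \tag{A.8}$$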

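We now evaluate $\frac{d}{da}\theta^*(a)$ at $a = \bar a$. Since $\theta^*(\bar a) = 0$ by (A.4),$^{31}$ we must evaluate the right-hand side of (A.8) at $\theta = 0$ and $a = \bar a$. Taking the partial derivatives of $M_a(\theta)$ in (A.5), evaluating them at $\theta = 0$ and $a = \bar a$, and using (A.6) proves that $\frac{\partial}{\partial a} M_{\bar a}(0) = 0$. Furthermore,

$$\frac{\partial^2}{\partial\theta\,\partial a} M_{\bar a}(0) = -\frac{\partial}{\partial a} \int_{T_1}^{T_2} F_a(x)\,dx = 1,$$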

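where the last equality follows from (A.6). Thus, the numerator of (A.8) is $-1$. We next derive the denominator of (A.8). Note that (A.5) implies

$$\frac{\partial}{\partial\theta} M_{\bar a}(0) = T_2 - \int_{T_1}^{T_2} F_{\bar a}(x)\,dx = T_2 - (T_2 - \bar a) = \bar a,$$

where the second equality follows from (A.6); and

$$\frac{\partial^2}{\partial\theta^2} M_{\bar a}(0) = T_2^2 - 2\int_{T_1}^{T_2} x F_{\bar a}(x)\,dx = T_2^2 - \left(T_2^2 - E(X_{\bar a}^2)\right) = E(X_{\bar a}^2),$$

where the second equality follows from (A.7). Thus, (A.8) implies

$$\frac{d}{da}\theta^*(\bar a) = \frac{1}{\bar a^2 - E(X_{\bar a}^2)} = -\frac{1}{\operatorname{var}(X_{\bar a})}. \tag{A.9}$$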

$^{31}$This follows since $\frac{\partial}{\partial\theta} M_{\bar a}(0) = E X_{\bar a} = \bar a$ and $M_{\bar a}(0) = 1$.


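By (A.5) and differentiation we get

$$\frac{d}{da} M_a(\theta^*(a)) = \left(T_2 e^{\theta^*(a) T_2} - \int_{T_1}^{T_2} e^{\theta^*(a) x} F_a(x)\,dx\right) \frac{d}{da}\theta^*(a) - \theta^*(a)\,\frac{d}{da} \int_{T_1}^{T_2} e^{\theta^*(a) x} F_a(x)\,dx. \tag{A.10}$$

Evaluating (A.10) at $a = \bar a$ and using (A.6) we get

$$\frac{d}{da} M_{\bar a}(\theta^*(\bar a)) = \bar a\,\frac{d}{da}\theta^*(\bar a). \tag{A.11}$$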

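Note that $I_a = I_a(\bar a)$ is the maximum of $f(a,\theta)$ over $\theta$ by (12) in Section 5. (A.3), (A.10), (A.11), and the fact that $M_{\bar a}(\theta^*(\bar a)) = M_{\bar a}(0) = 1$ immediately imply

$$\frac{d}{da} I_{\bar a} = \frac{d}{da} f(\bar a, \theta^*(\bar a)) = 0. \tag{A.12}$$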

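Next we show that $\frac{d^2}{da^2} I_{\bar a} > 0$. Using the fact that $M_{\bar a}(\theta^*(\bar a)) = 1$, it follows that

$$\frac{d^2}{da^2} f(\bar a, \theta^*(\bar a)) = \bar a\,\frac{d^2}{da^2}\theta^*(\bar a) - \frac{d^2}{da^2} M_{\bar a}(\theta^*(\bar a)) + \left(\frac{d}{da} M_{\bar a}(\theta^*(\bar a))\right)^2. \tag{A.13}$$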

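Furthermore, by differentiating (A.10) one more time at $a = \bar a$ and by (A.6) and (A.7) we get

$$\frac{d^2}{da^2} M_{\bar a}(\theta^*(\bar a)) = \bar a\,\frac{d^2}{da^2}\theta^*(\bar a) + \left(\frac{d}{da}\theta^*(\bar a)\right)^2 E(X_{\bar a}^2) + 2\,\frac{d}{da}\theta^*(\bar a). \tag{A.14}$$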

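Substituting (A.9), (A.11), (A.14) into (A.13) we get

$$\frac{d^2}{da^2} f(\bar a, \theta^*(\bar a)) = -\left(\frac{d}{da}\theta^*(\bar a)\right)^2 \operatorname{var}(X_{\bar a}) - 2\,\frac{d}{da}\theta^*(\bar a) = \frac{1}{\operatorname{var}(X_{\bar a})}, \tag{A.15}$$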

where the last equality follows from (A.9). Recall that the rate function is $I_a = f(a, \theta^*(a))$. Thus, (A.15) implies $\frac{d^2}{da^2} I_{\bar a} > 0$. Since, by assumption, $M_a(\theta)$ is three times continuously differentiable in a neighborhood of $(\bar a, 0)$, (A.8) implies that $\theta^*(a)$ is twice continuously differentiable for all $a \ge \bar a$ which are sufficiently close to $\bar a$. Thus, $I_a = f(a, \theta^*(a))$ is twice continuously differentiable for all $a \ge \bar a$ which are sufficiently close to $\bar a$. We can therefore find a constant $k > 0$ such that

$$\frac{d^2}{da^2} I_a \ge 2k > 0 \tag{A.16}$$

for all $a$ in a neighborhood of $\bar a$. Thus, developing $I_a$ in a Taylor series at $a = \bar a$ we get

$$I_a = I_{\bar a} + \frac{d}{da} I_{\bar a}\,(a - \bar a) + \frac{d^2}{da^2} I_{\tilde a}\,\frac{(a - \bar a)^2}{2}, \tag{A.17}$$

for an $\tilde a$ between $\bar a$ and $a$. The Lemma now follows from (A.12), (A.16) and (A.17) since $I_{\bar a} = 0$ by (A.4) and footnote 31.
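The quadratic lower bound of Lemma 3 can be checked numerically in a simple special case. The sketch below assumes, purely for illustration, that $X_a$ is Bernoulli with mean $a$ (so the support lies in $[T_1, T_2] = [0, 1]$); this family is not used in the paper. In this case the supremum of $\theta \bar a - \log M_a(\theta)$ equals the relative entropy of Bernoulli($\bar a$) with respect to Bernoulli($a$), its second derivative in $a$ at $a = \bar a$ is $1/\operatorname{var}(X_{\bar a})$ as in (A.15), and Pinsker's inequality gives the bound with $k = 2$. The function names and the ternary-search maximizer are illustrative choices.

```python
import math


def log_mgf(a, theta):
    """log M_a(theta) for X_a ~ Bernoulli(a), an assumed example family with mean a."""
    return math.log(1.0 - a + a * math.exp(theta))


def rate(a, a_bar, lo=-50.0, hi=50.0, iters=200):
    """I_a(a_bar) = sup over theta of [theta * a_bar - log M_a(theta)].

    The objective is concave in theta, so a ternary search on [lo, hi] is used.
    """
    def objective(theta):
        return theta * a_bar - log_mgf(a, theta)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


def relative_entropy(p, q):
    """Closed form of the rate in the Bernoulli case: KL(Bernoulli(p) || Bernoulli(q))."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def second_derivative_at(a_bar, h=1e-3):
    """Central finite-difference estimate of d^2/da^2 I_a at a = a_bar."""
    return (rate(a_bar + h, a_bar) - 2.0 * rate(a_bar, a_bar)
            + rate(a_bar - h, a_bar)) / (h * h)
```

With `a_bar = 0.3`, one can compare `rate(a, a_bar)` against `2 * (a - a_bar) ** 2` for values of `a` near `a_bar`, and `second_derivative_at(a_bar)` against `1 / (a_bar * (1 - a_bar))`.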

We now use Lemma 3 to prove our main convergence result.

Proof of Proposition 1. Let $Z \ge \bar z$. By (13) in Section 5.1, the bank's probability of default is bounded above by

$$\int_{\bar z}^{\infty} e^{-I_z(R^*)\,n}\,dH(z).$$

Choose $z_1 > \bar z$. Since $I_z(R^*)$ is monotone increasing in $z$ (cf. Stroock (1984, Lemma 3.3)), we get

$$\int_{z_1}^{\infty} e^{-I_z(R^*)\,n}\,dH(z) \le e^{-I_{z_1}(R^*)\,n}\,P(\{Z \ge z_1\}). \tag{A.18}$$

Clearly, this expression converges to zero exponentially as $n \to \infty$, i.e., it can be bounded above by a term $e^{-k_2 n}$. It therefore remains to give an estimate for the rate of convergence if the realization of $Z$ is between $\bar z$ and $z_1$. We can get such an estimate by using Lemma 3.$^{32}$ Furthermore, note that $E[R(Y_1 + z)]$ converges to $E[R(Y_1 + \bar z)]$ at the same rate as $z \to \bar z$. This and Lemma 3 imply that $I_z(R^*) \ge k(z - \bar z)^2$. It therefore remains to derive the rate of convergence of

$$\int_{\bar z}^{z_1} e^{-k(z - \bar z)^2 n}\,dH(z) \tag{A.19}$$

$^{32}$Let $X_a$ be $R(Y_1 + z)$ where $a = E[R(Y_1 + z)]$, and let $\bar a = E[R(Y_1 + z(R^*))]$. Then all assumptions of Lemma 3 are fulfilled. Differentiability follows since the function $z \mapsto e^{\theta R(Y_1 + z)}$ is differentiable except on a set of measure zero. Thus, Leibniz's rule of differentiation under the integral can be applied.


as $n \to \infty$. Let $h$ denote the density function for $H$. Substituting $z$ by $\bar z + z/\sqrt{n}$ in (A.19) implies

$$\int_{\bar z}^{z_1} e^{-k(z - \bar z)^2 n}\,dH(z) \le \frac{1}{\sqrt{n}} \int_{-\infty}^{\infty} e^{-k z^2}\,h\!\left(\bar z + \frac{z}{\sqrt{n}}\right) dz. \tag{A.20}$$

Note that $e^{-k z^2} h(\bar z + z/\sqrt{n})$ converges to $e^{-k z^2} h(\bar z)$ as $n \to \infty$. Thus, there exists a constant $k_1$ such that the right-hand side of (A.20) is bounded above by $k_1/\sqrt{n}$. The result follows now from (A.18) and (A.20).
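The $1/\sqrt{n}$ rate in the last step can be seen in a simple special case. The sketch below assumes, for illustration only, that $\bar z = 0$, $z_1 = 1$, and that $H$ is uniform on $[0, 1]$ (density $h \equiv 1$); none of these choices come from the paper. The integral in (A.19) then equals $\frac{1}{2}\sqrt{\pi/(kn)}\,\operatorname{erf}(\sqrt{kn})$, which is bounded above by $k_1/\sqrt{n}$ with $k_1 = \frac{1}{2}\sqrt{\pi/k}$. The function names are illustrative.

```python
import math


def tail_integral(k, n, steps=20000):
    """Midpoint-rule value of the integral of exp(-k * z**2 * n) over [0, 1].

    This is (A.19) under the assumed special case z_bar = 0, z_1 = 1, H uniform.
    """
    dz = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * dz
        total += math.exp(-k * n * z * z)
    return total * dz


def tail_closed_form(k, n):
    """Closed form of the same integral via the error function."""
    c = k * n
    return 0.5 * math.sqrt(math.pi / c) * math.erf(math.sqrt(c))


def bound_constant(k):
    """A constant k_1 with integral <= k_1 / sqrt(n) in this special case."""
    return 0.5 * math.sqrt(math.pi / k)
```

Multiplying `tail_integral(k, n)` by `math.sqrt(n)` for increasing `n` shows the product approaching `bound_constant(k)` from below, which is the polynomial (rather than exponential) rate that this part of the bound contributes.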


References

P. Billingsley (1968), "Convergence of Probability Measures," John Wiley & Sons: New York.

D.W. Diamond (1984), "Financial Intermediation and Delegated Monitoring," Review of Economic Studies 51, 393-414.

K. Dowd (1991), "Optimal Financial Contracts," forthcoming Oxford Economic Papers.

D. Gale and M. Hellwig (1985), "Incentive-Compatible Debt Contracts: The One Period Problem," Review of Economic Studies 52, 647-663.

M. Gertler (1988), "Financial Structure and Aggregate Economic Activity: An Overview," Journal of Money, Credit and Banking 20(3), 559-587.

S. Krasa and A. P. Villamil (1991a), "Monitoring the Monitor: an Incentive Structure for a Financial Intermediary," forthcoming Journal of Economic Theory.

S. Krasa and A. P. Villamil (1991b), "Optimal Contracts with Costly State Verification: the Multilateral Case," BEBR Working Paper 91-0127, University of Illinois.

J. Panzar (1989), "Technological Determinants of Firm and Industry Structure," Handbook of Industrial Organization I, Elsevier Science Publishers: Amsterdam.

K. Parthasarathy (1977), "Introduction to Probability and Measure," Macmillan Press: London.

E.C. Prescott (1986), "Theory Ahead of Measurement," Quarterly Review, Minneapolis Federal Reserve Bank: Minneapolis, MN.

D.W. Stroock (1984), "An Introduction to the Theory of Large Deviations," Universitext, Springer.

R.M. Townsend (1979), "Optimal Contracts and Competitive Markets with Costly State Verification," Journal of Economic Theory 21, 1-29.

R.M. Townsend (1988), "Information Constrained Insurance: The Revelation Principle Extended," Journal of Monetary Economics 21, 411-450.

S. Varadhan (1984), "Large Deviations and Applications," CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, PA.

S.D. Williamson (1986), "Costly Monitoring, Financial Intermediation, and Equilibrium Credit Rationing," Journal of Monetary Economics 18, 159-179.

S.D. Williamson (1989), "Restrictions on Financial Intermediaries and Implications for Aggregate Fluctuations: Canada and the United States 1870-1913," NBER Macro Annual 4, 303-340.
