
Digitized by the Internet Archive 

in 2012 with funding from 

University of Illinois Urbana-Champaign 

Faculty Working Paper 92-0100 




A Theory of Optimal Bank Size 



Stefan Krasa 

Department of Economics 
University of Illinois 

Anne P. Villamil

Department of Economics 

University of Illinois 


Bureau of Economic and Business Research 

College of Commerce and Business Administration 

University of Illinois at Urbana-Champaign 




January 1992 


A Theory of Optimal Bank Size 

Stefan Krasa Anne P. Villamil* 

First Draft: May 1991 This Draft: December 1991 


This paper provides a theory of optimal bank size determination 
with implications for the size distribution of banks in a model with 
asymmetric information and costly state verification. Production is 
subject to minimum scale requirements and two types of risk: diver- 
sifiable idiosyncratic project risk, and imperfectly diversifiable aggre- 
gate "macroeconomic" risk. We first show that delegated monitoring 
with two-sided simple debt contracts dominates direct investment if 
the cost of monitoring the intermediary is bounded and if the variance 
of the non-diversifiable macroeconomic risk is sufficiently small. We 
next show that: (i) banks are of finite size; (ii) bank size is inversely re- 
lated to the bank's exposure to macroeconomic risk, and (iii) multiple 
banks co-exist with the same size within a locale but with (possibly) 
different sizes across locales. 

*Address of the authors: Department of Economics, University of Illinois, 1206 South
Sixth Street, Champaign, IL 61820. 

We gratefully acknowledge useful comments from Anthony Courakis and financial sup- 
port from the National Science Foundation (SES 89-09242). 

1 Introduction 

Recent research has studied how the structure of the financial system affects 
the transmission of business cycle shocks in an economy. See Gertler (1988) 
for an excellent survey of this emerging literature. An equally important but 
quite different problem is the following: How do business cycle fluctuations 
affect the structure of the financial system? This problem is important be- 
cause all developed countries are subject to regular and recurrent business 
cycle fluctuations. Further, banks in most countries operate under restrictive 
regulations which limit their ability to insure fully against macroeconomic 
risk. For example, in the U.S. there are branching restrictions which limit 
the geographic operation of banks and portfolio restrictions which limit the 
types of assets that banks may hold (e.g., Savings and Loan Institutions have 
been subject to such restrictions). It is obvious that these restrictions can 
lead to non-diversifiable portfolio risk, but little is known about the impli- 
cations for financial structure. This problem is of particular interest at the 
present time. The European Community is currently involved in a transition 
toward an economic and monetary union (EMU). To date, most discussions 
of the EMU have focused on the creation of a European central bank and its 
supervision of a single currency. However, once an administrative structure 
is in place, it will undoubtedly impose "pan European" restrictions. What
are the likely implications of such restrictions for the future European bank- 
ing system? This paper proposes a theoretical model that can be used to 
analyze this important policy question. 

We propose a theory of optimal bank size determination which has im- 
plications for the size distribution of banks. We consider a costly state ver- 
ification model with finitely many borrowers and lenders where production 
is subject to: (i) minimum project scale requirements, (ii) diversifiable id- 
iosyncratic project risk, and (iii) a non-diversifiable aggregate (i.e., macroe- 
conomic) risk. Agents can undertake production in two ways. Borrowers and 
lenders can write direct bilateral investment contracts, or they can engage in 
intermediated investment by contracting with a bank that accepts deposits 
from lenders and grants loans to borrowers. Because there is non-trivial de- 
fault risk in the economy, the lenders must monitor either the borrowers (in 
the direct investment problem) or the bank (in the intermediated investment 
problem) in default states which occur with strictly positive probability. 

The minimum project scale requirement implies that it takes multiple
lenders to finance the project of a single borrower. It is well known (cf.,
Diamond (1984) or Williamson (1986)) that "delegated monitoring" may 
be optimal under this requirement because it allows lenders to economize 
on monitoring costs. However, in an economy with non-trivial default risk 
lenders must "monitor the monitor" (i.e., bank) because the bank may mis- 
report the state to minimize its payments to lenders. Thus, the essential 
problem that lenders face is to provide the bank with an incentive to report 
truthfully to them. Krasa and Villamil (1991a) study this problem in a fi- 
nite economy that is subject to diversifiable default risk. They show that a 
particular type of contract commonly issued by banks (i.e., two-sided simple 
debt) solves this monitoring problem optimally. In contrast, in this paper 
we are concerned with the optimal investment arrangement (given two-sided 
simple debt contracts) when there is both diversifiable project risk and non- 
diversifiable aggregate or "macroeconomic" risk. When a bank is subject 
to non-diversifiable project (and hence default) risk, costly monitoring will 
necessarily occur in some states — regardless of the bank's size. 

In choosing an optimal portfolio size (i.e., scale of operation) a bank faces 
the following tradeoff for most monitoring cost structures. Increasing the size 
of the bank's portfolio (i.e., contracting with additional borrowers) given 
some initial bank size generally decreases the bank's default probability, but 
increases the lenders' cost of monitoring the bank. Thus, the crucial question 
that the bank faces in choosing an optimal portfolio size is: Under what 
circumstances do the gains from decreased default risk dominate the losses 
from increased monitoring costs when the bank compares its current scale of 
operation with an increased scale of operation? We begin our analysis of this 
question with Theorem 1 which shows that delegated monitoring with two- 
sided simple debt contracts (i.e., intermediated investment) dominates direct 
investment if the lenders' cost of monitoring the intermediary is bounded and 
the variance of the non-diversifiable macroeconomic risk is sufficiently small. 
We interpret the variance of the macroeconomic risk as the magnitude of 
business cycle fluctuations. For the U.S., Prescott (1986) reports that the 
standard deviation of output for the period 1872 to 1985 is only 1.8 percent. 

Theorem 2 provides the main result of the paper: A theory of optimal 
bank size with implications for the size distribution of banks. To our knowl- 
edge, this has been a neglected branch of the literature on financial inter- 
mediation. Of course, the problem of firm size distribution has been studied 
extensively in industrial organization theory. Panzar (1989, p. 33) summarizes
the findings from this literature by noting that firm size "is determined
in large part by the . . . cost function," while industry structure (i.e., limits 
on the number and size distribution of firms that are present in equilibrium) 
"is determined by the market demand curve." Our Theorem 1 implies that 
this traditional industrial organization analysis is inadequate for a theory of 
financial structure. In Theorem 2 we develop a measure of the rate of port- 
folio diversification accruing from increases in bank size, and then show that 
both risk and cost considerations are essential determinants of bank size. 
The theory has three predictions that in principle are empirically testable. 
First, banks will be of finite size with the precise scale dependent upon the 
structure of monitoring costs and the degree of portfolio diversification that 
the bank can attain. Second, banks that are better able to diversify risk (e.g., 
because they are subject to less stringent portfolio restrictions) will be larger 
in size than banks which are less able to diversify risk. Third, multiple banks 
with similar risk and cost characteristics may co-exist. The first prediction 
pertains to firm size, while the latter two are industry structure predictions. 
Clearly, firm size and industry structure are related; however, we shall return 
to a more precise discussion of this relationship in Section 6. 

2 The Model 

Consider an economy with finite numbers of two types of risk-neutral agents, 
borrowers and lenders. Each borrower $i = 1, \ldots, n$ is endowed with a risky
investment project which transforms one unit of a single input at time zero
into $x_i$ units of output at time one, where $x_i$ is the realization of a random
variable $X_i$ on the probability space $(\Omega, \mathcal{A}, P)$.1 For simplicity assume that
borrowers have zero endowment of the input. Every lender $j = 1, \ldots, m$ is
endowed with $a < 1$ units of a homogeneous input,2 but has no direct access
to a productive technology. Thus, the project of a borrower cannot be
financed by a single lender, which implies that in the absence of intermediation
more than one lender would have to verify a single borrower. The total
available supply of investment is larger than the input required by all borrowers,
so $m$ lenders can be accommodated by the $\bar{n}$ borrowers (i.e., $\bar{m} a > \bar{n}$).

1 We will implicitly refer to this probability space when writing $P$ for probability and
$E$ for expected value.

2 For technical purposes assume that $1/a$ is an integer.

Also, assume there is a riskless alternative investment project available to all 
lenders that yields return r with probability one. 

All borrowers and lenders are fully informed about the distribution of 
Xi at time zero, but asymmetric information exists about the state of the 
project's actual realization ex-post: Only borrower i costlessly observes the 
realization $x_i$ of his/her project at time one. Let $F_i(x)$ denote the distribution
of borrower $i$'s project and assume that the $F_i$ are identical (i.e., all $X_i$ have
the same distribution).3 We now depart from the standard intermediation
framework by considering an economy with non-trivial correlation among
the $X_i$.4 Assume that each $X_i$ can be decomposed into independent random
variables $Y_i$, $i = 1, \ldots, n$, and $Z$, where $Y_i$ is an idiosyncratic risk associated
with borrower $i$'s project and $Z$ is a non-diversifiable "macroeconomic" risk
common to all borrowers. Thus,

$$X_i = Y_i + Z.^{5} \tag{1}$$

Assume that the distributions of $Y_i$ and $Z$ have continuous density functions,
$X_i \geq 0$ for every $i$ (because borrowers can never produce "negative output"
no matter how bad the macroeconomic shock), and that each $X_i$ is bounded
from above.
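The decomposition in (1) can be illustrated numerically. The following is a minimal sketch, not part of the model: the uniform distributions for $Y_i$ and $Z$, the bound `z_max`, and the function names are assumptions chosen only for illustration. It shows that a common shock $Z$ makes the outputs of two borrowers positively correlated even though the $Y_i$ are independent.

```python
import math
import random


def simulate_outputs(n_borrowers, n_draws, z_max=0.5, seed=0):
    """Draw outputs X_i = Y_i + Z for each borrower.

    Assumed (hypothetical) distributions: Y_i ~ U[0, 1] independently,
    Z ~ U[0, z_max] common to all borrowers within a draw.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = rng.uniform(0.0, z_max)  # common "macroeconomic" shock
        draws.append([rng.uniform(0.0, 1.0) + z for _ in range(n_borrowers)])
    return draws


def correlation(xs, ys):
    """Sample correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    draws = simulate_outputs(n_borrowers=2, n_draws=20000)
    x1 = [d[0] for d in draws]
    x2 = [d[1] for d in draws]
    # Under these assumptions the population correlation is
    # Var(Z) / (Var(Y) + Var(Z)) = 0.2.
    print(round(correlation(x1, x2), 3))
```

Setting `z_max=0.0` removes the common shock, and the sample correlation then falls to approximately zero.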

Let a technology exist which can be used by agents other than borrower $i$
to verify at time one the realization $x_i$ of project $X_i$. Assume that this state
verification technology is costly to use, and that when verification occurs,
$x_i$ is privately revealed only to the individual who requests (deterministic)
verification. Assume that the verification cost comprises both a pecuniary
component and an indirect "pecuniary equivalent" of a non-pecuniary
cost.6 These costs may be thought of as the money paid to an attorney to
file a claim (a pecuniary cost) and the monetary value of time lost when
visiting the attorney (a pecuniary equivalent). Because agents have asymmetric
information, a key problem is to ensure that the borrower reports
the realization truthfully. We, like Williamson (1986), use the costly state
verification framework to solve this problem. This model was introduced by
Townsend (1979). However, unlike in Townsend's model where $x_i$ is publicly
announced after verification occurs, in our model $x_i$ is privately revealed only
to the agent who requests verification.7 This assumption is essential for our
analysis since if all information could be made public ex-post, there would
be no need to verify the bank in default states. However, it also appears to
accurately describe the privacy and institutional features which characterize
most lending arrangements. For example, Diamond (1984, p. 395) observes:
"Financial intermediaries in the world monitor much information about their
borrowers in enforcing loan covenants, but typically do not directly announce
this information or serve an auditor's function."

3 This assumption simplifies the analysis but is not essential for the results.

4 See Dowd (1991) for an excellent survey of the literature on financial intermediation.

5 To simplify the analysis we make the following technical assumption: Let $\Omega = \Omega_1 \times \Omega_2$
and let $P = P_1 \times P_2$, where $P_i$ is a probability on $\Omega_i$ for $i = 1, 2$. Assume that the
random variables $Y_i$ are independent of $\Omega_2$; i.e., for every $\omega_1 \in \Omega_1$ the mapping
$\omega_2 \mapsto Y_i(\omega_1, \omega_2)$ is constant on $\Omega_2$. Similarly, assume that $\omega_1 \mapsto Z(\omega_1, \omega_2)$ is constant on $\Omega_1$.
This condition implies independence of $Z$ and $Y_i$ for every $i \in \mathbb{N}$, but is stronger than
independence. In our analysis, standard independence would require conditions on $\Omega$ which
imply the existence of a regular conditional probability $P(\cdot \mid Z)$ (cf., Parthasarathy (1977,
Proposition 46.5)).

6 The non-pecuniary costs permit negative utility but rule out negative consumption,
while pecuniary equivalents of non-pecuniary costs ensure that the costs can be shared by
the contracting parties.

7 See Krasa and Villamil (1991b) for an analysis of an economy with multiple heterogeneous
agents, costly state verification, public announcement, and deterministic or
stochastic verification.

3 Contract Arrangements

There are two types of basic investment arrangements, notably those involving
direct contracts between "primary" borrowers and lenders, and those
involving an intermediary. In Section 3.1 we consider the direct investment
problem. In Section 3.2 we consider intermediated investment where lenders
and borrowers write contracts with a bank, and the bank is subject to non-trivial
default risk.

3.1 Direct Investment

Let all direct, bilateral interactions between lenders and borrowers be regulated
by a contract whose general form is defined as follows.

Definition 1. A one-sided contract between lenders and borrowers is a
pair $(R(\cdot), S)$, where $R(\cdot)$ is an integrable, positive payment function on $\mathbb{R}_+$
such that $R(x) \leq x$ for every $x \in \mathbb{R}_+$, and $S$ is an open subset of $\mathbb{R}_+$ which
determines the states where monitoring occurs.

The contract $(R(\cdot), S)$ describes the total claims against the borrower
by all lenders. If a lender invests $b$ units of capital in a borrower's project,
then his/her claim against the borrower is given by $bR(x)$, where $x$ is the borrower's
announced wealth realization. Following standard practice in this literature,
we restrict the universe of contracts to the set of incentive-compatible
contracts and denote this set by $C = (R(\cdot), S)$. Consequently, the realization
announced by each borrower is the true realization. The following condition
ensures that all contracts under consideration satisfy this restriction: There
exists $\bar{R} \in \mathbb{R}_+$ such that $S = \{x : R(x) < \bar{R}\}$. The imposition of this restriction
is without loss of generality because the Revelation Principle establishes
that any arbitrary contract can be replaced by an incentive-compatible contract
with the same actual payoff (cf., Townsend (1988, p. 416)). Therefore,
the set of all incentive-compatible contracts is fully specified by the tuple
$(R(\cdot), \bar{R})$.

We study a particular type of contract, called a simple debt contract, 
which is defined as follows: 

Definition 2. $(R(\cdot), \bar{R})$ is a simple debt contract if: $R(x) = x$ for $x \in S = \{x < \bar{R}\}$
and $R(x) = \bar{R}$ if $x \in S^c = \{x \geq \bar{R}\}$.

The payment schedules in Definition 2 resemble simple debt because:
(i) When verification occurs the payment to the lender is state contingent
(i.e., the borrower pays the entire realization for all outcomes below a
cutoff level), where the verification set $S$ is viewed as the set of bankruptcy
states.
(ii) When verification does not occur the payment to the lender is constant
(i.e., the borrower pays a fixed amount $\bar{R}$ for all realizations of the state
above the cutoff), where $S^c$ is the set of all realizations where verification
does not occur.
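The payment schedule in Definition 2 can be written as a short function. The sketch below is only illustrative; the names `simple_debt_payment` and `is_verified` are hypothetical and `face_value` stands for $\bar{R}$.

```python
def simple_debt_payment(x, face_value):
    """Payment R(x) under simple debt: the whole realization x in the
    verification region {x < face_value}, the face value otherwise."""
    return x if x < face_value else face_value


def is_verified(x, face_value):
    """True when x lies in the verification (bankruptcy) set S."""
    return x < face_value


if __name__ == "__main__":
    for x in (0.3, 0.8, 1.2):
        print(x, simple_debt_payment(x, 0.8), is_verified(x, 0.8))
```

The payment never exceeds the realization, so the restriction $R(x) \leq x$ in Definition 1 holds whenever the realization is non-negative.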

Townsend (1979) proved that debt contracts are optimal responses to 
asymmetric information problems in economies with deterministic costly 
state verification technologies because such contracts minimize verification 
costs. Agents verify only low realizations of $X_i$ and accept fixed payments
(which do not require monitoring) in all other states. Gale and Hellwig (1985)
and Williamson (1986) showed that simple debt is the optimal contract
among all one-sided investment schemes. In contrast to debt contracts, where 
the borrower is the residual claimant in the default state (and hence may re- 
ceive a non-zero payment if the bankruptcy is not "too severe"), a simple 
debt contract requires the borrower's entire project realization to be trans- 
ferred to the lender in default states (i.e., see (i) above). This result will be 
useful in the analysis that follows, thus we state it formally. 

Theorem GHW. Simple debt is the optimal contract among all one-sided 
investment schemes. 

The strategy of the proof of Theorem GHW is as follows.8 Consider two
optimal contracts. Let $(R(\cdot), \bar{R})$ be a simple debt contract and $(A(\cdot), \bar{A})$
be some alternative contract. Since both contracts are optimal, both must
provide borrowers with the same expected payoff. Under the simple debt
contract lenders request costly state verification if $x < \bar{R}$, and under the
alternative contract lenders request verification if $x < \bar{A}$. Clearly $\bar{A} > \bar{R}$
(otherwise the contracts cannot have the same expected return to borrowers),
thus the expected verification costs must be less for the simple debt contract.

In our economy with correlation among project realizations, the direct 
investment problem is identical to the investment problem in an economy 
without correlation among projects. This follows from the fact that when 
borrowers and lenders write direct bilateral contracts, every lender must ver- 
ify the borrower with whom he/she contracts in default states. Hence, non- 
trivial correlation among projects is irrelevant (and simple debt contracts 
remain optimal). Note that even though lenders have the opportunity to in- 
vest in more than one project (i.e., contract with more than one borrower), it 
is not optimal for them to do so because they reap no gain from diversifying 
idiosyncratic risk while they incur higher expected monitoring costs. For example,
suppose that an agent invests in two projects. Assume that the total
outstanding debt of borrowers $i = 1, 2$ is given by the contract $(R_i(\cdot), \bar{R}_i)$,
and let $a_i$ be the capital the lender invests in each project. If neither project
fails, then the payoff is simply given by $a_1 \bar{R}_1 + a_2 \bar{R}_2$. Thus, there is
no gain in the good state from investing in multiple projects. However, the
probability that at least one of the two projects fails is strictly higher than
the probability that a single project fails.9
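The coin-toss illustration in footnote 9 can be checked with a one-line calculation. The sketch below assumes independent defaults with a common default probability `p`, which is a simplification of the model (it ignores the common shock $Z$); the function name is hypothetical.

```python
def prob_at_least_one_default(p, k):
    """Probability that at least one of k projects defaults, assuming
    independent defaults that each occur with probability p."""
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    print(prob_at_least_one_default(0.5, 1))  # 0.5, one fair coin
    print(prob_at_least_one_default(0.5, 2))  # 0.75, two fair coins
```

Because the lender must verify whenever any of the projects fails, the probability of incurring a monitoring cost rises with the number of projects financed directly.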

8 See Gale and Hellwig (1985) or Williamson (1986) for a formal proof. 

9 For example, consider the random event to be the toss of a fair coin, where "head" 

The direct investment problem between a borrower and lenders can now 
be stated. Let c denote the lenders' cost of monitoring a borrower. 

Problem 3.1. Choose an incentive-compatible contract $(R(\cdot), S)$ to:

$$\max \int_0^\infty [x - R(x)]\, dF(x)$$

subject to:

$$a \int_0^\infty R(x)\, dF(x) - \int_S c\, dF(x) \geq r a. \tag{2}$$

In Problem 3.1 (the direct investment problem), the expected utility of a rep- 
resentative borrower is maximized subject to a constraint that the lenders' 
expected return, net of monitoring costs (c), be at least as great as some 
reservation level (r). The first term in the lenders' constraint is multiplied 
by a in order to account for each individual lender's capital investment a. 
Without loss of generality we assume that each lender invests all of his/her 
endowment in a single project. Finally, Problem 3.1 reflects the assumption 
that credit markets are competitive. There are more lenders who wish to 
invest than investment opportunities. Thus, the supply of loans is inelas- 
tic, and the level of return necessary to attract lenders is driven down to 
the reservation level r, the return available on the alternative investment 

3.2 Intermediated Investment 

Now consider an intermediated borrowing and lending problem. In the previ- 
ous section (i.e., the one-sided problem), lenders and borrowers wrote direct 
bilateral contracts and correlation among projects (as long as it was not 
trivial) was irrelevant. However, duplicative monitoring is inherent in the 

is non-default, and "tail" is a default. Clearly, for a single coin toss the probability of a 
default is 0.5. If the agent "invests" in two coin tosses, the probability that at least one 
of the two projects fails is 0.75. Thus expected monitoring costs will be higher. This is 
true even if there is some correlation among projects, if idiosyncratic risk is non-trivial. 
If the idiosyncratic risk is trivial (i.e., $X_i = Z$, so only macroeconomic risk matters), then
every agent could monitor only a single project (since the realization of all projects can 
be determined by the outcome of any one project) and the expected payoff from investing 
in one project or many projects is the same. 

direct investment problem because each lender must verify each borrower 
with whom he/she contracts in certain states of nature. Thus, there may ex- 
ist gains from "delegated monitoring" (cf., Diamond (1984)), where lenders 
elect a monitor to perform the verification task and thereby eliminate some 
of the duplicative monitoring associated with direct investment. In contrast 
to previous delegated monitoring studies, our economy has an important 
feature which significantly complicates the "standard" delegated monitoring 
problem. The intermediary faces non-trivial default risk for two reasons: 
(i) Since there are only finitely many borrowers, it is not clear that the
intermediary can completely diversify idiosyncratic risk.
(ii) Even if the intermediary can eliminate idiosyncratic risk, its portfolio is
still subject to non-diversifiable "macroeconomic" risk.
Thus, in our economy the probability that the bank may default is non-zero 
(because at least the macroeconomic risk is non-diversifiable), so lenders must 
verify the bank with strictly positive probability (i.e., in some states). 

We begin our analysis of the delegated monitoring problem by considering 
how agents select an intermediary. Since the loan market is competitive, any 
lender who wishes to act as an intermediary must offer contracts which max- 
imize the expected utility of the borrowers and assure the remaining lenders 
of at least the reservation level of utility (r), which is determined by the 
riskless rate of return on the alternative project. Otherwise, agents would 
trade directly or another intermediary would offer an alternative contract 
(i.e., there is free entry into intermediation) with terms that are preferable 
to the $n$ borrowers and/or the remaining $m - 1$ lenders. Let $(R(\cdot), S)$ denote
aspects of the two-sided contract which pertain to the borrower-intermediary
relationship and $(R^*(\cdot), S^*)$ denote aspects of the two-sided contract which
pertain to the intermediary-lender relationship.10 The intermediary's problem
clearly embodies optimization by all agents in the economy.

We next derive random variables which describe the income from the
intermediary's portfolio. Recall that $R(x)$ denotes the payoff by borrower $i$
to the intermediary if output $x$ is realized, $X_i$ is the random variable which
describes the output $x$ of a particular borrower $i$ in state $x$, and $X_i = Y_i + Z$
from equation (1), where the $Y_i$ are independent random variables but the
$X_i$ are not independent for $Z \neq 0$. The intermediary's income from borrower
$i$, given transfer $R(\cdot)$, can now be defined by

$$G_i(R(\cdot); \omega) = R(X_i(\omega)), \tag{3}$$

where $\omega \in \Omega$ denotes the state of nature. Because the $X_i$ are not independent,
it follows that in general the random variables $G_i$ are not independent.
If the intermediary contracts with $i = 1, 2, \ldots, n$ borrowers,11 its
average income per borrower under payment schedule $R(\cdot)$ is:

$$G^n(R(\cdot); \omega) = \frac{1}{n} \sum_{i=1}^{n} G_i(R(\cdot); \omega).$$

Denote the distribution function of $G^n(\cdot)$ by $F^n(\cdot)$.

10 $(R(\cdot), S)$ is also used in the direct investment problem in Section 3.1. We do not
introduce additional notation in this Section because the structure of the problem is the
same regardless of whether borrowers report to the lenders or to the bank.
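The behavior of the intermediary's average income per borrower can be explored by simulation. The sketch below is an illustration under assumptions that are not in the paper: loans are simple debt with a hypothetical face value of 0.9, $Y_i$ is uniform on $[0, 1]$, and $Z$ is uniform on $[0, z_{width}]$. It estimates the standard deviation of the average income for a given number of borrowers.

```python
import random
import statistics


def average_income(face_value, n, rng, z_width=0.5):
    """One draw of the average income per borrower when each of n borrowers
    pays min(Y_i + Z, face_value). Assumed: Y_i ~ U[0, 1], Z ~ U[0, z_width]."""
    z = rng.uniform(0.0, z_width)  # common shock shared by all n loans
    payments = (min(rng.uniform(0.0, 1.0) + z, face_value) for _ in range(n))
    return sum(payments) / n


def income_std(face_value, n, draws=2000, seed=1, z_width=0.5):
    """Standard deviation of the average income across simulated states."""
    rng = random.Random(seed)
    sample = [average_income(face_value, n, rng, z_width) for _ in range(draws)]
    return statistics.pstdev(sample)


if __name__ == "__main__":
    for n in (1, 10, 200):
        print(n, round(income_std(0.9, n), 3))
    # Without the common shock, diversification removes almost all risk.
    print(round(income_std(0.9, 200, z_width=0.0), 3))
```

With the common shock present, the dispersion of the average income falls as $n$ grows but levels off at a strictly positive value; with `z_width=0.0` it continues to shrink toward zero.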

The two-sided contract between the intermediary and each borrower, and 
the intermediary and the lenders, can now be defined. 

Definition 3. A two-sided contract is a four-tuple $((R(\cdot), S), (R^*(\cdot), S^*))$
with the following properties:
(i) $R(\cdot)$ is an integrable positive payment function from a borrower to the
intermediary such that $R(x) \leq x$ for every $x \in \mathbb{R}_+$, and $S$ is an open
subset of $\mathbb{R}_+$ which determines the set of all realizations of a borrower's
project where the intermediary must monitor;
(ii) $R^*(\cdot)$ is an integrable positive payment function from the intermediary
to the lenders such that $R^*(x) \leq x$. For every realization $x$ of $G^n(\cdot)$,
the payment to an individual lender is given by $\frac{n}{m-1} R^*(x)$;12 and $S^*$ is
an open subset of $\mathbb{R}_+$ which determines the set of all realizations of the
intermediary's income from the borrowers the lenders must verify.

We now derive the set of all incentive-compatible two-sided contracts.
Each borrower will announce an output which minimizes its payment obligations
to the intermediary. Let $\bar{x} = \arg\min_{x \in S^c} R(x)$ be the output that
minimizes this payoff over all non-monitoring states, and recall that $x$ is observed
directly in the monitoring states $S$. Consequently, the announcement
by a borrower is given by $\arg\min_{y \in \{x, \bar{x}\}} R(y)$. A similar condition holds
for the intermediary-lender portion of the contract (i.e., $(R^*(\cdot), \bar{R}^*)$). As in
the one-sided problem, the following condition ensures that all contracts are
incentive-compatible. There exist $\bar{R}, \bar{R}^* \in \mathbb{R}_+$ such that $S = \{x : R(x) < \bar{R}\}$
and $S^* = \{x : R^*(x) < \bar{R}^*\}$. The set of all incentive-compatible two-sided
contracts is fully specified by the four-tuple $((R(\cdot), \bar{R}), (R^*(\cdot), \bar{R}^*))$.
A two-sided simple debt contract is then defined as follows:

Definition 4. A contract $((R(\cdot), \bar{R}), (R^*(\cdot), \bar{R}^*))$ is a two-sided simple debt
contract if:
(i) $R(x) = x$ for $x \in S = \{x < \bar{R}\}$ and $R(x) = \bar{R}$ if $x \in S^c = \{x \geq \bar{R}\}$; and
(ii) $R^*(x) = x$ for $x \in S^* = \{x < \bar{R}^*\}$ and $R^*(x) = \bar{R}^*$ if $x \in S^{*c} = \{x \geq \bar{R}^*\}$.
We will often denote two-sided simple debt contracts by $(\bar{R}, \bar{R}^*)$.

11 Note that $n$ need not equal $\bar{n}$. In fact, this paper shows that in general it will not
be optimal for a bank to become as large as possible.

12 $R^*(\cdot)$ is the total payment by the intermediary to lenders per borrower. Since the
intermediary has a positive initial endowment, $m - 1$ lenders are sufficient to finance the
$n$ projects. Thus, to derive the payment to an individual lender, multiply this amount
by $\frac{n}{m-1}$.
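The deposit side of a two-sided simple debt contract can be sketched in the same way as the loan side. The function names and the numbers below are hypothetical; `deposit_face` stands for $\bar{R}^*$, `avg_income` for a realization of the bank's average income per borrower, and the scaling by $n/(m-1)$ follows Definition 3 (ii).

```python
def deposit_payment_per_borrower(avg_income, deposit_face):
    """R*(x): the bank hands over its whole average income per borrower in
    the verification region, and the face value otherwise."""
    return avg_income if avg_income < deposit_face else deposit_face


def lender_payment(avg_income, deposit_face, n, m):
    """Payment to one of the m - 1 lenders when the bank holds n loans."""
    return n / (m - 1) * deposit_payment_per_borrower(avg_income, deposit_face)


def bank_residual_per_borrower(avg_income, deposit_face):
    """What the bank keeps per borrower, before its own monitoring costs."""
    return avg_income - deposit_payment_per_borrower(avg_income, deposit_face)


if __name__ == "__main__":
    # Hypothetical numbers: 10 loans financed by 20 lenders plus the bank.
    print(lender_payment(0.5, 0.6, n=10, m=21))   # bank defaults, lenders verify
    print(lender_payment(0.8, 0.6, n=10, m=21))   # bank pays the face value
    print(bank_residual_per_borrower(0.8, 0.6))
```

In a default state the bank's residual is zero, which mirrors the borrower's position under the one-sided simple debt contract.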

The intermediary's two-sided optimization problem can now be stated. 
Let c denote the intermediary's cost of monitoring the borrowers, and let c* 
denote the lenders' cost of monitoring the intermediary. In Section 4 these
monitoring costs shall be discussed in more detail.

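Problem 3.2. Choose incentive-compatible contracts $((R(\cdot), \bar{R}), (R^*(\cdot), \bar{R}^*))$ to:

$$\max \int_0^\infty [x - R(x)]\, dF(x)$$

subject to:

$$\frac{n}{m-1} \int_0^\infty R^*(x)\, dF^n(R(\cdot), \bar{R})(x) - \int_{S^*} c^*\, dF^n(R(\cdot), \bar{R})(x) \geq r \tag{4}$$

$$\int_0^\infty [x - R^*(x)]\, dF^n(R(\cdot), \bar{R})(x) - \int_S c\, dF(x) \geq r. \tag{5}$$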

Problem 3.2 states that the intermediary maximizes the expected utility of 
each ex-ante identical borrower subject to two constraints. (4) states that 
the expected payoff to the $m - 1$ remaining lenders (i.e., those who did not
become intermediaries) must be at least $r$, the level of utility available from
the alternative project. (5) states that the profit from intermediation (i.e.,
net payoffs from the borrowers less the payoff to the lenders) must also be at
least $r$. Note that the bank's decision variables are the loan contract $R(\cdot)$,
the deposit contract $R^*(\cdot)$, and the number of projects $n$. The number of
lenders is determined by the choice of $n$.


4 Optimal Investment Arrangements 

The structure of the optimal investment arrangement will depend crucially 
on the nature of the monitoring costs, c and c*, because default risk is non- 
trivial and monitoring will occur with positive probability. We now proceed 
to prove Theorem 1 which establishes that delegated monitoring is optimal 
when the lenders' costs of monitoring the intermediary are bounded and the 
variance of the nondiversifiable macroeconomic risk is sufficiently small. The 
proof of Theorem 1 depends on continuity of the constraints of Problem 3.2
in the face values $\bar{R}$ and $\bar{R}^*$ of the two-sided debt contract, which follows from
Lemma 1 in Krasa and Villamil (1991a). The strategy of the proof of the
Theorem is as follows. Let $\bar{R}$ be the simple debt contract which is optimal
among all one-sided schemes described by Theorem GHW in Section 3.1.
We show: (i) there exists an alternative two-sided debt contract $(\bar{R}, \bar{R}^*)$ such
that (4) is satisfied and binding; (ii) (5) is fulfilled but does not bind under
$(\bar{R}, \bar{R}^*)$; and (iii) by increasing the face value of the lenders' debt (say to
$\bar{B}^* > \bar{R}^*$) the payoff to the lenders increases.13 Then by continuity of the
constraints in the face value of the lenders' debt, a two-sided contract $(\bar{R}, \bar{B}^*)$
can be found such that both constraints are slack. Finally, by continuity of
the constraints in $\bar{R}$, the face value of the borrowers' debt can be lowered
with both constraints still satisfied. Thus, the delegated monitor offers better
contracts to agents than the best feasible direct investment contract, which
proves the Theorem. The argument requires $n$, the number of borrowers, to
be sufficiently large; a more precise characterization of $n$ shall be provided
in Section 5.

Theorem 1. Assume that c* is bounded and that the variance of Z is 
sufficiently small. Then delegated monitoring with two-sided debt contracts 
dominates direct investment. 

Proof. Consider first the investors' cost of monitoring the bank. Recall 
that the bank faces two types of risk: a diversifiable, project-specific risk 
y,, and a non-diversifiable macroeconomic risk Z. Thus, the banks default 
probability will in general not converge to zero (even if it contracts with a 

I3 In general, the lenders' payoff does not increase monotonically with R' because the 
probability that lenders must verify the intermediary is an increasing function of/?". This 
is also true for one-sided schemes (cf., Gale and Hellwig (1985, p. 662)). 


large number of borrowers). By Lemma 1 of the Appendix, the average payoff
from borrowers to the intermediary converges to the expected conditional
return $E[R(X_1) \mid Z]$: 14

$$\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z) \to E[R(Y_1 + Z) \mid Z], \tag{6}$$

as $n \to \infty$. Further, by Lemma 2 in the Appendix, the probability that the
return from borrowers is less than the lenders' fixed payment converges to
the probability that the expected return $E[R(X_1) \mid Z]$ is less than the face
value of the lenders' debt:

$$P\left(\left\{\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z) < R^*\right\}\right) \to P\left(\left\{E[R(Y_1 + Z) \mid Z] < R^*\right\}\right). \tag{7}$$

Now choose $z(R^*)$ such that $E[R(Y_1 + z(R^*))] = R^*$. Note that $z(R^*)$ is
independent of the distribution of Z. Furthermore,

$$E[R(Y_1 + z(R^*))] = E[R(Y_1 + Z) \mid Z = z(R^*)].$$ 15

Note that $z(\cdot)$ is a "cutoff value" in the distribution of Z which separates
solvency states from insolvency states in the limit. Since $R(\cdot)$ is monotonic,
the right-hand side of (7) equals $P(\{Z < z(R^*)\})$, which we shall show is
the bank's default probability in the limit.

Next, consider the bank's payoff to a lender when its portfolio is large. 
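From (6) and from continuity of $R^*(\cdot)$ it follows that

$$R^*\left(\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z)\right) \to R^*\left(E[R(Y_1 + Z) \mid Z]\right)$$

as $n \to \infty$. Lebesgue's dominated convergence theorem therefore implies

$$\lim_{n\to\infty} \int R^*\left(\frac{1}{n}\sum_{i=1}^{n} R(Y_i(\omega) + Z(\omega))\right) dP(\omega) = \int R^*\left(E[R(Y_1 + Z) \mid Z](\omega)\right) dP(\omega). \tag{8}$$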

14 For all random variables X, Y denote by $E[X \mid Y]$ the conditional expectation of X
with respect to Y (which is a random variable, measurable with respect to the information
contained in Y). In particular, let $\omega \in \Omega$ be an elementary event for which $Y(\omega) = y$.
Then $E[X \mid Y = y] = E[X \mid Y](\omega)$. If X and Y have only a countable number of different
values then this corresponds to the elementary definition of a conditional expectation.

15 This follows from our strong independence assumption. See footnote 5 and the proof 
of Lemma 1. 


Substituting the distribution of $\frac{1}{n}\sum_{i=1}^{n} R(Y_i(\omega) + Z(\omega))$ and the distribution
of Z for P in (8) yields

$$\lim_{n\to\infty} \int R^*(x)\, dF_n(R(\cdot))(x) = \int R^*\left(E[R(Y_1 + Z) \mid Z = z]\right) dH(z), \tag{9}$$

where H denotes the distribution of Z. For example, if there is no macroeconomic
risk (i.e., Z = 0), the right-hand side of (9) is given by $R^*(E[R(X_1)])$,
so when $R^* < E[R(X_1)]$ the expected return in the limit is given by $R^*$ and
the lenders receive the face value of the debt with certainty.

A two-sided contract which dominates the one-sided, direct investment
contract can now be constructed. Let ε > 0 be some arbitrary constant.
First, choose B* such that r < B* < E[R(X_1)]. Then (7) implies that the
bank's default probability is less than ε for large n, if P({Z < z(B*)}) < ε,
i.e., the probability that the realization is in the "tail" of the distribution of
Z is sufficiently small. Note that z(B*) < 0. 16 Thus, there exists a δ > 0
such that whenever var(Z) < δ we get P({Z < z(B*)}) < ε. 17 Thus, the
lender's expected costs of monitoring the bank are bounded above by εc* for
large n. For similar reasons, the lenders' payoff is bounded from below by
B*(1 - ε) for sufficiently large n.
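The role of the variance condition can be illustrated numerically. The following Python sketch is only an illustration under assumptions that are not in the paper: Y_1 is taken to be uniform on [0, 2], Z is taken to be normal with mean zero, the simple debt contract is R(x) = min(max(x, 0), 1.2), and the lenders' face value is B* = 0.80 (all hypothetical choices). It solves E[R(Y_1 + z)] = B* for the cutoff z(B*) by bisection and then evaluates the limiting default probability P({Z < z(B*)}) for several standard deviations of Z.

```python
import math

FACE_VALUE = 1.2   # hypothetical face value of each borrower's debt
B_STAR = 0.80      # hypothetical face value of the lenders' debt

def repayment(x, face_value=FACE_VALUE):
    """Simple debt contract: pay the output up to the face value, never below zero."""
    return min(max(x, 0.0), face_value)

def expected_repayment(z, steps=20000):
    """E[R(Y_1 + z)] for Y_1 uniform on [0, 2] (an assumption), by the midpoint rule."""
    width = 2.0 / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * width
        total += repayment(y + z)
    return total / steps

def cutoff(b_star, lo=-1.0, hi=1.0, tol=1e-10):
    """Solve E[R(Y_1 + z)] = b_star for z by bisection (the map is increasing in z)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_repayment(mid) < b_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def limit_default_probability(z_cut, sigma):
    """P(Z < z_cut) for Z normal with mean zero and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(z_cut / (sigma * math.sqrt(2.0))))

z_cut = cutoff(B_STAR)
probabilities = {s: limit_default_probability(z_cut, s) for s in (0.10, 0.05, 0.02)}
```

Under these assumptions the cutoff is negative and does not depend on the distribution of Z, so shrinking the standard deviation of Z pushes the limiting default probability toward zero, which is the mechanism used in the construction above.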

If ε is sufficiently small, (7) and (9) imply that constraint (4) is fulfilled,
but does not bind for the two-sided contract (R, B*). By continuity of (4)
with respect to B* (see Krasa and Villamil (1991, Lemma 1)), there exists a
face value R* < B* such that (4) binds for the two-sided contract (R, R*).
We next show that (5) is fulfilled under contract (R, R*), but does not bind.
Recall that the bank's default probability is less than ε. Thus, $\int_{S^*} c^*\, dF_n < \varepsilon c^*$
for the contract with face value B*. Since R* < B*, the bank's default
probability is lower with R*. Thus, by choosing ε sufficiently small and since
c* is bounded we can ensure that $\int_S c\, dF > \int_{S^*} c^*\, dF_n$ for all sufficiently large
n. 18 This and the fact that (4) binds implies


$$\int_0^{\infty} R^*(x)\, dF_n < (m-1)\left(r + \int_S c\, dF\right). \tag{10}$$

16 We normalize the mean of the macroeconomic shock Z to zero, so z(B*) < 0 denotes
a recession.

17 This is possible since z(B*) is independent of the distribution of Z.

18 The inequality indicates that the intermediary's expected cost of monitoring the bor- 
rowers is higher than the lenders' expected cost of monitoring the intermediary. 



$$\int_0^{\infty} [x - R^*(x)]\, dF_n - \int_S c\, dF > E[R(X_1)] - m\int_S c\, dF - (m-1)r > mr - (m-1)r = r.$$


The first inequality follows from (10) and from the fact that $\int x\, dF_n =
E[G_n] = E[R(X_1)]$, which is the expected value of an aggregate version
of equation (3). The second inequality follows because R must fulfill (2) by
assumption. Now increase R* slightly. Then by the continuity of the constraint
and by the construction of R* the lenders' payoff increases and thus
both constraints can be made slack. This proves Theorem 1 because there 
exists some surplus that can be redistributed to borrowers by lowering the 
face value of their debt R. 

Theorem 1 establishes optimality of delegated monitoring schemes if n is 
sufficiently large and if the variance of the macroeconomic shock is sufficiently 
small. However, it does not follow that it is optimal for the bank to be as large 
as possible. Indeed, Theorem 1 suggests that an optimal bank size may exist 
because as the bank increases its portfolio size there are gains from default 
risk reduction but losses from increased monitoring costs. In Theorem 2 we 
characterize these gains and losses more precisely. However, before doing so 
we first relate Theorem 1 to the previous literature on delegated monitoring. 
Specifically, we focus on the bank size and industry structure predictions 
implicit in previous models. 

Diamond (1984) and Williamson (1986) use a law of large numbers ar- 
gument to prove the optimality of delegated monitoring (i.e., financial inter- 
mediation) in an economy with bounded costs and no macroeconomic risk. 
Because the probability that a bank fails is zero in the limit in their mod- 
els, the lenders' expected costs of monitoring the bank are zero. Clearly, a 
bank that operates in such an environment can always reduce the expected 
monitoring costs borne by lenders by increasing its size. Thus, "big banks 
are always better," and the model predicts banks of large but indeterminate 
size. 19 This size prediction is implicit in the Diamond and Williamson mod- 
els, and stems from the fact that increasing returns to scale are inherent in 

19 Because the set of borrowers is infinite, it is possible to get multiple banks that are 
indeterminately large in this framework. However, the argument requires that the infinite 


the framework they consider. Specifically, in their models a bank can al- 
ways both decrease the riskiness of its portfolio and reduce monitoring costs 
by contracting with additional borrowers. An obvious question, therefore, 
is: Does delegated monitoring in an economy with non-diversifiable portfo- 
lio risk also give rise to increasing returns to scale in intermediation, and 
hence indefinitely large banks? The answer to this question depends on the 
specification of monitoring costs. 

Consider first a "best case" situation where lenders face a fixed cost of
monitoring a bank in default states. Specifically, let c* = k, where k is a 
positive constant which is independent of the size of the bank. In this case, 
(7) from Theorem 1 implies that the bank's default probability converges
to $P(\{Z < z\})$. This follows from the fact that a bank's default probability
decreases (in general) as it contracts with more borrowers because
idiosyncratic risk is diversified away. The non-diversifiable macroeconomic
risk obviously remains. 20 To make this argument more precise, let $p_{\hat n}$ denote
the bank's default probability when it has a portfolio of size $\hat n$, where
$\hat n < \infty$. The lenders' expected costs of monitoring the bank under this cost
structure are $p_{\hat n} c^*_{\hat n} = p_{\hat n} k$. Since the bank's default probability when its portfolio
size is $\hat n$ is at least as great as its default probability in the limit (i.e.,
$p_{\hat n} \ge \lim_{n\to\infty} p_n$), the lenders' expected costs of monitoring the intermediary
are lower for larger banks. It follows from this observation that the delegated 
monitoring model with non-diversifiable portfolio risk and constant monitor- 
ing costs will generally also display increasing returns to scale in intermedi- 
ation (because increasing the bank's portfolio size does not raise monitoring 
costs but it may lower the bank's default probability). Consequently, like 
Diamond and Williamson, the optimal bank size under constant monitoring 
costs is indeterminately large. 

Now consider a polar opposite "worst case" situation, where the lenders' 

set of borrowers be partitioned into an infinite number of subsets where each infinite 
subset of borrowers contracts with a particular delegated monitor. This argument does 
not appear to be a plausible explanation of the observed co-existence of multiple banks. 
Of course, the models were not designed to explain this observation. 

20 Convergence in equation (7) need not be monotonic. Thus, there may exist points 
of non-monotonic convergence where even under constant monitoring costs the optimal 
bank size is finite if the macroeconomic risk is non-trivial. We therefore assume without 
loss of generality that the bank's probability of default is always bounded from below by 
P({Z < z}), which is the default probability of a bank of infinite size with macroeconomic
risk but no idiosyncratic risk. 


monitoring costs are unbounded. Krasa and Villamil (1991, Theorem 1) analyze
this problem when there is no macroeconomic risk. They show that
even if costs are unbounded but do not increase at an exponential rate, dele- 
gated monitoring with two-sided simple debt contracts still dominates direct 
investment. However, the non-trivial macroeconomic risk which we consider 
in this paper complicates this problem considerably. Specifically, in the limit 
the lenders' expected monitoring costs are given by $\lim_{n\to\infty} p_n c^*_n$. Clearly, if
c* is unbounded this product converges to infinity when the bank's portfolio
is subject to non-diversifiable macroeconomic risk (because $\lim_{n\to\infty} p_n > 0$).
Thus, constraint (4) from Problem 3.2 is violated for sufficiently large n, and 
this implies that delegated monitoring is not feasible with unbounded costs, 
non-diversifiable macroeconomic risk, and a sufficiently large portfolio size. 
This argument suggests that an optimal portfolio (or bank) size may exist 
because the feasibility of delegated monitoring depends on n. 

Consider now the problem of whether or not a bank of a given size (n) 
should contract with additional borrowers, thus increasing its scale. Suppose 
that monitoring costs are bounded but not constant. Theorem 1 establishes 
that delegated monitoring is optimal if the variance of the macroeconomic 
risk is sufficiently small, but this does not imply that the bank should be as 
large as possible. When there is non-trivial macroeconomic risk and the bank 
has a portfolio of size n, the bank must consider two factors when deciding 
whether or not to increase its scale. 

(i) Adding additional projects to its portfolio decreases the bank's default
risk. This decrease is given by the difference between the bank's probability
of default in the limit (i.e., $\lim_{n\to\infty} p_n$) and its probability of default
at portfolio size $\hat n$ (i.e., $p_{\hat n}$); but for $\hat n$ sufficiently large, the gains from
reducing default risk by adding additional projects are essentially zero.
(ii) Adding additional projects to the bank's portfolio raises the lenders' costs
of monitoring the bank.
Thus, the crucial question is: For what cost structures do the gains from 
reduced default risk dominate the losses from increased monitoring costs 
when additional projects are added? 


5 Optimal Intermediary Size 

We now obtain the main results of the paper: analytic predictions for both 
the optimal size of a financial intermediary and for the size distribution of 
banks in a Pareto efficient industry. In order to analyze the problem in
detail (and answer the question posed above), we must provide a precise 
quantitative characterization of the gains from diversification as the size of 
the bank increases. To this end we construct a rate function which measures 
the speed at which the bank's idiosyncratic portfolio risk is eliminated when 
the size of its portfolio is increased. The Theory of Large Deviations (cf., 
Varadhan (1984)) provides the formal structure. 

Consider first the bank's portfolio diversification problem for the case of 
a fixed realization z of Z. The argument will be generalized to permit any 
z by integrating over the distribution of all possible realizations of z. Then 
$X_i^z = Y_i + z$ are independent random variables (given that z is fixed) since the
$Y_i$ are independent. Since the random variables $X_i^z$ are independent, the law
of large numbers holds. The large deviation principle gives a rate function
(cf., Varadhan (1984), Theorem 3.1) which provides a measure of the speed
of convergence in the law of large numbers. The rate function implies that
for every $a < E[R(X_i^z)]$, the probability that a realization is in the tail of
the distribution converges exponentially to zero. Formally,

$$P\left(\left\{\frac{1}{n}\sum_{i=1}^{n} X_i^z < a\right\}\right) \le e^{-I_z(a)\, n}, \tag{11}$$

where $I_z(a) > 0$ is the rate function which gives the speed of convergence of
the distribution (i.e., a measure of how rapidly contracting with additional 
borrowers reduces the bank's default risk). 

The rate function $I_z(\cdot)$ is derived from the moment-generating function
of a random variable. Let $M_z(\theta)$ denote the moment-generating function of
the distribution of $R(X_i^z)$, where $\mu_z$ denotes the distribution. Then

$$M_z(\theta) = \int e^{\theta x}\, d\mu_z(x).$$ 21

The rate function is found by solving the following maximization problem:

$$I_z(a) = \max_{\theta \in \mathbb{R}} \left[\theta a - \log M_z(\theta)\right]. \tag{12}$$

21 $M_z(\theta)$ is called the moment-generating function since the kth derivative of $M_z(\theta)$
evaluated at $\theta = 0$ gives exactly the kth moment of $\mu_z$.


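We now show that $I_z(a) > 0$ for every $a < E(X_i^z)$. For fixed a let $f(a,\theta) =
\theta a - \log M_z(\theta)$. Since $f(a,0) = 0$ (because $M_z(0) = 1$), it is sufficient to show
that $\frac{\partial}{\partial\theta} f(a,0) \ne 0$, which can be easily verified:

$$\frac{\partial}{\partial\theta} f(a,\theta) = a - \frac{\int x e^{\theta x}\, d\mu_z(x)}{\int e^{\theta x}\, d\mu_z(x)}.$$

Consequently, $\frac{\partial}{\partial\theta} f(a,0) = a - E X_i^z \ne 0$. 22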
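The maximization in (12) can be carried out numerically in a simple case. The Python sketch below is an illustration only, under an assumption that is not in the paper: each return equals 1 with probability 0.7 and 0 otherwise (a hypothetical two-point distribution), so that the moment-generating function is M(θ) = 0.3 + 0.7 e^θ. It computes the rate function by a ternary search over θ (valid because the objective is concave in θ) and compares the resulting exponential bound with the exact binomial tail probability.

```python
import math

def log_mgf(theta, p):
    """log M(theta) for a return equal to 1 with probability p and 0 otherwise."""
    return math.log(1.0 - p + p * math.exp(theta))

def rate_function(a, p, lo=-50.0, hi=50.0, iterations=200):
    """I(a) = max over theta of [theta * a - log M(theta)], by ternary search.

    The objective is concave in theta, so ternary search locates its maximum.
    """
    def objective(theta):
        return theta * a - log_mgf(theta, p)
    for _ in range(iterations):
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        if objective(left) < objective(right):
            lo = left
        else:
            hi = right
    return objective(0.5 * (lo + hi))

def exact_tail(n, a, p):
    """P(average of n independent returns < a), from the binomial distribution."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(n + 1) if k / n < a)

P_SUCCESS = 0.7   # hypothetical probability of the high return
THRESHOLD = 0.5   # hypothetical threshold a, below the mean 0.7
rate = rate_function(THRESHOLD, P_SUCCESS)
# Each entry: (portfolio size, exact tail probability, exponential bound).
bounds = [(n, exact_tail(n, THRESHOLD, P_SUCCESS), math.exp(-rate * n))
          for n in (10, 50, 200)]
```

In this example the rate is strictly positive because the threshold lies below the mean, and the exact tail probability stays below the exponential bound at every portfolio size, as in (11).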

We have shown that, for fixed z such that $z > z(R^*)$ (i.e., if the macroeconomic
shock is not too severe), the probability $P(\{\frac{1}{n}\sum_{i=1}^{n} R(Y_i + z) < R^*\})$
converges exponentially to zero, because in this case $R^* < E[R(Y_i + z)]$.
We now make the convergence argument independent of z. The primary technical
problem is that the behaviour of the rate function must be analyzed as z comes
close to $z(R^*)$. Recall that for $z = z(R^*)$, the face value of the lenders'
contract, $R^*$, is exactly the expected value of $R(Y_i + z)$. Clearly, the probability
of observing realizations less than or equal to the expected value of independent
random variables does not converge to zero. However, in Lemma 3 we show that for
$z > z(R^*)$, and z sufficiently close to $\bar z$, the rate function $I_z(R^*)$ is bounded
from below by $k(z - \bar z)^2$, where $k > 0$. In particular, from (A.15) and (A.16)
in the Appendix it follows that we can choose for k any number smaller than
$\frac{1}{2\,\mathrm{var}\, R(Y_i + z)}$. 23 Thus, the smaller the variance of the idiosyncratic risk the
faster the convergence. Thus, the rate function converges to zero at the speed
$(z - \bar z)^2$ for $z \to \bar z$. Integrating over z, the probability of default by a bank
of size n, given that $Z > \bar z$, is:



$$\int_{z > \bar z} e^{-I_z(R^*)\, n}\, dH(z). \tag{13}$$

An important technical result, which is essential for proving that an op- 
timal bank size exists (Theorem 2), can now be stated. 

Proposition 1. There exist constants $k_i > 0$, $i = 1, 2$, such that the probability
of default by a bank of size n, given that $Z > \bar z$, is bounded from above
by $\frac{k_1}{\sqrt{n}} + e^{-k_2 n}$.

22 This follows from the fact that $\int x e^{\theta x}\, d\mu_z(x)$ evaluated at $\theta = 0$ is the expected value
of $X_i^z$, since $\mu_z$ is the distribution of $X_i^z$. Furthermore $\int e^{\theta x}\, d\mu_z(x)$ evaluated at $\theta = 0$ is
one.

23 Apply Lemma 3 as in Proposition 1 and choose $X_a = R(Y_i + z)$.


The proof of Proposition 1, which provides a speed of portfolio convergence
result, is in the Appendix. Note that in the bound in Proposition 1,
the term $e^{-n k_2}$ converges to zero much faster than $k_1/\sqrt{n}$. By the law of large
numbers, a bank of infinite size will never default if $Z > \bar z$. Thus, the bound
implies that a bank of size n can lower its default probability by only approximately
$k_1/\sqrt{n}$ if it becomes infinitely large (given that $Z > \bar z$) because the
second term is approximately zero for large n. Consequently, $k_1/\sqrt{n}$ is our desired
measure of the gain from default risk reduction arising from additional

The main result of the paper can now be stated. 

Theorem 2. Let $c^*_n$ denote the cost of monitoring a bank of size n. Let
$c^*_\infty = \lim_{n\to\infty} c^*_n$, 24 and assume that $c^*_\infty - c^*_n$ converges to zero at a slower
rate than $1/\sqrt{n}$. Then it is never optimal for a bank to become infinitely large.
Thus, there exists an optimal size for the bank. 

Proof. Assume by way of contradiction that it is optimal for the bank to 
become infinitely large. By Proposition 1, the bank's probability of default 
is bounded above by 

$$p_n = \frac{k_1}{\sqrt{n}} + e^{-k_2 n} + P(\{Z < \bar z\}). \tag{14}$$ 25

By Lemma 2, the bank's default probability in the limit is given by $P(\{Z < \bar z\})$.
Let n be arbitrary. Now compare the expected monitoring costs for a
bank of size n with those of a bank of infinite size. By (14), the lenders'
expected costs of monitoring a bank of size n (i.e., $p_n c^*_n$) are at least

$$\left[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\right] c^*_n + P(\{Z < \bar z\})\, c^*_n. \tag{15}$$

The expected costs of monitoring a bank of infinite size are at most

$$P(\{Z < \bar z\})\, c^*_\infty. \tag{16}$$

24 If c* is unbounded then $c^*_\infty = \infty$, and clearly Theorem 2 holds.

25 The first two terms are the bound given by Proposition 1 for all states where $Z > \bar z$.
Assuming that the probability of default in all states $Z < \bar z$ is one (which is clearly an
upper bound for these states), we get the third term.


If it were optimal for the bank to become infinitely large, then at least for 
large n the expected costs of monitoring the bank per lender must decrease 
if the size of the bank is increased from n to infinity. By (15) and (16) the 
expected monitoring costs will decrease if 26 

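$$\left[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\right] c^*_n > P(\{Z < \bar z\})\left(c^*_\infty - c^*_n\right). \tag{17}$$

By the assumption of the Theorem, $c^*_\infty - c^*_n$ converges to zero at a slower rate
than $1/\sqrt{n}$. Thus, for every $M > 0$ there must exist an $\hat n$ such that $c^*_\infty - c^*_n > M/\sqrt{n}$
for all $n > \hat n$. This, and equation (17) yields

$$\left[\frac{k_1}{\sqrt{n}} + e^{-k_2 n}\right] c^*_n > P(\{Z < \bar z\})\, \frac{M}{\sqrt{n}}. \tag{18}$$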

Since M can be chosen arbitrarily, there exist values such that inequality 
(18) is violated. 27 The bank cannot be infinitely large, and the Theorem is 
thus proved. 
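The trade-off behind the proof can be made concrete with a small numerical experiment. The Python sketch below is an illustration only and rests on assumptions that are not in the paper: the bound (14) is treated as if it were the exact default probability, the constants k_1 = 0.3, k_2 = 0.1 and the limiting default probability 0.2 are hypothetical, and the monitoring cost is assumed to be c_n = 1 - 0.5 n^(-1/4), so that the gap to its limit shrinks more slowly than 1/sqrt(n). It then searches over portfolio sizes for the one with the lowest expected monitoring cost.

```python
import math

K1, K2 = 0.3, 0.1    # hypothetical constants in the bound of Proposition 1
P_LIMIT = 0.2        # hypothetical limiting default probability
COST_LIMIT = 1.0     # hypothetical limit of the monitoring cost as n grows
GAP_SCALE = 0.5      # hypothetical scale of the gap between c_n and its limit

def default_bound(n):
    """Bound (14) on the default probability of a bank of size n."""
    return K1 / math.sqrt(n) + math.exp(-K2 * n) + P_LIMIT

def monitoring_cost(n):
    """Assumed cost c_n: the gap to the limit shrinks like n**(-1/4)."""
    return COST_LIMIT - GAP_SCALE * n ** -0.25

def expected_cost(n):
    """Expected monitoring cost, using the bound (14) as the default probability."""
    return default_bound(n) * monitoring_cost(n)

# Expected monitoring cost of an infinitely large bank, as in (16).
LIMIT_EXPECTED_COST = P_LIMIT * COST_LIMIT

sizes = range(1, 200001)
best_size = min(sizes, key=expected_cost)
```

With these assumed numbers the expected cost first falls as idiosyncratic risk is diversified away, reaches a minimum at a finite portfolio size, and then rises toward its limit from below, so an infinitely large bank is not cost minimizing. With a constant cost (GAP_SCALE set to zero) the expected cost would instead fall monotonically in n, which corresponds to the indeterminate-size case discussed in Section 4.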

Theorem 2 establishes that when the lenders' monitoring costs depend on 
the bank's portfolio size, an optimal determinate bank size can be computed, 
under the assumption that the rate of increase of the lenders' monitoring costs 
converges to zero at a sufficiently slow rate. This assumption is fairly weak. 
In particular, it is fulfilled if the rate of increase ofc* is of the order -y-r. 28 
Most economically plausible cost structures will satisfy this assumption. An 
alternative way to interpret Theorem 2 is that the case of constant lender 
monitoring costs (on which others have relied) is rather special. In particular, 
we argued in Section 4 that under constant monitoring costs a bank's size is 
indeterminate (because of increasing returns to scale in delegated monitor- 
ing). However, with a slightly changed (i.e., size dependent) cost structure 
an optimal bank size exists. Thus, the indeterminacy (and hence increasing 
returns to scale for all n) result does not seem to be very robust. Theorem 2 

26 This follows from (16) - (15) < 0.

27 Multiply both sides of (18) by $\sqrt{n}$ to obtain $[k_1 + \sqrt{n}\, e^{-k_2 n}]\, c^*_n > P(\{Z < \bar z\}) M$. Let
$n \to \infty$. Then $k_1 c^*_\infty > P(\{Z < \bar z\}) M$, which cannot hold for every M. Thus (18) must
be violated for all sufficiently large n.

Note that c^ — c* is approximately -j=. Differentiating with respect to n (ignoring 
that n is an integer) of course yields — fr- 


also shows that even when the bank's portfolio is subject to non-diversifiable 
macroeconomic risk (as long as the variance of the risk is not too large so 
intermediation remains optimal), a bank of size n can only improve upon its
default probability by at most an amount of order $1/\sqrt{n}$. Increasing bank size after some critical
$\hat n$ is not optimal because it leads to increased monitoring costs, but there
are very limited gains from default risk reduction. Thus, even very "small" 
banks (given Z) may improve upon direct investment when intermediation 
is optimal. 

6 Testable Implications of the Theory 

Theorem 2 has the following testable implications. First, bank size is deter- 
minate and inversely related to the bank's exposure to macroeconomic risk. 
In a large economy like the U.S. where aggregate macroeconomic shocks have 
different effects on different regions of the country, our model predicts that 
banks of different sizes will coexist across locales. In particular, if different re- 
gions of the country have different effective macroeconomic shocks (i.e., Z's), 
the model predicts that across regions both large "money centre banks" that 
are very well diversified and smaller "local banks" that are less well diversi- 
fied will co-exist. "Money centre banks" may have been able to lower their 
exposure to macroeconomic risk, by evading portfolio restrictions via hold- 
ing companies or because they operate in regions of the country with better 
diversified economic bases. Our model predicts that these better diversified 
(i.e., low Z) banks will be larger than "local banks" (with higher Z's). This 
follows from (18) in the proof of Theorem 2 because P({Z < z}) is lower for 
a better diversified bank, since the bank's exposure to macroeconomic shocks 
is effectively less severe. Therefore, the bank size (n) that violates (18) is 
necessarily higher. Williamson (1989) provides an interesting discussion of 
stylized facts regarding the structure of U.S. versus Canadian banks. Historically,
Canadian banks have had fewer portfolio restrictions (e.g., branching restrictions)
and hence lower Z's than U.S. banks. As our theory predicts, there have been
fewer banks in Canada of larger size (adjusted for population differences). 

The second testable implication of our theory pertains to industry struc- 
ture, i.e., limits on the number and size distribution of firms that are present 
in equilibrium. Theorem 1 establishes that intermediation (banking) improves
social welfare. In other words, there is some level of intermediation


services in an economy that is socially optimal (given preferences, costs, prob- 
ability distributions, and alternative opportunities). In contrast, Theorem 2 
proves that there is a specific bank size that is optimal. Although the notions 
of bank size and industry structure are closely related, they need not be iden- 
tical. For example, suppose the socially optimal "industry" (i.e., economy) 
level of welfare improving intermediation services is twenty units of input 
capital and the optimal size for all banks is to provide two units of capital. 
Clearly, the optimal industry structure is then ten banks. What causes some 
banks to be of similar size in our model? The result that within a region (or 
among banks with similar effective portfolio restrictions) banks with similar 
"local" idiosyncratic risk characteristics will have a common size (given their 
Z) follows immediately from equation (18), because $k_1$, M and $P(\{Z < \bar z\})$
will be similar for such banks. Why might many moderately sized local banks 
coexist in the same region? This again follows from Theorem 2 because the 
Theorem establishes that after some critical size — increasing a bank's scale 
of operation further is not profit maximizing. Finally, why might we expect 
to observe many small "local banks" and fewer large "money centre banks?" 
The high frequency of smaller local banks (relative to large money centre 
banks) stems from the fact that banks with higher Z's cannot achieve suffi- 
cient portfolio diversification in their locale to justify the additional monitor- 
ing costs associated with increasing their scale of operation. Multiple banks 
operating at the efficient scale within a particular locale provide welfare im- 
proving intermediation services optimally (given the risk and cost structure 
in the economy). Differences in bank size and distribution across locales 
stem entirely from different effective Z's in (18) (i.e., different exposure to 
macroeconomic risk in local and money centre banks). 

7 Concluding Remarks 

In this paper we develop a theory of bank size distribution with testable 
implications. We regard the theoretical model to be useful for two reasons. 
First, there is a longstanding debate in the banking literature about the mar- 
ket structure of the financial industry. Some have argued that the banking 
industry is inherently non-competitive and hence must be regulated to pro- 
tect consumers from the evils of banks' market power. In contrast, others 
have argued that the banking system is fragile and hence must be protected 


from the destabilizing forces of competition. Because we derive optimal fi- 
nancial contracts from Pareto problems, the allocations that we obtain are 
necessarily Pareto efficient. An interesting problem for future research is to 
compare the structure of a banking industry where firms have market power 
with the Pareto efficient structure implied by Theorem 2 (given parametric 
specifications for preferences, costs, probability distributions, and alternative
opportunities).

Second, the model may prove useful in understanding changes in the Eu- 
ropean banking system that will undoubtedly result from the EMU. In the 
previous Section we discussed bank size and industry structure predictions 
for the U.S. economy. For example, our model predicts that largely agricul- 
tural sections of the U.S. like the Mid-west will have many moderate or small 
sized banks because this region is subject to macroeconomic shocks that are
difficult to diversify (especially given portfolio restrictions imposed by bank 
regulators). However, the model predicts money centre banks that operate 
in more economically diversified regions of the country (and that have been 
better able to evade portfolio restrictions) will be larger. In 1999 when the 
EMU begins, twelve countries will form a "federal Europe." What will be
the nature of the EMU banking regulations imposed by the central bank
or by the governments of the individual countries? Are the banks of some 
countries (specifically, those with better diversified economies) destined to 
become "money centre" banks while the banks of other (less well diversified) 
economies are destined to become small "local" banks? This question is 
important because, although our model permits existence of both multiple
banks of the same size and, perhaps more interestingly, the co-existence of
banks of different sizes, larger (better diversified) banks have a lower default
probability than smaller banks. Are members of the EMU with less
well diversified economies, that might wish to encourage their banks to lend 
domestically for development reasons, destined to become problematic mem- 
bers of the Union from the outset? 

8 Appendix 

We first derive a "law of large numbers" for random variables with correla- 
tion. The argument works essentially as follows: For any fixed realization z 
of Z we can apply the law of large numbers to the random variables $Y_i + z$
and show that $\frac{1}{n}\sum_{i=1}^{n} Y_i + z$ converges to z except for a set $N_z$ which has
measure zero with respect to the probability $P_1$ (recall that P is the product
of the probabilities $P_1$ and $P_2$). We then use Fubini's Theorem (i.e., integrate
with respect to $P_2$) to show that $\bigcup_{z \in \mathbb{R}} N_z$ has measure zero with respect to
the original probability P. We now state Lemma 1.

Lemma 1. Let $Y_i$, $i \in \mathbb{N}$, and Z be independent random variables. Assume
that the $Y_i$ are identically distributed. Let $X_i = Y_i + Z$ for every i, and let $R(\cdot)$
be a simple debt contract. Then $\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i)(\omega) = E[R(X_1) \mid Z](\omega)$
for almost every $\omega \in \Omega$.

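Proof. Let z be a realization of Z. Let $E[R(X_1) \mid Z]$ be the conditional
expectation of $R(X_1)$ with respect to Z. Let $\omega_2 \in \Omega_2$. Define the probability
space $(\Omega_{\omega_2}, \mathcal{A}_{\omega_2}, P_{\omega_2})$ as follows. Recall that $\Omega = \Omega_1 \times \Omega_2$. Then let
$\Omega_{\omega_2} = \Omega_1 \times \{\omega_2\}$. Let $\mathcal{A}_{\omega_2}$ be the set of all events $A \times \{\omega_2\}$, where A is an event in $\Omega_1$.
Finally, let $P_{\omega_2}(A \times \{\omega_2\}) = P_1(A)$. Let $z = Z(\omega_2)$. Then $R(X_i) = R(Y_i + z)$
are independent random variables on $(\Omega_{\omega_2}, \mathcal{A}_{\omega_2}, P_{\omega_2})$. We can therefore
apply the law of large numbers to the $R(X_i)$, $i \in \mathbb{N}$. Thus,

$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i(\omega)) = \int_{\Omega_{\omega_2}} R(Y_1(\omega) + z)\, dP_{\omega_2}(\omega) \tag{A.1}$$

for all $\omega \in \Omega_{\omega_2}$ except for a set $N_{\omega_2} \in \mathcal{A}_{\omega_2}$ with $P_{\omega_2}(N_{\omega_2}) = 0$. Further,

$$\int_{\Omega_{\omega_2}} R(X_1(\omega))\, dP_{\omega_2}(\omega) = E[R(X_1) \mid Z](\omega), \tag{A.2}$$

for almost every $\omega \in \Omega_{\omega_2}$, because $Y_i$ and Z are independent. 29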

29 Note that $\omega \mapsto \int_{\Omega_{\omega_2}} R(X_1(\omega))\, dP_{\omega_2}(\omega)$ is measurable with respect to Z. Thus, to
show that we have a version of the conditional expectation it is sufficient to prove that

$$\int_A \int_{\Omega_{\omega_2}} R(X_1(\omega))\, dP_{\omega_2}(\omega)\, dP = \int_A R(X_1(\omega))\, dP(\omega),$$

for every event A which is measurable with respect to Z (i.e., which is of the form $Z^{-1}(B)$
where B is a measurable subset of $\mathbb{R}$). However, since Z is constant on $\Omega_1$, all such sets
are of the form $\Omega_1 \times C$, where C is a measurable subset of $\Omega_2$. The result then follows
from Fubini's Theorem since integrating over $\Omega_1$ and then over $\Omega_{\omega_2}$ (which is essentially
$\Omega_1$) is the same as integrating over $\Omega_1$ once.


Let H be the distribution of Z. It remains to show that $N = \bigcup_{\omega_2 \in \Omega_2} N_{\omega_2}$
has measure zero. This follows from Fubini's Theorem. The set D of all $\omega$
where $\frac{1}{n}\sum_{i=1}^{n} R(X_i)$ does not converge is given by

$$D = \left\{\omega : \liminf_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i) \ne f \ \text{or}\ \limsup_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} R(X_i) \ne f\right\},$$

where $f = E[R(X_1) \mid Z]$. Since the limsup and the liminf of a sequence of
measurable functions are measurable, it follows that D is measurable. Further,
$D \cap \Omega_{\omega_2} = N_{\omega_2}$ since $N_{\omega_2}$ is exactly the set of all $\omega \in \Omega_{\omega_2}$ where the sequence
does not converge. Thus D = N and Fubini's Theorem implies

$$P(N) = \int_{\Omega_2} P_1(N_{\omega_2})\, dP_2(\omega_2) = 0.$$

This concludes the proof of the Lemma. 

We next prove a technical result necessary for the proof of Theorem 1 
(i.e., a convergence result for the bank's probability of default). 

Lemma 2. Let $X_i$, $Y_i$ and Z be as in Lemma 1. Assume that the distribution
of Z is non-atomic, i.e., $P(\{Z = z\}) = 0$ for every $z \in \mathbb{R}$. Let $R^*$ be the
face value of the lenders' simple debt contract. Then

$$P\left(\left\{\frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z) < R^*\right\}\right) \to P\left(\left\{E[R(Y_1 + Z) \mid Z] < R^*\right\}\right).$$

Proof. Let $f_n = \frac{1}{n}\sum_{i=1}^{n} R(Y_i + Z)$, and let $f = E[R(Y_1 + Z) \mid Z]$. By
(6) we get $f_n(\omega) \to f(\omega)$ for almost every $\omega$. Thus, $f_n$ converges to f in
distribution 30 by Proposition 24.12 of Parthasarathy (1977). Let h be an
indicator function of the interval $(-\infty, R^*)$, i.e., $h(x) = 1$ for every $x \in
(-\infty, R^*)$ and $h(x) = 0$, otherwise. Note that h is only discontinuous at $R^*$.
Since the distribution of Z is continuous, the point $\{R^*\}$ has probability zero.
Corollary 1 of Billingsley (1968, p. 31) therefore implies that $h(f_n)$ converges

30 Convergence in distribution means the following: Let $F_n$ denote the distribution function
of $f_n$ for every $n \in \mathbb{N}$, and let F be the distribution function of f. Then $f_n$ converges
to f in distribution, if $\lim_{n\to\infty} \int g(x)\, dF_n(x) = \int g(x)\, dF(x)$ for every bounded and continuous
function g on $\mathbb{R}$.


in distribution to $h(f)$. Let $H(F_n)$ denote the distribution of $h(f_n)$, let $H(F)$
denote the distribution of $h(f)$, let $F_n$ denote the distribution of $f_n$, and let
F be the distribution of f. Then

$$\lim_{n\to\infty} \int h(x)\, dF_n(x) = \lim_{n\to\infty} \int x\, dH(F_n)(x) = \lim_{n\to\infty} \int x\, dH(F)(x) = \lim_{n\to\infty} \int h(x)\, dF(x),$$

where the second equality follows from the fact that $h(f_n)$ converges to $h(f)$
in distribution. This proves Lemma 2.

We next prove a technical result on the convergence of the rate function.

Lemma 3. Let $X_a$, $a \ge \bar a > 0$, be a collection of random variables such
that $a$ is the expected value of $X_a$ for every $a \ge \bar a$. Assume that the
moment generating function $M_a(\theta)$ is thrice continuously differentiable in
a neighborhood $[\bar a, \hat a)$ of $\bar a$. Let $\mu_a$ be the distribution of
$X_a$. Assume that the support of $\mu_a$ is contained in the compact interval
$[T_1, T_2]$ for every $a \ge \bar a$. Then there exists a constant $k > 0$ such
that $I_a(\bar a) \ge k(a - \bar a)^2$ for all $a$ which are sufficiently close
to $\bar a$.

Proof. Define

$$f(a,\theta) = \theta\bar a - \log M_a(\theta), \tag{A.3}$$

where $M_a(\theta)$ is the moment generating function
$M_a(\theta) = \int e^{\theta x}\,d\mu_a(x)$. Note
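that $f(a,0) = 0$. Furthermore,

$$\frac{\partial}{\partial\theta} f(a,\theta) = \bar a - \frac{\frac{\partial}{\partial\theta} M_a(\theta)}{M_a(\theta)}. \tag{A.4}$$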
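Let $F_a$ denote the distribution function of $\mu_a$ (i.e.,
$F_a(t) = \mu_a((-\infty, t])$). Partial integration of $M_a(\theta)$ yields

$$M_a(\theta) = \int_{T_1}^{T_2} e^{\theta x}\,d\mu_a(x) = e^{\theta x}F_a(x)\Big|_{T_1}^{T_2} - \int_{T_1}^{T_2}\theta e^{\theta x}F_a(x)\,dx = e^{\theta T_2} - \theta\int_{T_1}^{T_2} e^{\theta x}F_a(x)\,dx. \tag{A.5}$$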

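Furthermore, by partial integration we also get

$$\int_{T_1}^{T_2} F_a(x)\,dx = xF_a(x)\Big|_{T_1}^{T_2} - \int_{T_1}^{T_2} x\,dF_a(x) = T_2 - a, \tag{A.6}$$

$$\int_{T_1}^{T_2} xF_a(x)\,dx = \frac{1}{2}x^2F_a(x)\Big|_{T_1}^{T_2} - \frac{1}{2}\int_{T_1}^{T_2} x^2\,dF_a(x) = \frac{1}{2}\left(T_2^2 - E(X_a^2)\right). \tag{A.7}$$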
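The partial-integration identities (A.5) to (A.7) can be checked numerically for a concrete distribution. The sketch below assumes, purely for illustration, that $X$ is Uniform(0.2, 0.8) and that $[T_1, T_2] = [0, 1]$; neither choice comes from the paper, and the helper names (`cdf`, `integrate`, `mgf_direct`, `mgf_by_parts`) are hypothetical.

```python
import math

T1, T2 = 0.0, 1.0   # compact interval containing the support (assumed)
LO, HI = 0.2, 0.8   # hypothetical example: X ~ Uniform(LO, HI)

def cdf(x):
    """Distribution function F of Uniform(LO, HI)."""
    return min(1.0, max(0.0, (x - LO) / (HI - LO)))

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

mean = (LO + HI) / 2                                  # E(X)
second_moment = (HI**3 - LO**3) / (3 * (HI - LO))     # E(X^2)

def mgf_direct(theta):
    """E[exp(theta * X)] computed from the uniform density (theta != 0)."""
    return (math.exp(theta * HI) - math.exp(theta * LO)) / (theta * (HI - LO))

def mgf_by_parts(theta):
    """Right-hand side of (A.5): exp(theta*T2) - theta * int exp(theta*x) F(x) dx."""
    return math.exp(theta * T2) - theta * integrate(
        lambda x: math.exp(theta * x) * cdf(x), T1, T2)

a6_lhs = integrate(cdf, T1, T2)                    # left side of (A.6)
a7_lhs = integrate(lambda x: x * cdf(x), T1, T2)   # left side of (A.7)

if __name__ == "__main__":
    print(a6_lhs, T2 - mean)
    print(a7_lhs, 0.5 * (T2**2 - second_moment))
    print(mgf_direct(-1.5), mgf_by_parts(-1.5))
```

In this example the right-hand sides are $T_2 - a = 0.5$ for (A.6) and $\frac{1}{2}(T_2^2 - E(X^2)) = 0.36$ for (A.7).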

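Let $\theta^*(a)$ denote the value of $\theta$ which maximizes (A.3) for fixed
$a$. Thus, from (A.4) and the Implicit Function Theorem we get

$$\frac{d}{da}\theta^*(a) = -\frac{\bar a\,\frac{\partial}{\partial a}M_a(\theta) - \frac{\partial^2}{\partial\theta\,\partial a}M_a(\theta)}{\bar a\,\frac{\partial}{\partial\theta}M_a(\theta) - \frac{\partial^2}{\partial\theta^2}M_a(\theta)}. \tag{A.8}$$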

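We now evaluate $\frac{d}{da}\theta^*(a)$ at $a = \bar a$. Thus, since
$\theta^*(\bar a) = 0$ by (A.4),$^{31}$ we must evaluate the right-hand side of
(A.8) at $\theta = 0$ and $a = \bar a$. Taking the partial derivatives of
$M_a(\theta)$ in (A.5), evaluating them at $\theta = 0$ and $a = \bar a$ and
using (A.6) proves that $\frac{\partial}{\partial a}M_{\bar a}(0) = 0$.
Furthermore,

$$\frac{\partial^2}{\partial\theta\,\partial a}M_{\bar a}(0) = -\frac{\partial}{\partial a}\int_{T_1}^{T_2} F_a(x)\,dx\,\Big|_{a=\bar a} = 1,$$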

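where the last equality follows from (A.6). Thus, the numerator of (A.8) is
$-1$. We next derive the denominator of (A.8). Note that (A.5) implies

$$\frac{\partial}{\partial\theta}M_{\bar a}(0) = T_2 - \int_{T_1}^{T_2} F_{\bar a}(x)\,dx = T_2 - (T_2 - \bar a) = \bar a,$$

where the second equality follows from (A.6); and

$$\frac{\partial^2}{\partial\theta^2}M_{\bar a}(0) = T_2^2 - 2\int_{T_1}^{T_2} xF_{\bar a}(x)\,dx = T_2^2 - \left(T_2^2 - E(X_{\bar a}^2)\right) = E(X_{\bar a}^2),$$

where the second equality follows from (A.7). Thus, (A.8) implies

$$\frac{d}{da}\theta^*(\bar a) = \frac{1}{\bar a^2 - E(X_{\bar a}^2)} = -\frac{1}{\operatorname{var}(X_{\bar a})}. \tag{A.9}$$

31 This follows since $\frac{\partial}{\partial\theta}M_{\bar a}(0) = E X_{\bar a} = \bar a$ and $M_{\bar a}(0) = 1$.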
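The slope $\frac{d}{da}\theta^*(\bar a) = -1/\operatorname{var}(X_{\bar a})$ can be checked by finite differences for a specific family. The sketch below assumes, for illustration only, that $X_a$ is Uniform$(a - \frac{1}{2}, a + \frac{1}{2})$ (so the variance is $1/12$) and that $\bar a = 1$; the names `tilted_mean_shift` and `theta_star` are hypothetical. The first-order condition behind (A.4) is solved by bisection.

```python
import math

A_BAR = 1.0        # hypothetical threshold mean \bar a
VAR = 1.0 / 12.0   # variance of Uniform(a - 1/2, a + 1/2)

def tilted_mean_shift(theta):
    """d/dtheta of log E[exp(theta * U)] for U ~ Uniform(-1/2, 1/2)."""
    if abs(theta) < 1e-3:
        return theta / 12.0 - theta**3 / 720.0   # series expansion near zero
    return 0.5 / math.tanh(theta / 2.0) - 1.0 / theta

def theta_star(a):
    """Solve a_bar = M_a'(theta) / M_a(theta) for theta by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # M_a'(theta) / M_a(theta) = a + tilted_mean_shift(theta),
        # which is increasing in theta.
        if a + tilted_mean_shift(mid) > A_BAR:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

h = 1e-4
# One-sided difference, since the family is indexed by a >= a_bar.
slope = (theta_star(A_BAR + h) - theta_star(A_BAR)) / h

if __name__ == "__main__":
    print(slope, -1.0 / VAR)
```

In this example the expected slope is $-12$, and $\theta^*(a)$ is negative for $a > \bar a$, consistent with the sign in (A.9).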


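By (A.5) and differentiation we get

$$\frac{d}{da}M_a(\theta^*(a)) = \left(T_2 e^{\theta^*(a)T_2} - \int_{T_1}^{T_2} e^{\theta^*(a)x}F_a(x)\,dx\right)\frac{d}{da}\theta^*(a) - \theta^*(a)\,\frac{d}{da}\int_{T_1}^{T_2} e^{\theta^*(a)x}F_a(x)\,dx. \tag{A.10}$$

Evaluating (A.10) at $a = \bar a$ and using (A.6) we get

$$\frac{d}{da}M_{\bar a}(\theta^*(\bar a)) = \bar a\,\frac{d}{da}\theta^*(\bar a). \tag{A.11}$$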

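Note that $I_a = I_a(\bar a)$ is the maximum of $f(a,\theta)$ over $\theta$ by
(12) in Section 5. (A.3), (A.10), (A.11), and the fact that
$M_{\bar a}(\theta^*(\bar a)) = M_{\bar a}(0) = 1$ immediately imply

$$\frac{d}{da}I_{\bar a} = \frac{d}{da}f(\bar a, \theta^*(\bar a)) = 0. \tag{A.12}$$

Next we show that $\frac{d^2}{da^2}I_{\bar a} > 0$. Using the fact that
$M_{\bar a}(\theta^*(\bar a)) = 1$, it follows that

$$\frac{d^2}{da^2}f(\bar a, \theta^*(\bar a)) = \bar a\,\frac{d^2}{da^2}\theta^*(\bar a) - \frac{d^2}{da^2}M_{\bar a}(\theta^*(\bar a)) + \left(\frac{d}{da}M_{\bar a}(\theta^*(\bar a))\right)^2. \tag{A.13}$$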

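Furthermore, by differentiating (A.10) one more time at $a = \bar a$ and by
(A.6) and (A.7) we get

$$\frac{d^2}{da^2}M_{\bar a}(\theta^*(\bar a)) = \bar a\,\frac{d^2}{da^2}\theta^*(\bar a) + \left(\frac{d}{da}\theta^*(\bar a)\right)^2 E(X_{\bar a}^2) + 2\,\frac{d}{da}\theta^*(\bar a). \tag{A.14}$$

Substituting (A.9), (A.11), (A.14) into (A.13) we get

$$\frac{d^2}{da^2}f(\bar a, \theta^*(\bar a)) = -\left(\frac{d}{da}\theta^*(\bar a)\right)^2 \operatorname{var}(X_{\bar a}) - 2\,\frac{d}{da}\theta^*(\bar a) = \frac{1}{\operatorname{var}(X_{\bar a})}, \tag{A.15}$$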

where the last equality follows from (A.9). Recall that the rate function is
$I_a = f(a, \theta^*(a))$. Thus, (A.15) implies $\frac{d^2}{da^2}I_{\bar a} > 0$.
Since, by assumption, $M_a(\theta)$ is three times continuously differentiable
in a neighborhood of $(\bar a, 0)$, (A.8) implies that $\theta^*(a)$ is twice
continuously differentiable for all $a \ge \bar a$ which are sufficiently close
to $\bar a$. Thus, $I_a = f(a, \theta^*(a))$ is twice continuously
differentiable for all $a \ge \bar a$ which are sufficiently close to $\bar a$.
We can therefore find a constant $k > 0$ such that

$$\frac{d^2}{da^2}I_a \ge 2k > 0, \tag{A.16}$$

for all $a$ in a neighborhood of $\bar a$. Thus, developing $I_a$ in a Taylor
series at $a = \bar a$ we get

$$I_a = I_{\bar a} + \frac{d}{da}I_{\bar a}\,(a - \bar a) + \frac{d^2}{da^2}I_{\tilde a}\,\frac{(a - \bar a)^2}{2}, \tag{A.17}$$

for an $\tilde a$ between $a$ and $\bar a$. The Lemma now follows from (A.12),
(A.16) and (A.17) since $I_{\bar a} = 0$ by (A.4) and footnote 31.
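The quadratic lower bound of Lemma 3 can be illustrated by computing the rate function numerically. The sketch below again assumes, for illustration only, that $X_a$ is Uniform$(a - \frac{1}{2}, a + \frac{1}{2})$ and $\bar a = 1$; the function names are hypothetical. For this family $1/\operatorname{var}(X_{\bar a}) = 12$, so (A.15) and (A.17) suggest $I_a(\bar a) \approx 6(a - \bar a)^2$ near $\bar a$, and any $k$ slightly below 6 (here $k = 5$) should work as the constant in the Lemma.

```python
import math

A_BAR = 1.0   # hypothetical threshold mean \bar a

def log_mgf_centered(theta):
    """log E[exp(theta * U)] for U ~ Uniform(-1/2, 1/2)."""
    t = abs(theta) / 2.0
    if t < 1e-4:
        return t * t / 6.0   # series: log(sinh(t)/t) ~ t^2/6
    # log(sinh(t)/t) = t + log(1 - exp(-2t)) - log(2t), stable for large t
    return t + math.log(-math.expm1(-2.0 * t)) - math.log(2.0 * t)

def f(a, theta):
    """f(a, theta) = theta * a_bar - log M_a(theta), as in (A.3)."""
    return theta * (A_BAR - a) - log_mgf_centered(theta)

def rate(a, iters=200):
    """I_a(a_bar) = sup over theta of f(a, theta); f is concave in theta."""
    lo, hi = -200.0, 0.0   # the maximizer is non-positive when a >= a_bar
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(a, m1) < f(a, m2):
            lo = m1
        else:
            hi = m2
    return f(a, 0.5 * (lo + hi))

if __name__ == "__main__":
    for d in (0.01, 0.05, 0.1, 0.2, 0.3):
        print(d, rate(A_BAR + d) / d**2)
```

The printed ratios $I_a(\bar a)/(a - \bar a)^2$ should start close to 6 for small $a - \bar a$ and stay above the chosen $k$ as $a$ moves away from $\bar a$.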

We now use Lemma 3 to prove our main convergence result. 

Proof of Proposition 1. Let $Z \ge \bar z$. By (13) in Section 5.1, the bank's
probability of default is bounded above by

$$\int_{\bar z}^{\infty} e^{-I_z(R^*)n}\,dH(z).$$

Choose $z_1 > \bar z$. Since $I_z(R^*)$ is monotone increasing in $z$ (cf.
Stroock (1984, Lemma 3.3)), we get

$$\int_{z_1}^{\infty} e^{-I_z(R^*)n}\,dH(z) \le e^{-I_{z_1}(R^*)n}\,P(\{Z \ge z_1\}). \tag{A.18}$$

Clearly, this expression converges to zero exponentially as $n \to \infty$,
i.e., it can be bounded above by a term $e^{-k_2 n}$. It therefore remains to
give an estimate for the rate of convergence if the realization of $Z$ is
between $\bar z$ and $z_1$. We can get such an estimate by using
Lemma 3.$^{32}$ Furthermore, note that $E[R(Y_1 + z)]$ converges to
$E[R(Y_1 + \bar z)]$ at the same rate as $z \to \bar z$. This and Lemma 3 imply
that $I_z(R^*) \ge k(z - \bar z)^2$. It therefore remains to derive the rate of
convergence of

$$\int_{\bar z}^{z_1} e^{-k(z - \bar z)^2 n}\,dH(z) \tag{A.19}$$

as $n \to \infty$. Let $h$ denote the density function for $H$. Substituting $z$
by $\bar z + z/\sqrt{n}$ in (A.19) implies

$$\int_{\bar z}^{z_1} e^{-k(z - \bar z)^2 n}\,dH(z) \le \frac{1}{\sqrt{n}}\int_{-\infty}^{\infty} e^{-kz^2}\,h\!\left(\bar z + \frac{z}{\sqrt{n}}\right)dz. \tag{A.20}$$

Note that $e^{-kz^2}h(\bar z + z/\sqrt{n})$ converges to $e^{-kz^2}h(\bar z)$ as
$n \to \infty$. Thus, there exists a constant $k_1$ such that the right-hand
side of (A.20) is bounded above by $k_1/\sqrt{n}$. The result now follows from
(A.18) and (A.20).

32 Let $X_a$ be $R(Y_1 + z)$ where $a = E[R(Y_1 + z)]$, and let
$\bar a = E[R(Y_1 + \bar z(R^*))]$. Then all assumptions of Lemma 3 are
fulfilled. Differentiability follows since the function
$z \mapsto e^{\theta R(Y_1 + z)}$ is differentiable except on a set of measure
zero. Thus, Leibniz's rule of differentiation under the integral can be applied.
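The $1/\sqrt{n}$ rate in (A.19) and (A.20) can be seen in a simple numerical example. The sketch below assumes, for illustration only, that $Z$ is uniform on $[\bar z, z_1] = [0, 1]$ (so $h = 1$ there) and that $k = 1$; none of these values are from the paper, and `bound_integral` and `K_1` are hypothetical names. In this case the integral has the closed form $\frac{\sqrt{\pi}}{2\sqrt{n}}\operatorname{erf}(\sqrt{n})$, so $\sqrt{n}$ times the integral is bounded by $k_1 = \sqrt{\pi}/2$.

```python
import math

def bound_integral(n, k=1.0, z_bar=0.0, z_1=1.0, steps=20000):
    """Midpoint-rule value of int_{z_bar}^{z_1} exp(-k (z - z_bar)^2 n) h(z) dz.

    Illustrative assumption (not from the paper): Z is uniform on
    [z_bar, z_1], so the density is h(z) = 1 / (z_1 - z_bar) there.
    """
    width = (z_1 - z_bar) / steps
    total = 0.0
    for i in range(steps):
        z = z_bar + (i + 0.5) * width
        total += math.exp(-k * (z - z_bar) ** 2 * n)
    return total * width / (z_1 - z_bar)

# Limit of sqrt(n) * integral when k = 1 and h = 1 on [0, 1].
K_1 = math.sqrt(math.pi) / 2.0

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, math.sqrt(n) * bound_integral(n), K_1)
```

The scaled values $\sqrt{n}\int e^{-kz^2 n}\,dH(z)$ stay below $k_1$ and approach it as $n$ grows, which is the polynomial (rather than exponential) rate that dominates the bound on the default probability.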



P. Billingsley (1968), "Convergence of Probability Measures," John Wiley & 
Sons: New York. 

D.W. Diamond (1984), "Financial Intermediation and Delegated Monitoring," 
Review of Economic Studies 51, 393-414. 

K. Dowd (1991), "Optimal Financial Contracts," forthcoming Oxford Economic 

D. Gale and M. Hellwig (1985), "Incentive-Compatible Debt Contracts: The 
One Period Problem," Review of Economic Studies 52, 647-663. 

M. Gertler (1988), "Financial Structure and Aggregate Economic Activity: An 
Overview," Journal of Money, Credit and Banking 20(3), 559-587. 

S. Krasa and A. P. Villamil (1991a), "Monitoring the Monitor: an Incentive 
Structure for a Financial Intermediary," forthcoming Journal of Economic 

S. Krasa and A. P. Villamil (1991b), "Optimal Contracts with Costly State 
Verification: the Multilateral Case," BEBR Working Paper 91-0127, 
University of Illinois. 

J. Panzar (1989), "Technological Determinants of Firm and Industry Structure," 
Handbook of Industrial Organization I, Elsevier Science Publishers: Am- 

K. Parthasarathy (1977), "Introduction to Probability and Measure," McMillan 
Press: London. 

E.C. Prescott (1986), "Theory Ahead of Measurement," Quarterly Review, 
Minneapolis Federal Reserve Bank: Minneapolis, MN. 

D. Stroock (1984), "An Introduction to the Theory of Large Deviations," 
Universitext, Springer. 

R.M. Townsend (1979), "Optimal Contracts and Competitive Markets with 
Costly State Verification," Journal of Economic Theory 21, 1-29. 

R.M. Townsend (1988), "Information Constrained Insurance: The Revelation 
Principle Extended," Journal of Monetary Economics 21, 411-450. 

S. Varadhan (1984), "Large Deviations and Applications," CBMS-NSF Regional 
Conference Series in Applied Mathematics, SIAM, Philadelphia, PA. 

S.D. Williamson (1986), "Costly Monitoring, Financial Intermediation, and 
Equilibrium Credit Rationing," Journal of Monetary Economics 18, 159-179. 

S.D. Williamson (1989), "Restrictions on Financial Intermediaries and 
Implications for Aggregate Fluctuations: Canada and the United States 
1870-1913," NBER Macro Annual 4, 303-340.