
Forcing Regression Through a Given Point

Using Any Familiar Computational Routine

by
Edward B. Hands

TECHNICAL PAPER NO. 83-1
MARCH 1983

Approved for public release;
distribution unlimited.
U.S. ARMY, CORPS OF ENGINEERS

COASTAL ENGINEERING
RESEARCH CENTER

Kingman Building
Fort Belvoir, Va. 22060

Reprint or republication of any of this material shall give appropriate
credit to the U.S. Army Coastal Engineering Research Center.

Limited free distribution within the United States
of single copies of this publication has been made by
this Center. Additional copies are available from:

National Technical Information Service
ATTN: Operations Division
5285 Port Royal Road
Springfield, Virginia 22161

The findings in this report are not to be construed
as an official Department of the Army position unless so
designated by other authorized documents.


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1. REPORT NUMBER 2. GOVT ACCESSION NO. 3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle) 5. TYPE OF REPORT & PERIOD COVERED

FORCING REGRESSION THROUGH A GIVEN POINT Technical Paper

USING ANY FAMILIAR COMPUTATIONAL ROUTINE

7. AUTHOR(s) 8. CONTRACT OR GRANT NUMBER(s)

Edward B. Hands

9. PERFORMING ORGANIZATION NAME AND ADDRESS 10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS
Department of the Army

Coastal Engineering Research Center (CEREN-GE) D31677
Kingman Building, Fort Belvoir, VA 22060

11. CONTROLLING OFFICE NAME AND ADDRESS 12. REPORT DATE
Department of the Army March 1983
Coastal Engineering Research Center 13. NUMBER OF PAGES

Kingman Building, Fort Belvoir, VA 22060 20
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office) 15. SECURITY CLASS. (of this report)

UNCLASSIFIED

15a, DECLASSIFICATION/ DOWNGRADING
SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)

Approved for public release; distribution unlimited.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)
18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Coastal engineering Prediction equations
Data analysis Regression

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

This report describes a simple method for obtaining the prediction equation
best fit to all data points (in the least squares sense) while forcing an exact
fit at any known point. The decision to constrain the solution at a point
should be justified on theoretical grounds without appeal to data. Examples
are given. When required, any familiar regression program can be forced to
select the best line through a given point by simply adjusting and extending
the data entry. All necessary changes to the program results (test statistics
and estimates of regression parameters) can be accomplished without modifying the
computer program.

DD FORM 1473, 1 JAN 73    EDITION OF 1 NOV 65 IS OBSOLETE    UNCLASSIFIED

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


PREFACE

This report draws attention to the frequent, but often neglected, need to
force a regression line through a known point while obtaining the best possi-
ble fit to all experimental data points. A simple method is described for
solving this problem without modifying customary computational routines. This
method can be applied to many problems, but is especially useful when cali-
brating empirical prediction formulas to fit site-specific coastal conditions
or when choosing from among several theoretical prediction models. The work
was carried out under the U.S. Army Coastal Engineering Research Center's
(CERC) Shore Response to Offshore Dredging work unit, Shore Protection and
Restoration Program, Coastal Engineering Area of Civil Works Research and
Development.

The report was prepared by Edward B. Hands, Geologist, under the general
supervision of Dr. C.H. Everts, Chief, Engineering Geology Branch, and Mr.
N. Parker, Chief, Engineering Development Division. The author acknowledges
the helpful suggestions received from C.B. Allen, C.H. Everts, R.J. Hallermeier,
R.D. Hobson, and P. Vitale.

Technical Director of CERC was Dr. Robert W. Whalin, P.E.

Comments on this publication are invited.

Approved for publication in accordance with Public Law 166, 79th Congress,
approved 31 July 1945, as supplemented by Public Law 172, 88th Congress,

approved 7 November 1963.
TED E. BISHOP

Colonel, Corps of Engineers
Commander and Director

CONTENTS

     CONVERSION FACTORS, U.S. CUSTOMARY TO METRIC (SI)
     SYMBOLS AND DEFINITIONS
  I  INTRODUCTION TO REGRESSION
 II  A PROBLEM WITH THE CUSTOMARY APPROACH
III  SOLUTION TO THE PROBLEM
     1. Regression Through the Origin
     2. Regression Through Any Arbitrary Point (a, b)
 IV  SELECTING BETWEEN MODELS I AND II
  V  EXAMPLES
     LITERATURE CITED

TABLES

1  Adjustment of standard elements produced by programs using
   extended data
2  Field calibration data
3  Extended data set No. 1
4  Extended data set No. 2

FIGURES

1  Application of Model I produces an intercept ($\alpha$), which may be a
   useful estimate of a component of longshore flow which is independent
   of wave conditions and presumably pervades the entire data set
2  Application of Model I identified a threshold value below which
   waves cause no damage
3  Application of Model II forces a zero-intercept solution
4  Model II estimates an increase in Y per unit increase in X that
   is nearly twice that predicted using Model I
5  Real test data for example problem 1
6  Real test data for example problem 2 and fitted equations

CONVERSION FACTORS, U.S. CUSTOMARY TO METRIC (SI) UNITS OF MEASUREMENT

U.S. customary units of measurement used in this report can be converted to
metric (SI) units as follows:

Multiply              by                 To obtain

inches                25.4               millimeters
                      2.54               centimeters
square inches         6.452              square centimeters
cubic inches          16.39              cubic centimeters
feet                  30.48              centimeters
                      0.3048             meters
square feet           0.0929             square meters
cubic feet            0.0283             cubic meters
yards                 0.9144             meters
square yards          0.836              square meters
cubic yards           0.7646             cubic meters
miles                 1.6093             kilometers
square miles          259.0              hectares
knots                 1.852              kilometers per hour
acres                 0.4047             hectares
foot-pounds           1.3558             newton meters
millibars             1.0197 x 10^-3     kilograms per square centimeter
ounces                28.35              grams
pounds                453.6              grams
                      0.4536             kilograms
ton, long             1.0160             metric tons
ton, short            0.9072             metric tons
degrees (angle)       0.01745            radians
Fahrenheit degrees    5/9                Celsius degrees or Kelvins (1)

(1) To obtain Celsius (C) temperature readings from Fahrenheit (F) readings,
    use formula: C = (5/9) (F - 32).
    To obtain Kelvin (K) readings, use formula: K = (5/9) (F - 32) + 273.15.

SYMBOLS AND DEFINITIONS

$F$        The F-value may be produced by a multiple regression program and
           is analogous to the t-value in simple regression (one independent
           variable). The F-value indicates the "significance" of $r^2$ and is
           useful in selecting the most important independent variables.

           $$F = \frac{\sum(\hat{y} - \bar{y})^2 \,(n - p - 1)}{\sum(y - \hat{y})^2 \; p} = \frac{r^2 \,(n - p - 1)}{(1 - r^2)\; p}$$

$H_b$      height of breaking waves

$n$        size of the sample

$p$        total number of independent variables. Caution, several observed
           carriers may end up combined into a single independent variable;
           e.g., $X = (gH_b)^{1/2} \sin 2\alpha_b$ has two distinct carriers
           ($H_b$ and $\alpha_b$) but is one independent variable (see example
           problem 1). The value of p will be one less than the number of
           constants to be estimated in Model I, and is equal to the number of
           constants in Model II.

$r$        sample correlation coefficient. The r-value produced by regression
           partially measures the closeness of fit between the linear predictor
           and data. Its square is called the coefficient of determination.

           $$r^2 = \frac{\sum(\hat{y} - \bar{y})^2}{\sum(y - \bar{y})^2} = \frac{\left[\sum(x - \bar{x})(y - \bar{y})\right]^2}{\sum(y - \bar{y})^2 \, \sum(x - \bar{x})^2} \qquad \text{(Model I)}$$

           $$r^2 = \frac{\left(\sum xy\right)^2}{\left(\sum y^2\right)\left(\sum x^2\right)} \qquad \text{(Model II)}$$

$SS_x$     sum of squares of x may be produced by the regression program
           and is useful for computing other values, e.g., $S_\beta$.

           $$SS_x = \sum(x - \bar{x})^2$$

$S_\beta$  standard error of the estimated slope,

           $$S_\beta = \frac{S_{y \cdot x}}{\sqrt{SS_x}}$$

           The larger $S_\beta$, the less reliable is the estimate of slope.

$S^2_{y \cdot x}$  unbiased estimator of the variance of the random component
           $\varepsilon$, e.g.,

           $$S^2_{y \cdot x} = \frac{\sum(y - \hat{y})^2}{n - p - 1} \qquad \text{in Model I}$$

           The number of independent variables, p, is 1 in simple regression
           with Model I. The mean square deviation from regression corresponds
           to the simple variance used to measure the spread of values in a
           single data set. It is also sometimes called the standard error of
           the estimate.

$S_{\hat{y}}$  The value produced by regression to indicate uncertainty of the
           estimated y; the value $S_{\hat{y}}$ depends on the variances of all
           the estimated coefficients.

$t$        The t-value produced in simple regression to test whether the
           estimated regression coefficient is "significantly" different from
           zero.

$v$        longshore current velocity

$X$        independent variable in regression

$x$        observed values of X. A string of n values in simple regression; an
           n by p matrix in multiple regression

$Y$        dependent variable to be estimated

$y$        n observed values of Y

$\hat{y}$  estimated value of Y for given values of X

$\alpha$   Y-intercept in a regression model

$\alpha_b$ angle between the crest of the breaking wave and the shoreline

$\hat{\beta}$  estimated regression coefficients in multiple regression or the
           slope of the line in simple regression

           $$\hat{\beta} = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sum(x - \bar{x})^2} \qquad \text{(Model I)}$$

           $$\hat{\beta} = \frac{\sum xy}{\sum x^2} \qquad \text{(Model II)}$$

$\varepsilon$  zero-mean random component of Y assumed by both regression models


FORCING REGRESSION THROUGH A GIVEN POINT USING ANY
FAMILIAR COMPUTATIONAL ROUTINE

by
Edward B. Hands

I. INTRODUCTION TO REGRESSION

The engineer frequently needs to estimate some response or dependent variable
Y (e.g., sand transport rate, change in shoreline position, or structural
damage), when given the magnitude of other factors, or independent variables X
(e.g., longshore wave energy flux, storm frequency, elevation of storm surges,
etc.). A common approach is to assume a linear model,

$$Y = \alpha + \beta X + \varepsilon \qquad \text{(Model I)}$$

then adopt the principle of least squares, and use sample data to estimate the
unknown parameters, $\alpha$ and $\beta$. Both $\beta$ and X can be considered
as strings of numbers in the case of multiple regression with several
independent variables; $\varepsilon$ indicates that the response is not being
thought of as an exact linear function of X. The $\varepsilon$ represents
random and unpredictable elements in Y; therefore, $\varepsilon$ does not
appear in the prediction equation: $\hat{y} = \hat{\alpha} + \hat{\beta} x$,
where $\hat{\alpha}$ and $\hat{\beta}$ are estimates of the corresponding
components in the conceptual Model I. The assumption that $\varepsilon$ has an
expected value of zero indicates that the "average" response is considered
linear. If $\varepsilon$ varies widely, Model I, though conceptually correct,
may have only limited predictive value. In such a case the estimated mean
value of Y would frequently be thrown off by noise in the data. If
$\varepsilon$ varies only slightly, good predictions will be possible provided
good estimates of $\alpha$ and $\beta$ are available. Adopting the principle
of least squares means one is willing to define the best estimates of $\alpha$
and $\beta$ as those that minimize the sum of the squares of the deviations
between the observed and predicted values (i.e., $y$ and $\hat{y}$).
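
The least-squares estimates for Model I with one independent variable can be
sketched in a few lines of Python. This is only an illustration of the
standard closed-form solution; the function name `fit_model_1` and the four
data points are hypothetical and do not come from this report.

```python
def fit_model_1(x, y):
    """Least-squares estimates (alpha_hat, beta_hat) for y = alpha + beta*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares of x and sum of cross products about the means.
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / ss_x
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical illustration data (not measurements from this report).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]

alpha_hat, beta_hat = fit_model_1(x, y)
print(f"intercept = {alpha_hat:.3f}, slope = {beta_hat:.3f}")
```

The intercept is left free here, which is exactly the unconstrained behavior
of the customary routines discussed in the following sections.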

Customarily, no constraints are placed on the contenders for the best fit
line. Of all possible lines in the XY plane, the prediction equation is
chosen because it has the smallest sum of squares of deviations in y from the
data points. The y-intercept, $\alpha$, is the point where the best fit line
intersects the Y-axis. The $\alpha$ may be of special interest, e.g., in the
regression of current speed against longshore wave energy flux measured in a
field test (Fig. 1). An intercept substantially above zero would suggest that
during the test a component of the longshore current was driven by mechanisms
other than waves (e.g., tides or winds). In this case, the nonzero intercept
would not only be meaningful, but would also provide a good estimate of the
velocity of any steady, nonwave-generated coastal current during the test.

An additional example of unconstrained regression would be where greater
and greater structural damage occurs as the wave forces exceed an undetermined
threshold value. Again Model I applies and produces the correct regression
coefficient ($\beta$). In the process it produces a meaningless response intercept
well below zero (Fig. 2). In contrast with the previous example, the interest
here is strictly in the prediction of future damage for given wave forces, not
in the value of the intercept itself. The resulting linear relationship applies
only to values of the independent variable above the threshold of wave effect.

Figure 1. Application of Model I produces an intercept ($\alpha$),
          which may be a useful estimate of a component of
          longshore flow which is independent of wave conditions
          and presumably pervades the entire data set.
          [Plot of flow rate against wave energy flux.]

Figure 2. Application of Model I identified a threshold value
          below which waves cause no damage. A negative intercept
          is produced, but is of no interest in this
          particular problem.
          [Plot against wave forces, with the negative intercept marked.]

Although the negative intercept ($\alpha$) is in itself meaningless, Model I is
correct because there is no basis for constraining $\alpha$.

II. A PROBLEM WITH THE CUSTOMARY APPROACH

There are many cases where the logic of the application dictates the
response at a particular value of X. For example, if the response is some
change that is regressed against time, then the response must be 0 when X = 0
(Fig. 3). If there is no elapsed time, there can be no change. If the linear
assumption is valid, the appropriate conceptual model is

$$Y = \beta X + \varepsilon \qquad \text{(Model II)}$$

and the customary predictive equation (based on Model I) is inappropriate and
may give poor estimates of $\beta$ (see Fig. 4). Yet the vast majority of
regression programs (e.g., SPSS, IMSL, IBM's 5110 package, and TI-59) do not
allow specification of a zero intercept or any constraint through a known point.
Statistical texts usually do not cover this topic either. However, formulas
for the zero-intercept case are given by Brownlee (1965) and by Krumbein and
Graybill (1965).
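
For reference, the zero-intercept slope follows from a one-line minimization.
The short derivation below is the standard textbook argument for a single
independent variable, written here as a sketch rather than quoted from the
cited texts; $S(\beta)$ is a name introduced only for this illustration.

```latex
S(\beta) = \sum_{i=1}^{n} (y_i - \beta x_i)^2
\qquad\Longrightarrow\qquad
\frac{dS}{d\beta} = -2 \sum_{i=1}^{n} x_i \,(y_i - \beta x_i) = 0
\qquad\Longrightarrow\qquad
\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
```

Unlike the Model I slope, the sums are taken about zero rather than about the
sample means, which is why the two models can give very different slopes for
the same data.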

Figure 3. Application of Model II forces a zero-intercept solution.

[Plot showing both fitted lines, labeled Model II, $\hat{\beta}$ = 0.63, and
Model I, $\hat{\beta}$ = 0.34.]

Figure 4. Model II estimates an increase in Y per unit increase in X
          that is nearly twice that predicted using Model I. The
          physical relationship between X and Y dictates which model
          should be adopted. If Model II is appropriate the solution can
          be obtained using a simple artifice described in this report
          to modify results of standard computer programs intended for
          Model I.

The value of Y may be known for a single value of X (not necessarily 0).
The best prediction should then be sought from among the limited subset of
lines through this point. Unless the Model I line happens to pass through the
point, all these lines will have a larger sum of squares
($\sum(y - \hat{y})^2$) than the line that would have been selected by Model I.
A simple procedure is described herein for picking from among these restricted
candidates the one with the smallest $\sum(y - \hat{y})^2$. Thus, regressing
through the origin is but one specific case that can be solved by a general
model forcing regression at an arbitrary point.

III. SOLUTION TO THE PROBLEM

This report describes a method for getting the best fit to all data points
(in the sense of least squares) while forcing an exact fit at any known point.
A simple procedure for forcing regression through the origin was described by
Hawkins (1980), who indicated the procedure was not well known. The author of
this report knows of no references to the general case of an exact fit to an
arbitrary point. However, if a fit can be constrained through the origin, then
a simple transform of variables can force the line through any given point.
The details of the through-the-origin solution will be explained first.

1. Regression Through the Origin.

For each set of measured dependent and independent variables observed
($y_i$, $x_i$), also enter, or program, a mirror-image set ($-y_i$, $-x_i$).
Thus, the computer is given an extended data set consisting of 2n data points,
only n of which were observed. By definition of this extended data set, the
dependent and all the independent variables each individually sum to zero,
forcing a zero intercept:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x} \qquad \text{by the principle of least squares}$$

$$\hat{\alpha} = 0 \qquad \text{because } \sum x = 0 \text{ and } \sum y = 0 \text{, and thus } \bar{x} = \bar{y} = 0 \text{ on the extended data set}$$

Thus a zero-intercept solution is obtained. Is it still the least squares
solution for the observed data set? The principle of least squares by defini-
tion minimizes the sum of the squares of the deviations of the observed from
the predicted values. Because each squared deviation from the observed data
set generates an identical squared deviation in the extended data set, the sum
of these two positive sequences is minimized over the extended data set only
if it is also minimized over both the observed and the mirror-image sets.
Thus, the regression coefficient produced in this manner is not only the least
squares solution for the artificially extended data set, but for the observed
data set as well. By this artifice the proper estimate is obtained for the
regression coefficient ($\beta$) with the prediction forced through the origin.
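
The mirror-image artifice can be checked numerically. The sketch below stands
in for "any familiar routine" with a small ordinary least-squares function
(`fit_model_1`, a hypothetical name), feeds it the extended data set, and
compares the result with the direct zero-intercept slope
$\sum xy / \sum x^2$. The data are invented for illustration.

```python
def fit_model_1(x, y):
    """Ordinary least squares with an intercept: returns (alpha_hat, beta_hat)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / ss_x
    return y_bar - beta_hat * x_bar, beta_hat


def mirror(values):
    """Append the mirror image (-value) of every observation."""
    return list(values) + [-v for v in values]


# Hypothetical observations, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0]
y_obs = [3.1, 4.9, 7.2, 8.8]

# Unmodified Model I routine applied to the 2n-point extended data set.
alpha_ext, beta_ext = fit_model_1(mirror(x_obs), mirror(y_obs))

# Direct zero-intercept least-squares slope for comparison.
beta_direct = (sum(xi * yi for xi, yi in zip(x_obs, y_obs))
               / sum(xi * xi for xi in x_obs))

print(f"intercept from extended data: {alpha_ext:.2e}")
print(f"slope from extended data:     {beta_ext:.6f}")
print(f"direct through-origin slope:  {beta_direct:.6f}")
```

The intercept comes out as zero to within rounding error, and the two slopes
agree, without any change to the fitting routine itself.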

2. Regression Through Any Arbitrary Point (a, b).

If the predicted response (Y) must be b when the independent variables
(X) are a, then regress an extended data set v on u, where $u = x - a$
and $v = y - b$. If (a, b) = (0, 0), then this collapses to the exact
situation described above. If (a, b) $\neq$ (0, 0), the direct results,
$\hat{v} = \hat{\beta} u$, should be unraveled to produce the y prediction:

$$\hat{y} - b = \hat{\beta}\,(x - a)$$

$$\hat{y} = (b - a\hat{\beta}) + \hat{\beta}\,x$$
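
A minimal sketch of the arbitrary-point procedure follows, assuming the point
is written as x = a, y = b as in the prediction equation above. The helper
names (`fit_model_1`, `fit_through_point`) and the data are hypothetical; the
ordinary least-squares function again plays the role of an unmodified
regression routine.

```python
def fit_model_1(x, y):
    """Ordinary least squares with an intercept: returns (alpha_hat, beta_hat)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = s_xy / ss_x
    return y_bar - beta_hat * x_bar, beta_hat


def fit_through_point(x, y, a, b):
    """Best-fit line forced through (x, y) = (a, b); returns (intercept, slope)."""
    # Shift the origin to the known point.
    u = [xi - a for xi in x]
    v = [yi - b for yi in y]
    # Extend the shifted data with its mirror image and run the usual routine.
    u_ext = u + [-ui for ui in u]
    v_ext = v + [-vi for vi in v]
    _, beta_hat = fit_model_1(u_ext, v_ext)
    # Unravel v = beta*u into y = (b - a*beta) + beta*x.
    return b - a * beta_hat, beta_hat


# Hypothetical observations and constraint point, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0]
y_obs = [3.1, 4.9, 7.2, 8.8]
a, b = 1.0, 3.0

intercept, slope = fit_through_point(x_obs, y_obs, a, b)
print(f"y_hat = {intercept:.4f} + {slope:.4f} x")
print(f"prediction at x = a: {intercept + slope * a:.4f}")
```

Evaluating the fitted line at x = a returns b, confirming the exact fit at the
chosen point.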

NOTE: The proper estimate of the regression coefficient ($\beta$) now forces the
prediction through the point (a, b) as desired. With this procedure the
correct regression coefficient is obtained from any familiar computational
routine. The second most frequently reported output from regression programs,
the correlation coefficient (r), is also the correct, unbiased estimator for
Model II.

If additional information is provided by the regression program, then
corrections may be necessary before adopting them for the real data set. The
estimate of the residual variance will be correct for simple regression (one
independent variable) and can be easily adjusted for multiple regression (see
Table 1). Any sums of squares and cross products produced by the program will
be exactly twice the correct values, as will the F-value in simple regression.
The standard error of the estimated slope will be too small by a factor of
$\sqrt{2}$ (exactly so in simple regression). Therefore, the t-value, for
testing the zero slope hypothesis, will be too large by the same factor.
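
These corrections can be verified numerically for simple regression. The
sketch below computes what a standard Model I routine would report for the
extended data and compares it with a direct zero-intercept calculation that
uses n - 1 degrees of freedom. The function names and data are hypothetical,
and the degrees-of-freedom conventions (2n - 2 for the routine, n - 1 for the
constrained model) are the usual ones for one independent variable.

```python
import math


def model_1_stats(x, y):
    """Statistics a standard simple-regression routine reports (Model I)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / ss_x
    alpha = y_bar - beta * x_bar
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                 # residual variance, intercept estimated
    s_beta = math.sqrt(s2 / ss_x)      # standard error of the slope
    return {"ss_x": ss_x, "s2": s2, "s_beta": s_beta, "t": beta / s_beta}


def model_2_stats(x, y):
    """Direct zero-intercept (Model II) statistics on the observed data."""
    n = len(x)
    ss_x = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / ss_x
    sse = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 1)                 # only the slope is estimated
    s_beta = math.sqrt(s2 / ss_x)
    return {"ss_x": ss_x, "s2": s2, "s_beta": s_beta, "t": beta / s_beta}


# Hypothetical observations, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0]
y_obs = [3.1, 4.9, 7.2, 8.8]

ext = model_1_stats(x_obs + [-v for v in x_obs], y_obs + [-v for v in y_obs])
direct = model_2_stats(x_obs, y_obs)

print("sum of squares ratio (program / correct):", ext["ss_x"] / direct["ss_x"])
print("residual variance ratio:                 ", ext["s2"] / direct["s2"])
print("slope standard error ratio:              ", ext["s_beta"] / direct["s_beta"])
print("t-value ratio:                           ", ext["t"] / direct["t"])
```

The printed ratios should come out near 2, 1, 1/sqrt(2), and sqrt(2), in that
order, matching the adjustments described in the text for one independent
variable.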

Table 1 indicates the corrections for most of the elements produced by
various regression programs. However, employing the described extended data
procedures does not require consideration of any part of the output beyond that
used in the standard unconstrained approach.

IV. SELECTING BETWEEN MODELS I AND II

If either the true or mean value (whichever interpretation fits the situation)
of the dependent variable (Y) is unknown for all values of the independ-
ent variable in the range of concern, then the customary model (I) may be
appropriate. However, if the postulated physical relationship between X and
Y dictates constraint through any point (a, b) and the relationship is linear
from the maximum observed x to x = a, then Model II should be used. To pro-
ceed with the customary evaluation of Model I would be equivalent to ignoring
what is already known about the relationship between X and Y and, instead,
relying totally on the limited information available in the sample data. The
objective should be to obtain the best interpretation of the data, which does
not override any more firmly established understanding of the situation.

Assuming Model II applies, it may still be useful to evaluate Model I to
test in the conventional way (Draper and Smith, 1966) the significance of the
estimated nonzero intercept. If this test fails to provide enough evidence to
reject the strawman hypothesis ($H_0$: $\alpha = 0$) then this failure may be cited as
additional evidence strictly from the data, substantiating the choice of Model
II to estimate 8. The results of this formal test of hypothesis should not,
however, be relied on as the criterion for selecting Model II. It should serve
only as a source of auxiliary information clarifying the extent to which the
sample data will support the model choice. The choice should be made on the
basis of functional insight and understanding of the relationship between X
and Y.

Table 1. Adjustment of standard elements produced by programs using extended data.

Parameters and                Estimates obtained by        Correct estimates for
test statistics               unconstrained regression     constrained model
                              routine using extended data

Regression coefficients,      $\hat{\beta}_e$              $\hat{\beta} = \hat{\beta}_e$
  $\beta$
Correlation coefficient,      $r_e$                        $r = r_e$
  $\rho$
Standard error of             $S_{\beta e}$                $S_\beta = S_{\beta e}\,[(2n - p - 1)/(n - p)]^{1/2}$,
  $\hat{\beta}$                                            which is exactly $\sqrt{2}\,S_{\beta e}$ if p = 1 and
                                                           approximately $\sqrt{2}\,S_{\beta e}$ if n is
                                                           moderately large
Sums of squares               $SS_e$                       $SS = SS_e / 2$
t-test statistic              $t_e$                        $t = t_e\,[(n - p)/(2n - p - 1)]^{1/2}$
F-value                       $F_e$                        $F = F_e\,(n - p)/(2n - p - 1)$
Estimated residual            $S_e^2$                      $S^2 = S_e^2\,(2n - p - 1)/[2(n - p)]$,
  variance, $\sigma^2$                                     which is exactly $S_e^2$ if p = 1 and
                                                           approximately $S_e^2$ if n is moderately
                                                           large

Subscript e marks a value reported by the routine for the extended data set.
Hatted values are estimates of the true unhatted parameters (e.g., $\beta$ is an
unknown parameter in a conceptual model and $\hat{\beta}$ is its estimate based
on evaluation of some measurements). p is the number of independent variables
that were measured. Additional information on interpreting the standard
elements of regression is available in Draper and Smith (1966).

Comparing the correlation coefficients or r-values, produced using the
real data and the extended data, is likewise not a valid method for choosing
between Models I and II. The value of $r^2$ using Model I (observed data only)
is often referred to as the reduction in variance of the estimator made possible
by using the apparent association between X and Y. A value of $r^2 = 0$
indicates that knowledge of the X-values makes no improvement in the prediction
of Y and using the mean value of the y's as the estimator would not increase
the sum of the squares of the deviations. At the other extreme, if $r^2 = 1$, all
sample points lie on a sloping straight line implying a strong predictive value.
Similarly with Model II, higher $r^2$ values indicate improved fit of the data;
but comparing $r^2$ values between Models I and II does not reveal which is
correct or even preferable. There is a slight conceptual and a substantial
computational difference between the $r^2$ values for the two models. The two
values should not be compared; both indicate the relative fit of various data
to their own particular model. Either value can be used to measure "goodness
of fit" in particular applications, or even to indicate the usefulness of several
versions of the particular model chosen. For example, comparison of r-values
would indicate whether taking logs of the measurements, or raising them to a
given power prior to regression, improved the fit. But comparison of the
r-value would not be a valid basis for choosing between Models I and II.
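
The computational difference between the two $r^2$ values is easy to see
numerically. The sketch below assumes the usual definitions: sums taken about
the sample means for Model I, and sums taken about zero for Model II (which is
what a routine returns when given the mirror-image extended data). The
observed values are those listed in Table 3; the function names are
hypothetical.

```python
def r2_model_1(x, y):
    """Coefficient of determination with sums taken about the sample means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy ** 2 / (ss_x * ss_y)


def r2_model_2(x, y):
    """Coefficient of determination with sums taken about zero."""
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    ss_x = sum(xi * xi for xi in x)
    ss_y = sum(yi * yi for yi in y)
    return s_xy ** 2 / (ss_x * ss_y)


# Observed (non-mirrored) rows of Table 3.
x_obs = [0.152, 0.162, 0.0827, 0.170]
y_obs = [2.42, 4.33, 1.96, 1.27]

r2_one = r2_model_1(x_obs, y_obs)
r2_two = r2_model_2(x_obs, y_obs)
print(f"r^2 about the means (Model I): {r2_one:.3f}")
print(f"r^2 about zero (Model II):     {r2_two:.3f}")
```

The two numbers differ widely for the same four points, which illustrates why
each value is meaningful only relative to its own model and why the pair is not
a basis for choosing between the models.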

V. EXAMPLES

The following problems illustrate a frequent need to constrain the regression
line in coastal engineering applications. The problems also illustrate
the usefulness of $r^2$ to rank different predictors in terms of how well they
fit data. Before initially applying the described method to an actual problem,
it may be helpful to reanalyze one of the small data sets used in these examples
and compare the results with those published in this report.

* * * * * * * * * * * * * * * EXAMPLE PROBLEM 1 * * * * * * * * * * * * * * *

Consider the requirement to simulate a long-term history of wave-induced
longshore currents for a particular coastal site. Assume hindcasted wave data
are available, but that current measurements were not made over the period of
interest. According to the Shore Protection Manual (U.S. Army, Corps of
Engineers, Coastal Engineering Research Center, 1977), the longshore current
(v) can be calculated as a function of the beach slope (m), the gravitational
acceleration (g), and the angle and height of breaking waves ($\alpha_b$, $H_b$,
respectively):

$$v = 20.7\, m\, (g H_b)^{1/2} \sin 2\alpha_b \qquad (1)$$

The coefficient of proportionality (20.7) is based on typical mixing and
frictional factors for the surf zone. Empirical formulas, like equation (1), can be
adjusted by regression analysis of test data from the specific site of intended
application. This will customize the formula to fit site-sensitive conditions.
The longshore velocity also varies laterally within the surf zone. The problem
of estimating the spatial structure of flow across the surf zone may be avoided
by obtaining current measurements at the exact point where the long-term flow
must be reconstructed, then regressing the test measurements against
simultaneously determined breaker conditions. Steps in such an analysis are given
below. Only a few data points are used in the example to encourage the reader
to go through the computations and check the results. The data are taken from
a frequently referenced field study done at Nags Head, North Carolina (Galvin
and Savage, 1966).


GIVEN: Longshore current velocities (v), breaker heights ($H_b$), breaker
angles ($\alpha_b$), and the beach slope (m) determined onsite during a short
field evaluation (see Table 2).

Table 2. Field calibration data (from
         Galvin and Savage, 1966).

Obsn.    $H_b$ (ft)    m        v (ft/s)
1        2             0.03     2.42
2        3.2           0.026    4.33
3        1.8           0.029    1.96
4        8             0.026    1.26

REQUIRED: An equation that will predict wave-induced longshore currents for
the test site.

ANALYSIS: Because the linearity expressed in equation (1) has a firm
theoretical basis in the concept of radiation stress (Longuet-Higgins, 1970), and
because according to this concept, v = 0 whenever $H_b$ = 0 or $\alpha_b$ = 0, the
prediction line must pass through the origin (0, 0). So Model II must be used.

Let

$$Y = v$$

and

$$X = m\,(g H_b)^{1/2} \sin 2\alpha_b$$

Regress Y on X to determine the best estimate of the coefficient of
proportionality between X and Y.

CORRECT RESULTS:

Regression coefficient          $\hat{\beta}$ = 17
Correlation coefficient         r = 0.91
Standard error of $\hat{\beta}$         $S_\beta$ = 4.6
Test statistic for $\beta$            t = 3.7
Estimated residual variance     $S^2_{y \cdot x}$ = 1.8
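
Readers who wish to check these figures can do so with a short script. The
sketch below fits the line through the origin directly, using the observed
(non-mirrored) x and y values as listed in Table 3, and computes the same five
quantities with n - 1 degrees of freedom. The function name
`fit_through_origin` is hypothetical.

```python
import math


def fit_through_origin(x, y):
    """Zero-intercept least squares and its basic test statistics."""
    n = len(x)
    s_xx = sum(xi * xi for xi in x)
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    s_yy = sum(yi * yi for yi in y)
    beta = s_xy / s_xx
    sse = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 1)                   # residual variance (one constant fitted)
    s_beta = math.sqrt(s2 / s_xx)        # standard error of the slope
    r = s_xy / math.sqrt(s_xx * s_yy)    # correlation about the origin
    return {"beta": beta, "r": r, "s_beta": s_beta, "t": beta / s_beta, "s2": s2}


# Observed rows of Table 3: X = m (g H_b)^(1/2) sin(2 alpha_b), Y = v.
x_obs = [0.152, 0.162, 0.0827, 0.170]
y_obs = [2.42, 4.33, 1.96, 1.27]

result = fit_through_origin(x_obs, y_obs)
for name, value in result.items():
    print(f"{name:7s} {value:.3g}")
```

To two significant figures the output agrees with the values tabulated above.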

CONCLUSION: The version of the Longuet-Higgins type equation that best fits
this problem site (based on available current data) is:

$$v = 17\, m\, (g H_b)^{1/2} \sin 2\alpha_b$$

NOTE: Fitting the equation to the data in this example produces results closer
to those obtained with larger data sets (eq. 1) if the line is forced through
the origin rather than being fit strictly to the data without this constraint
(see Fig. 5).

Figure 5. Real test data for example problem 1. Compare
          the correct fit through the origin with the
          customary fit.
          [Plot of measured velocity against
          $X = m\,(g H_b)^{1/2} \sin 2\alpha_b$.]

* * * * * * * * * * * * * * * EXAMPLE PROBLEM 2 * * * * * * * * * * * * * * *

At least 10 equations relating the velocity of longshore currents to wave
characteristics have appeared in the literature. Presumably more will appear
as knowledge increases or theory is adapted to specific wave or bathymetric
conditions (i.e., specialized for breaker type or bar dimensions). A recent
article (Komar, 1979) questions the value of including a measure of beach slope
in the general prediction equation and claims better results for

$$v = 0.585\,(g H_b)^{1/2} \sin 2\alpha_b$$

GIVEN: The same situation and data as in example problem 1.

REQUIRED: Determine the best fit version of the type

$$v = \beta\,(g H_b)^{1/2} \sin 2\alpha_b$$

and compare the results with those obtained in example problem 1 to see if
the beach slope is indeed of any value at this particular site.

ANALYSIS: For the same reasons stated in example problem 1, regression
should require the prediction line to pass through the point (0, 0).

Let

$$Y = v$$

$$X = (g H_b)^{1/2} \sin 2\alpha_b$$

and regress Y on X using Model II with its extended data set (Fig. 6).

Figure 6. Real test data for example problem 2 and fitted
          equations. Compare the correct fit through the
          origin with the customary fit.
          [Plot of measured velocity against
          $X = (g H_b)^{1/2} \sin 2\alpha_b$.]

CORRECT RESULTS:

Regression coefficient          $\hat{\beta}$ = 0.46
Correlation coefficient         r = 0.90
Standard error of $\hat{\beta}$         $S_\beta$ = 0.13
Test statistic for $\beta$            t = 3.6
Estimated residual variance     $S^2_{y \cdot x}$ = 1.9

CONCLUSION: The best predictor of the Komar type is:

$$v = 0.46\,(g H_b)^{1/2} \sin 2\alpha_b$$

It would be surprising to find a clear indication of whether beach slope should
be included in the predictor for longshore currents by evaluating such a
limited data set as chosen here to encourage reader computation. Indeed a
comparison of Tables 3 and 4 reveals no significant differences between the
correlation coefficients or any other test statistics. However, significant
differences would be expected if a large reliable data set covering a wider
range of conditions were compared by the methods illustrated in this report.
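
The comparison between the two predictors can be reproduced with the short
sketch below, which fits each through the origin and reports the slope and the
correlation coefficient. It uses the observed (non-mirrored) values as listed
in Tables 3 and 4; the function and variable names are hypothetical.

```python
import math


def origin_fit(x, y):
    """Zero-intercept slope and correlation coefficient about the origin."""
    s_xx = sum(xi * xi for xi in x)
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    s_yy = sum(yi * yi for yi in y)
    return s_xy / s_xx, s_xy / math.sqrt(s_xx * s_yy)


# Measured longshore current velocities (ft/s), as listed in Tables 3 and 4.
v = [2.42, 4.33, 1.96, 1.27]
# Table 3: X includes the beach slope m.
x_with_slope = [0.152, 0.162, 0.0827, 0.170]
# Table 4: X omits the beach slope.
x_without_slope = [5.05, 6.25, 2.85, 6.53]

beta_1, r_1 = origin_fit(x_with_slope, v)
beta_2, r_2 = origin_fit(x_without_slope, v)

print(f"with beach slope:    beta = {beta_1:.3g}, r = {r_1:.3f}")
print(f"without beach slope: beta = {beta_2:.3g}, r = {r_2:.3f}")
```

The two correlation coefficients differ only in the third decimal place, in
line with the statement that this small data set cannot discriminate between
the two predictors.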

Table 3. Extended data set No. 1.

Obsn.    x          y
         (ft/s)     (ft/s)
1         0.152      2.42
         -0.152     -2.42
2         0.162      4.33
         -0.162     -4.33
3         0.0827     1.96
         -0.0827    -1.96
4         0.170      1.27
         -0.170     -1.27

Table 4. Extended data set No. 2.

Obsn.    x          y
         (ft/s)     (ft/s)
1         5.05       2.42
         -5.05      -2.42
2         6.25       4.33
         -6.25      -4.33
3         2.85       1.96
         -2.85      -1.96
4         6.53       1.27
         -6.53      -1.27

LITERATURE CITED

BROWNLEE, K.A., Statistical Theory and Methodology in Science and Engineering,
2d ed., John Wiley & Sons, Inc., New York, 1965.

DRAPER, N.R., and SMITH, H., Applied Regression Analysis, John Wiley & Sons,
Inc., New York, 1966.

GALVIN, C.J., Jr., and SAVAGE, R.P., "Longshore Currents at Nags Head, North
Carolina," Bulletin 11, U.S. Army, Corps of Engineers, Coastal Engineering
Research Center, Washington, D.C., 1966, pp. 11-29.

HAWKINS, D.M., "A Note on Fitting a Regression Without an Intercept Term,"
The American Statistician, Vol. 34, Nov. 1980, p. 233.

KOMAR, P.D., "Beach-Slope Dependence of Longshore Currents," Journal of
Waterways, Port, Coastal and Ocean Divisions, Vol. 105, Nov. 1979.

KRUMBEIN, W.C., and GRAYBILL, F.A., An Introduction to Statistical Models in
Geology, McGraw-Hill Book Co., New York, 1965.

LONGUET-HIGGINS, M.S., "Longshore Currents Generated by Obliquely Incident
Seawaves," Parts I and II, Journal of Geophysical Research, Vol. 75, No. 33,
Nov. 1970.

U.S. ARMY, CORPS OF ENGINEERS, COASTAL ENGINEERING RESEARCH CENTER, Shore
Protection Manual, 3d ed., Vols. I, II, and III, Stock No. 008-022-00113-1,
U.S. Government Printing Office, Washington, D.C., 1977, 1,262 pp.

Hands, Edward B.
    Forcing regression through a given point using any familiar
computational routine / by Edward B. Hands.--Fort Belvoir, Va. : U.S.
Army, Corps of Engineers, Coastal Engineering Research Center ;
Springfield, Va. : available from NTIS, 1983.
    [20] p. : ill. ; 28 cm.--(Technical paper / Coastal Engineering
Research Center ; no. 83-1)
    Cover title.
    "March 1983."
    Report describes a simple method for obtaining the prediction
equation best fit to all data points (in the least squares sense)
while forcing an exact fit at any known point. Examples are given.
All necessary changes to the program results are accomplished without
modifying customary computational routines.
    1. Coastal engineering. 2. Data analysis. 3. Prediction equations.
4. Regression. I. Title. II. Coastal Engineering Research
Center (U.S.). III. Series: Technical paper (Coastal Engineering
Research Center (U.S.)) ; no. 83-1.
TC203            .U581tp            no. 83-1            627

