Research Design

 

Jane, a military psychologist, wants to examine two types of treatments for depression in a group of military personnel who have suffered the loss of their legs. She has only 20 men to work with.

  • What would be the best research design for the study and why?
  • What are some issues that Jane needs to consider before starting the study?
  • What is a longitudinal study? What are the benefits and challenges associated with a longitudinal study?
  • Using the Online Library, find two peer-reviewed articles (one that used a between-subjects design and one that used a within-subjects design). Summarize both articles, making sure to discuss the research design specifically.
  • Explain what practice and carryover effects are in the context of the within-subjects study that you found. What steps did the researchers take to reduce these effects?

Justify your answers with appropriate reasoning and research.

Methods for Policy Analysis

Rebecca A. Maynard, Editor
Kenneth A. Couch, Guest Editor


STRENGTHENING THE REGRESSION DISCONTINUITY DESIGN USING ADDITIONAL
DESIGN ELEMENTS: A WITHIN-STUDY COMPARISON

Coady Wing and Thomas D. Cook

Abstract

The sharp regression discontinuity design (RDD) has three key weaknesses compared
to the randomized clinical trial (RCT). It has lower statistical power, it is more de-
pendent on statistical modeling assumptions, and its treatment effect estimates are
limited to the narrow subpopulation of cases immediately around the cutoff, which is
rarely of direct scientific or policy interest. This paper examines how adding an un-
treated comparison to the basic RDD structure can mitigate these three problems.
In the example we present, pretest observations on the posttest outcome measure
are used to form a comparison RDD function. To assess its performance as a sup-
plement to the basic RDD, we designed a within-study comparison that compares
causal estimates and their standard errors for (1) the basic posttest-only RDD, (2)
a pretest-supplemented RDD, and (3) an RCT chosen to serve as the causal bench-
mark. The two RDD designs are constructed from the RCT, and all analyses are
replicated with three different assignment cutoffs in three American states. The
results show that adding the pretest makes functional form assumptions more
transparent. It also produces causal estimates that are more precise than in the
posttest-only RDD, but whose standard errors are nonetheless larger than in the
RCT. Neither RDD version
shows much bias at the cutoff, and the pretest-supplemented RDD produces causal
effects in the region beyond the cutoff that are very similar to the RCT estimates
for that same region. Thus, the pretest-supplemented RDD improves on the standard
RDD in multiple ways that bring causal estimates and their standard errors closer to
those of an RCT, not just at the cutoff, but also away from it. © 2013 by the Association
for Public Policy Analysis and Management.

Journal of Policy Analysis and Management, Vol. 32, No. 4, 853–877 (2013)
Published by Wiley Periodicals, Inc. View this article online at wileyonlinelibrary.com/journal/pam
DOI: 10.1002/pam.21721

INTRODUCTION

A carefully executed regression discontinuity design (RDD) is now widely considered
a sound basis for causal inference. The design was introduced in Thistlethwaite and
Campbell (1960), and Goldberger (1972a, 1972b) showed that RDD produces causal
estimates that are unbiased, but less efficient than those produced by a comparable
randomized clinical trial (RCT). Recent work has clarified the assumptions that
support parametric and nonparametric identification in the RDD (Hahn, Todd, &
van der Klaauw, 2001; Lee, 2008), and has examined the statistical properties of
common estimators (Lee & Card, 2008; Porter, 2003; Schochet, 2009). In addition, a
growing literature compares RDD estimates to benchmark estimates from an RCT,
and these within-study comparisons show that RDD and RCT estimates have been
similar in various applied settings (Cook & Wong, 2008; Green et al., 2009; Shadish
et al., 2011). Despite this recent work, the basic elements of the design have not
changed. An RDD requires an outcome variable, a binary treatment, a continuous
assignment variable, and a cutoff-based treatment assignment rule. The assignment
rule is crucial: In a successful RDD, individuals with assignment scores on one
side of the cutoff receive one treatment and individuals on the other side receive
another treatment, usually a no-treatment control condition. An RDD is sharp when
all individuals receive the intended treatment, and it is fuzzy when compliance is
partial. This paper deals only with sharp RDD studies.

The analysis of an RDD is not complicated in principle. Researchers estimate treat-
ment effects by comparing mean outcomes among people with assignment scores
immediately below and immediately above the cutoff. The difference between these
two conditional means can be understood as a discontinuity in the regression func-
tion that links average outcomes across subpopulations defined by the assignment
variable. A basic assumption in the RDD is that in the absence of a treatment effect,
the regression would be a smooth function near the cutoff; conversely, a sudden
break or discontinuity at the cutoff is evidence of a treatment effect. The size of the
discontinuity measures the magnitude of the effect.
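The estimation logic just described can be sketched in Python. This is an illustration on simulated data with a known treatment jump, not the paper's code; the variable names, bandwidth, and parameter values are our own.

```python
import random

random.seed(0)

# Simulated sharp RDD: assignment score x, cutoff at 0, true jump of 2.0.
n = 4000
data = []
for _ in range(n):
    x = random.uniform(-1, 1)
    d = 1.0 if x >= 0 else 0.0              # sharp rule: full compliance
    y = 1.0 + 0.5 * x + 2.0 * d + random.gauss(0, 1)
    data.append((x, y))

def ols_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    m = len(points)
    mx = sum(p[0] for p in points) / m
    my = sum(p[1] for p in points) / m
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    b = sxy / sxx
    return my - b * mx, b

def rdd_effect(data, cutoff=0.0, bandwidth=0.25):
    """Estimate the discontinuity: fit lines separately below and above
    the cutoff within a bandwidth, then difference them at the cutoff."""
    below = [p for p in data if cutoff - bandwidth <= p[0] < cutoff]
    above = [p for p in data if cutoff <= p[0] <= cutoff + bandwidth]
    a_lo, b_lo = ols_line(below)
    a_hi, b_hi = ols_line(above)
    return (a_hi + b_hi * cutoff) - (a_lo + b_lo * cutoff)

effect = rdd_effect(data)   # close to the true jump of 2.0
```

The bandwidth restriction anticipates a point made below: nonparametric RDD estimation discards observations far from the cutoff, which reduces the effective sample size.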

The RDD has at least three important limitations relative to an RCT. The first
involves the amount of statistical modeling required to identify and estimate causal
effects. In an RCT, treatment effects are nonparametrically identified so that as-
sumptions about the underlying statistical model are not required to interpret the
data. Moreover, there is usually a close connection between the research design
and the statistical tools used to perform the analysis.1 In RDD, on the other hand,
treatment effects are nonparametrically identified, but fully nonparametric anal-
ysis requires very large sample sizes that cannot always be attained. In practice,
researchers often proceed by specifying a parametric or semiparametric functional
form of the regression and allowing for an intercept shift at the cutoff (Lee & Card,
2008). Choosing the wrong functional form can lead to biased treatment effect
estimates, so it is good practice for analysts to use flexible methods to estimate
functional forms before evaluating how sensitive the results are to alternative speci-
fications. Although many techniques for sensitivity analysis exist, it would be a boon
in RDD studies to have better methods for validating functional form assumptions.
We present one such method here.

1 Of course, analysts often employ parametric regression models in the analysis of experimental data
either to improve the statistical precision of the treatment effect estimates or to adjust for chance
imbalances in observable covariates. But this additional modeling is usually not central to the study’s
findings.

A second limitation of standard RDD is that treatment effect estimates are less
statistically precise than in an RCT, reducing the statistical power of key hypothesis
tests (Goldberger, 1972b; Schochet, 2009). Some of the efficiency loss is due to the
multicollinearity between assignment scores and treatment variable that is inherent
in the RDD assignment rule. RDD estimates that rely on nonparametric estima-
tion methods may also have lower power because they employ a bandwidth that
decreases the study’s effective sample size. Lower statistical power is a secondary
concern in RDD studies with large administrative databases, but it is more central
when investigators prospectively design a study and collect their own data directly
from respondents. In this last circumstance, adding more cases may be costly and
tempt researchers into favoring alternative designs with greater power, but a weaker
identification strategy (Schochet, 2009).

A third limitation concerns the generality of RDD results. RCTs produce treatment
effect estimates averaged across all members of the study population. In contrast,
RDD estimates are limited to average treatment effects among members of the
narrow subpopulation located immediately around the cutoff. For example, if a
treatment is given to students scoring above the 75th percentile on an achievement
test, then RDD results can only be generalized to students near that point. Unfor-
tunately, social science and public policy debates usually are concerned with the
effects of treatments in broader subpopulations, such as all students, or all students
in the upper quartile of the test score distribution. Constructing estimates of these
more general parameters in an RDD setting requires making extrapolations beyond
the cutoff score. Researchers often are reluctant to make such extrapolations be-
cause there is rarely a firm theoretical basis for the assumption that the functional
form of the regression is stable beyond the range of the observed data. The crux of
the problem is that no one knows what the treatment group functional form would
have looked like in the absence of the treatment. The absence of this counterfactual
regression function is why it is standard practice to limit causal inferences to the
cutoff subpopulation, even though this narrow applicability of the estimates reduces
the value of the standard RDD as a practical method for policy analysis (Manski,
2013).

This paper explores an RDD variant that can improve on all three of these limita-
tions. It requires supplementing the conventional posttest-only RDD with a pretest
measure of the outcome variable. In what follows, we refer to the conventional RDD
as a “posttest RDD” because it only requires posttest information. We refer to the
pretest-supplemented design as a “pretest RDD,” noting that it makes use of both
pretest and posttest outcome data. The key idea is that the pretest data provides
information about what the regression function linking outcomes and assignment
scores looked like in the absence of the treatment in an earlier time period. If the
functions are stable over time, then the pretest data can inform the analysis of the
posttest data. Minor differences between the pretest and posttest functional forms
in the untreated part of the assignment variable, such as intercept differences, are
easily accommodated. But functional forms that are observed to be very dissimilar
over time in the untreated part of the assignment variable would cast doubt on the
results of a pretest-supplemented RDD.

The core of this paper is a within-study comparison that evaluates the performance
of the pretest and posttest RDDs relative to each other and to a benchmark RCT.
LaLonde (1986) and Fraker and Maynard (1987) were the first to use this method
to examine whether econometric adjustments for selection bias in an observational
study could reproduce the results of job-training RCTs. Since then, researchers have
used the method to study the performance of RDD (Green et al., 2009; Shadish et al.,
2011), intact group and individual case matching (Bifulco, 2012; Cook, Shadish, &
Wong, 2008; Wilde & Hollister, 2007), and alternative strategies for covariate se-
lection (Cook & Steiner, 2010). The implementation details of within-study com-
parisons vary, but the basic idea is always to test the validity of a nonexperimental
method by comparing its estimates to a trustworthy benchmark from an RCT. Meth-
ods for conducting a high-quality within-study comparison have evolved over time,
and Cook, Shadish, and Wong (2008) describe the current best practices that we
follow in this paper.

Our within-study comparison is based on data from the Cash and Counseling
Demonstration Experiment (Dale & Brown, 2007). In the original study, disabled
Medicaid beneficiaries in Arkansas, Florida, and New Jersey were randomly as-
signed to obtain home- and community-based health services through Medicaid
(the control group), or to receive a spending account that they could use to procure
home- and community-based services directly (the treatment group). The origi-
nal study examined the effects of the program on a variety of health, social, and
economic outcomes. But for the purposes of our within-study comparison, the
outcome variable we focus on is a measure of individual Medicaid expenditures
in the 12 months after the study began.

To construct pretest and posttest RDDs from the RCT, we used baseline age as the
assignment variable and sorted the RCT treatment-group and control-group cases
by baseline age. Then, we defined a cutoff age for treatment assignment, selecting
three of them for replication purposes—ages 35, 50, and 70. Next, we systematically
deleted control cases from above the cutoff and treatment cases below the cutoff.
Since we had data from Florida, New Jersey, and Arkansas, a total of nine posttest
and nine pretest RDDs resulted—three age cutoffs crossed with three states. At each
age cutoff, we compared the pretest and posttest RDD estimates to each other and to
the corresponding RCT estimate. In the pretest RDD, we also used the comparison
data to compute an estimate of the average treatment effect for everyone older than
the cutoff, which is the average treatment effect on the treated (ATT) parameter
that is often of interest in program evaluation research and is usually out of reach in
RDD studies. We compared these extrapolated estimates to the corresponding RCT
benchmarks.
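The construction just described (deleting control cases above the cutoff and treatment cases below it) can be sketched as follows; the toy records and function name are our own, not the study's.

```python
# Toy RCT records. In the constructed RDD, treatment goes to those at or
# above the age cutoff, so we keep treated cases >= cutoff and controls < cutoff.
records = [
    {"age": 30, "treated": 1}, {"age": 40, "treated": 0},
    {"age": 45, "treated": 1}, {"age": 55, "treated": 0},
    {"age": 60, "treated": 1}, {"age": 72, "treated": 0},
    {"age": 80, "treated": 1}, {"age": 33, "treated": 0},
    {"age": 51, "treated": 1}, {"age": 68, "treated": 0},
]

def make_posttest_rdd(records, cutoff):
    """Carve a sharp RDD out of RCT data: delete control cases at or above
    the cutoff and treatment cases below it."""
    return [r for r in records
            if (r["treated"] == 0 and r["age"] < cutoff)
            or (r["treated"] == 1 and r["age"] >= cutoff)]

rdd_50 = make_posttest_rdd(records, cutoff=50)
```

Applied to the ten toy records with a cutoff of 50, the rule keeps the two controls under 50 and the three treated cases aged 50 or older.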

The results of our analysis indicate that the pretest RDD can shore up all three
key weaknesses of the posttest RDD. First, our comparisons show that the pretest
and posttest functional forms are similar below the cutoff, thus providing some
support for the proposition that the pretest data could be informative about the
counterfactual untreated regression function in the posttest period. Second, we
found that adding the pretest led to more statistically precise estimates than the
conventional posttest RDD, although the estimates are still not quite as precise as in
the RCT. And finally, the pretest RDD produced unbiased treatment effects relative
to the RCT, not only at the cutoff, but also beyond the cutoff. In the within-study
comparison considered in this paper, the multidimensional superiority of the pretest
RDD over the posttest RDD is clear.

THE RCT DATA

The Cash and Counseling Demonstration and Evaluation is described in detail else-
where (Brown & Dale, 2007; Dale & Brown, 2007a; Doty, Mahoney, & Simon-
Rusinowitz, 2007; Carlson et al., 2007). Study participants were disabled elderly
and nonelderly adult Medicaid beneficiaries who agreed to participate and lived in
Arkansas, New Jersey, or Florida from 1999 to 2003. The study employed a rolling
enrollment design in which new enrollees completed a baseline survey and then were
randomly assigned to treatment or control status, after which the state agency was
informed of the assignments. The treatment condition was a “consumer-directed


Table 1. Descriptive statistics for the variables and samples used to form within-study
comparisons.

                                  Arkansas             Florida             New Jersey
Variable                       Control  Treatment   Control  Treatment  Control  Treatment
Pretest Medicaid expenditures   $6,358     $6,439   $14,300    $14,377  $18,779    $18,215
Posttest Medicaid expenditures  $7,583     $9,443   $18,088    $19,944  $20,100    $21,299
Mean age                            70         70        55         55       62         63
N                                1,004      1,004       906        907      869        861

budget” program. It allowed disabled Medicaid beneficiaries to procure their own
home- and community-based support services and providers using a Medicaid-
financed spending allowance. The control group received home- and community-
based support services procured by a local Medicaid agency from Medicaid certified
providers, which is the status quo policy. In both groups, Medicaid pays for the ser-
vices. The key difference is whether the Medicaid enrollee or the Medicaid agency
makes the micro-level spending decisions. In the new program, the personal al-
lowance was set to the amount the agency would have allocated in the absence of
the new program because the intervention was meant to be revenue neutral. So,
the study outcome we analyze—how much was actually spent for services—tests
whether individuals or Medicaid officials spent more of the same allocated total.

Our methodological study used a small subset of the measures collected in the
original study. For each member of the study, we retained information on age at
baseline, state of residence, and randomly assigned treatment status. We created a
measure of annual Medicaid expenditures by adding up six categories of monthly
expenditures across the 12 months before random assignment (pretest) and after
random assignment (posttest). The six expenditure categories were Inpatient Ex-
penditures, Diagnosis-Related Group Expenditures, Skilled Nursing Expenditures,
Personal Assistance Services Expenditures, Home Health Services Expenditures,
and Other Services Expenditures.2 Throughout, we refer to this six-item index as
“Medicaid expenditures,” and it is the sole outcome in our study.

The summary statistics in Table 1 show that in the RCT, Arkansas had 1,004 par-
ticipants in each of the treatment and control arms, Florida had 906 control and 907
treatment participants, and New Jersey had 869 control and 861 treatment-group
members. In Arkansas, the average participant was 70 years old, compared to 55
in Florida, and 62 in New Jersey. Within each state, average pretest expenditures
were similar in the treatment and control groups, but the level of spending varied by
state. The average person in Arkansas had pretest expenditures of $6,400 compared
to $14,300 in Florida and $18,500 in New Jersey. Mean posttest expenditures were
consistently higher in the treatment groups. Simple intent-to-treat (ITT) compar-
isons imply that the intervention increased average expenditures by about $1,860

2 The claims data included a small number of cases with very high levels of expenditures that could
be either real or data entry errors. To reduce concerns that these outliers would skew our regression
estimates, we top coded the pretest and posttest Medicaid expenditures variable at the 99th percentile of
the pooled distribution of posttest expenditures, which was equal to $78,273. The top coding procedure
affected 89 posttest observations and 79 pretest observations.
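The top coding described in footnote 2 amounts to capping each series at a percentile of the pooled distribution; a minimal sketch (our own helper, not the authors' code):

```python
def top_code(values, pct=0.99):
    """Cap values at the given percentile of their distribution, limiting
    the influence of extreme claims on regression estimates."""
    s = sorted(values)
    cap = s[int(pct * (len(s) - 1))]
    return [min(v, cap) for v in values], cap

# Example: the values 1..100 get capped at the 99th-percentile value.
capped, cap = top_code(list(range(1, 101)))
```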


Table 2. Sample sizes in the nine constructed posttest regression discontinuity designs.

State Age cutoff Below the cutoff Above the cutoff Total

Arkansas 35 59 944 1003
Florida 35 296 609 905
New Jersey 35 106 770 876

Arkansas 50 143 868 1011
Florida 50 417 496 913
New Jersey 50 224 650 874

Arkansas 70 361 623 984
Florida 70 555 359 914
New Jersey 70 491 387 878

(P < 0.01) in Arkansas, $1,856 (P = 0.01) in Florida, and $1,200 (P = 0.09) in New
Jersey. Thus, the Cash and Counseling treatment increased Medicaid expenditures
relative to when Medicaid officials controlled the expenditures.
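An ITT comparison of this kind is simply a difference in posttest means between the randomized arms. A sketch on simulated spending data (the magnitudes loosely echo the Arkansas figures; nothing here is the study's actual data):

```python
import math
import random

random.seed(1)
# Simulated posttest spending for 1,000 controls and 1,000 treated cases.
control = [random.gauss(7500, 3000) for _ in range(1000)]
treated = [random.gauss(9400, 3000) for _ in range(1000)]

def itt(treated, control):
    """Intent-to-treat effect: difference in means across randomized arms,
    with a large-sample z statistic from the unpooled standard error."""
    mt = sum(treated) / len(treated)
    mc = sum(control) / len(control)
    vt = sum((y - mt) ** 2 for y in treated) / (len(treated) - 1)
    vc = sum((y - mc) ** 2 for y in control) / (len(control) - 1)
    se = math.sqrt(vt / len(treated) + vc / len(control))
    return mt - mc, (mt - mc) / se

diff, z = itt(treated, control)
```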

WITHIN-STUDY RESEARCH DESIGN

To implement the within-study comparison, we created 21 different subsets of the
original RCT data. The first three are the state-specific RCT treatment and control
groups, for which sample sizes and basic descriptive statistics are in Table 1. The
next nine subsets represent state and age specific posttest RDDs based on three
states and three age cutoffs (35, 50, and 70). To create the posttest RDD samples,
we removed from the RCT data all treatment group members younger than the rel-
evant age cutoff and all control group members at least as old as the cutoff. Table 2
shows the sample sizes for the nine posttest RDD subsets. The number of obser-
vations below the cutoff increases with cutoff age. With the cutoff set at 35, there
are many more observations above the cutoff than below; at age 50, observations
are more balanced; and at age 70 balance is best overall. The different age cut-
offs also determine how much extrapolation is required to compute average effects
for everyone above the cutoff. For example, estimating the average effect among
everyone older than 35 requires an extrapolation from 36 to 90. In contrast, esti-
mating the average effect for people over 70 only requires an extrapolation from 71
to 90.

Next, we used Medicaid expenditures from the pre-randomization year to create
nine pretest RDD data subsets based on the same cutoff values and states. With
the pretest and posttest RDD subsets in hand, we created a long-form data set by
stacking the pretest and posttest RDD data, and defined an indicator variable to
identify which observations were from each time period. These stacked data sets
form the pretest RDD. They combine data from the pretest period when no one
of any age had received the treatment, with data from the posttest period when
treatment was available above a specified age cutoff. Stacking the data in this way
results in twice as many observations in the pretest RDD compared to the posttest
RDD because each participant is observed twice.
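The stacking step can be sketched directly; the record layout is our own illustration:

```python
def stack_periods(pretest, posttest):
    """Build the long-form pretest RDD data set: tag each record with a
    period indicator (pre=1 for pretest, pre=0 for posttest) and stack."""
    long_form = [dict(r, pre=1) for r in pretest]
    long_form += [dict(r, pre=0) for r in posttest]
    return long_form

# Each participant appears twice, once per period.
stacked = stack_periods(
    [{"id": 1, "spend": 6300}, {"id": 2, "spend": 7100}],
    [{"id": 1, "spend": 7600}, {"id": 2, "spend": 9500}],
)
```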

These procedures resulted in an RCT, a posttest RDD, and a pretest RDD, each
replicated across three states and three age cutoffs. The basic goal of our analysis is
to construct estimates of the same causal parameters using each of these research
designs. Interpreting the RCT estimates as internally valid allows us to measure the
performance of the RDD estimates relative to each other and to the best estimate of
the true effect.


METHODS

Implementing the within-study comparison requires (1) defining treatment effects
of interest, (2) specifying estimators for each effect in each design, and (3) developing
measures of performance by which to judge the strengths and weaknesses of each
design.

Parameters of Interest

Throughout the paper, we use i to index individuals and t ∈ {0, 1} to denote the
pretest and posttest periods. Ai is a person’s (time invariant) age at baseline, and
Preit = 1(t = 0) is a dummy variable that identifies observations made during the
pretest time period. We adopt a potential outcomes framework in which Y(1)it
denotes the ith person’s treated outcome at time t, and Y(0)it denotes the person’s
untreated outcome at time t. The outcome variable in all of our analysis refers to
the person’s Medicaid expenditures over the 12 months prior to period t. Dit is an
indicator set to 1 if the person has received the treatment at time t. In the Cash
and Counseling data, a person is treated if she has the option to control her own
Medicaid-financed home care budget. Since no one received the treatment in the
pretest time period, Di0 = 0 for everyone in the sample. A person’s realized outcome
is Yit = Y(1)it Dit + Y(0)it (1 − Dit ).

To estimate treatment effects at the conventional RDD cutoff and also beyond
it, we define treatment effects conditional on specific ages and age ranges. In
our notation, the average treatment effect in the posttreatment time period for
people who are, say, 70 years old is written as Δ(70) = E[Y(1)it | Preit = 0, Ai =
70] − E[Y(0)it | Preit = 0, Ai = 70]. If the cutoff value in an RDD was set at age 50,
then Δ(50) = Δ(RDD) is the average treatment effect in the cutoff subpopulation
for that particular RDD.

In a conventional RDD, inference is limited to the average treatment effect at the
cutoff. Since part of our analysis is concerned with extrapolating beyond the cutoff,
it is also useful to describe average treatment effects in broader subpopulations.
One way to do this is to express average effects across a range of ages as relative
frequency weighted averages of age-specific treatment effects. For example, if the
cutoff value in an RDD was set at age 50, then the average treatment effect among
everyone aged 50 or older is

Δ(m ≥ 50) = Σ_{m=50}^{M} Δ(m) × [ Pr(Ai = m | Preit = 0) / Pr(Ai ≥ 50 | Preit = 0) ],

where M is the maximum age in the sample.
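Computed from age-specific effects and an age frequency table, this weighted average looks like the following sketch (the numbers are made up for illustration):

```python
def effect_above_cutoff(effects, age_counts, cutoff):
    """Frequency-weighted average of age-specific treatment effects for
    everyone at or above the cutoff."""
    total = sum(n for a, n in age_counts.items() if a >= cutoff)
    return sum(effects[a] * n / total
               for a, n in age_counts.items() if a >= cutoff)

# Hypothetical age-specific effects and posttest age counts:
effects = {40: 0.0, 50: 2.0, 60: 4.0}
age_counts = {40: 10, 50: 30, 60: 10}
delta_50_plus = effect_above_cutoff(effects, age_counts, cutoff=50)
# (2.0 * 30 + 4.0 * 10) / 40 = 2.5
```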

In a sharp RDD with a cutoff set at c, the parameter Δ(m ≥ c) represents the
average treatment effect above the cutoff, which might also be called the ATT:
Δ(m ≥ c) = Δ(ATT). Estimating the ATT parameter requires extrapolation away
from the cutoff, so the ATT parameter is not immediately identified in a standard
RDD. The pretest RDD that we propose provides one mechanism for making credible
extrapolations beyond the cutoff.

Estimation

To estimate the quantities of interest, we used regression methods that account for
unknown functional forms either with kernel weighting or a polynomial series in the
age variable—the two most common methods used in the modern RDD literature.
The use of these flexible models meant that we could not specify a single polynomial
model or a single bandwidth for all the designs and states in the analysis. Instead, we
specified a method of selecting polynomial specifications and bandwidth parameters
that was applied uniformly across the designs. In what follows, we describe the
general approach to estimation employed with the RCT, posttest RDD, and pretest
RDD. Then, we explain the model selection algorithm used to guide our choice of
smoothing parameters like bandwidths and polynomial series lengths. The details
regarding the bandwidths and polynomial specifications employed in the analysis
are reported in the Appendix.3

Estimation in the RCT

We estimated age-specific treatment effects using two methods. First, we estimated
local linear regressions of Medicaid expenditures on age separately for the treatment
and control groups. Then, we computed age-specific treatment effects as point-
wise differences in treatment and control regression functions for each age. To
calculate average treatment effects above the cutoff, we weighted these age-specific
differences according to the relative frequency distribution of ages among all of the
treatment and control observations from each state. We computed the frequency
weights separately for each state to account for differences in the age distribution
of each state’s study population.

Since many applied researchers prefer to work with flexible polynomial specifica-
tions rather than kernel-based regressions, we also estimated ordinary least squares
(OLS) regressions of Medicaid expenditures on a polynomial series in age, a treat-
ment group indicator, and interactions between the polynomial series and age for
each state. Treatment effect estimates were computed using the coefficients on the
treatment indicator and the appropriate interaction terms. Average treatment ef-
fects above the cutoff were taken as weighted averages of age-specific differences
with weights equal to the relative frequency of each age in the state sample.
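The polynomial-series approach for the RCT can be sketched with NumPy. The simulated data, series order, and variable names are ours; the point is only the mechanics of regressing on an age polynomial, a treatment dummy, and their interactions, then frequency-weighting the age-specific effects.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000
age = rng.integers(20, 90, n).astype(float)
d = rng.integers(0, 2, n).astype(float)          # randomized treatment
true_effect = 1000.0 + 20.0 * age                # effect that grows with age
y = 5000.0 + 30.0 * age + d * true_effect + rng.normal(0, 500, n)

K = 2                                            # quadratic age series
cols = [np.ones(n)]
cols += [age ** k for k in range(1, K + 1)]      # age polynomial
cols += [d]                                      # treatment dummy
cols += [d * age ** k for k in range(1, K + 1)]  # interactions
X = np.column_stack(cols)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

def effect_at(a):
    """Age-specific treatment effect implied by the fitted coefficients."""
    e = beta[K + 1]                              # main treatment coefficient
    for k in range(1, K + 1):
        e += beta[K + 1 + k] * a ** k            # interaction terms
    return e

# Frequency-weighted average effect across the sample's age distribution.
ages, counts = np.unique(age, return_counts=True)
ate = sum(effect_at(a) * c for a, c in zip(ages, counts)) / n
```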

Estimation in the Posttest RDD

We estimated treatment effects in the posttest RDDs using both kernel and poly-
nomial series regression methods. To implement the kernel regression approach,
we estimated treatment effects at the cutoff using local linear regressions applied
separately to the data from above and below the cutoff in each state. Treatment
effects at the cutoff were calculated using the difference in the estimates of mean
Medicaid expenditures at the cutoff.

To implement the polynomial series methods, we pooled data from above and
below the cutoff and estimated OLS regressions of Medicaid expenditures on a
polynomial in age, a dummy variable set to 1 for observations above the cutoff, and
interactions between the age polynomial series and the cutoff dummy variable. In
these posttest RDD analyses, we computed treatment effects only at the cutoff. We
did not make extrapolations based on the functional form implied by the polynomial
regression coefficients because of the well-known tendency of polynomial series
estimates to have very poor out-of-sample properties.

Estimation in the Pretest RDD

The pretest RDD combines pretest and posttest RDD data, and for our purposes
the key idea is that information about the relationship between the assignment

3 All appendices are available at the end of this article as it appears in JPAM online. Go to the pub-
lisher’s Web site and use the search engine to locate the article at http://www3.interscience.wiley.com/
cgi-bin/jhome/34787.


variable and the outcomes during the pretest period may provide a sound basis for
extrapolation beyond the assignment cutoff in the posttest time period. To put the
idea into practice, we specify a flexible model of the untreated outcome variable in
the pretest and posttest periods that accounts for simple nonequivalencies between
the two periods. In particular, we consider models in which pretest and posttest
untreated outcome regression functions differ by a constant across all ages:

Y(0)it = Preit θP + g(Ai) + νit.
In this model, θP represents the fixed difference in conditional mean outcomes
across the pretest and posttest periods, and g(.) is an unknown smooth function that
is assumed to be constant across the two periods. We assume that E[νit | Preit, Ai] =
0. In essence, our model assumes that the difference between the mean untreated
potential outcome in the pretest and posttest time periods does not vary across
subpopulations defined by the assignment variable. The assumption that there is
an assignment variable invariant time period effect is important. It implies that,
after adjusting for the constant period effect, the underlying regression relationship
between the outcomes and the assignment variable function can be recovered across
the entire range of the assignment variable in the pretest time period, and then
applied to the posttest period. This implication is what makes extrapolation possible.

Similar fixed effect restrictions are widely used in the analysis of longitudinal data
(Wooldridge, 2011), though with standard panel data models the assumptions are
somewhat stronger than in RDD because such models usually pair a fixed effects
assumption with a specific functional form assumption for a vector of time-varying
covariates. The point here is that the pretest RDD model is agnostic with respect
to the functional form associated with the assignment variable, but it does impose
the restriction that the shape of the function does not change across the two time
periods except for a change in level that is attributable to the time period effect.
Clearly, the accuracy of extrapolations away from the cutoff depends on the validity
of the assumption that the time period effect is age invariant. In the next section, we
present evidence that this particular assumption is credible in the Cash and Coun-
seling data, so our within-study comparisons represent a test of the performance of
the pretest RDD method in a situation where the core assumptions appear plausi-
ble. Readers should note, of course, that applying our methods in situations where
the constant period effect assumption is implausible would likely lead to very poor
performance.

With the basic pretest RDD model of the untreated outcomes defined, we turn
to methods for estimating treatment effects using the pretest RDD. The first task
is to estimate the untreated outcome regression function. As usual, one approach
involves approximating the unknown smooth function, g( Ai ), using a polynomial
series. For instance, one might specify a Kth order polynomial series and estimate
model parameters using an OLS regression such as

Y(0)it = PreitθP + Σ(k=0 to K) δk Ai^k + νit.

The equation can be estimated by applying OLS to all of the untreated cases in
the sample. The key point is that the untreated sample includes pretest Medicaid
expenditures from the full range of ages and also the posttest Medicaid expenditures
of people under the design’s age cutoff. In this setting, ĝ(a) = Σ(k=0 to K) δ̂k a^k represents an
estimate of E[Y(0)it | Preit = 0, Ai = a]. The extrapolations beyond the cutoff are now
made with what might be called partial empirical support. Rather than extrapolate
outside the range of the data, extrapolations are made to the posttest outcomes on

Journal of Policy Analysis and Management DOI: 10.1002/pam
Published on behalf of the Association for Public Policy Analysis and Management

862 / Methods for Policy Analysis

the support of pretest data under the assumption that estimates of θP are sufficient
to account for any between period nonequivalence.
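To make the estimation step concrete, the sketch below fits the pooled untreated-sample regression on simulated data. Everything here (the data-generating process, parameter values, cutoff, and variable names) is our own illustrative assumption, not the authors' code or the Cash and Counseling data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the setting: pretest observations at all ages plus
# posttest observations below the cutoff. All numbers are hypothetical.
n, cutoff, K = 2000, 50, 3
age = rng.uniform(20, 90, n)
pre = rng.integers(0, 2, n)                      # 1 = pretest observation
untreated = (pre == 1) | (age < cutoff)          # cases with an observed Y(0)
theta_p = -500.0                                 # constant period effect

def g(a):                                        # smooth age profile g(.)
    return 8000 + 40 * a + 0.5 * (a - 55) ** 2

y0 = theta_p * pre + g(age) + rng.normal(0, 500, n)

# OLS on all untreated cases: pretest indicator plus a Kth-order polynomial in age
A, Y = age[untreated], y0[untreated]
X = np.column_stack([pre[untreated]] + [A ** k for k in range(K + 1)])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
theta_hat = beta[0]

def ghat(a):
    """Estimate of E[Y(0) | Pre = 0, A = a]; usable beyond the cutoff under
    the constant-period-effect assumption."""
    return sum(beta[1 + k] * a ** k for k in range(K + 1))

print(round(theta_hat), round(ghat(70.0)))
```

Because the pretest observations span all ages, evaluating ghat above the cutoff (here at age 70 with a cutoff of 50) is an extrapolation that still rests on pretest data at that age, which is the "partial empirical support" the text describes.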

This method provides estimates of the untreated outcome function, but to form
treatment effect estimates, we also require estimates of the treated outcome func-
tion. An obvious strategy is to estimate polynomial regressions of expenditures on
age using posttest data from those sample members who are above the age cutoff.
Then, treatment effects can be computed at the cutoff using differences between
the fitted value of the treated and untreated regression functions. Average treatment
effects among all observations above the cutoff can be formed by computing age-
specific treatment effects for each age above the cutoff and then forming a weighted
average of these differences based on the relative frequency of the ages above the
cutoff.
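The weighting step can be sketched directly; the fitted values and relative frequencies below are made-up numbers for illustration only:

```python
import numpy as np

# Hypothetical fitted values at each age above the cutoff, and the relative
# frequency of each age among above-cutoff observations (weights sum to one).
ages = np.array([51, 52, 53, 54])
treated_fit = np.array([21_000., 21_400., 21_900., 22_500.])
untreated_fit = np.array([19_800., 20_100., 20_600., 21_400.])
weights = np.array([0.4, 0.3, 0.2, 0.1])

age_effects = treated_fit - untreated_fit      # age-specific treatment effects
ate_above = float(weights @ age_effects)       # frequency-weighted average
print(age_effects, ate_above)
```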

A second way to implement the pretest RDD model is to estimate a version of
Robinson’s (1988) partial linear regression model. The model exploits the same as-
sumptions that g(.) is a smooth function and that E[νit | Preit, Ai ] = 0, but it also
requires a support condition so that the pretest indicator in the parametric com-
ponent of the model is not a deterministic function of the assignment variable.
Formally, the requirement is that Var(Preit | Ai) > 0. The support condition fails by
definition in the full sample because there are no untreated RDD observations above
the cutoff determining treatment. Our solution is to estimate the parametric period
effect using only observations that fall on the common support of the different time
periods. In practice, this means that we estimate the period effect using only the
below-cutoff data from the two time periods. Then, with estimates of θP in hand,
we estimate the nonparametric component using the full sample of observations
both above and below the cutoff value.

We calculated treatment effects at the cutoff using differences in the predicted val-
ues from local linear regressions among treated observations from the posttest time
period and predicted values from the partially linear model. We then constructed
average treatment effects above the cutoff by taking age-specific differences be-
tween the predicted values from the two models and weighting them by the relative
frequency of each age in each state sample.
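A stripped-down sketch of this two-step logic follows, using a hand-rolled Gaussian-kernel local linear smoother and simulated data. The smoother, bandwidth, seed, and data-generating process are our own stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def llr(x_tr, y_tr, x_eval, h=5.0):
    """Local linear regression predictions with a Gaussian kernel."""
    preds = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        w = np.exp(-0.5 * ((x_tr - x0) / h) ** 2)
        X = np.column_stack([np.ones_like(x_tr), x_tr - x0])
        XtW = X.T * w
        preds[j] = np.linalg.solve(XtW @ X, XtW @ y_tr)[0]
    return preds

# Hypothetical untreated data: pretest observations at all ages, posttest
# observations only below the cutoff.
n, cutoff = 2000, 50
age = rng.uniform(20, 90, n)
pre = rng.integers(0, 2, n).astype(float)
untreated = (pre == 1) | (age < cutoff)
y0 = -500 * pre + 8000 + 40 * age + 0.5 * (age - 55) ** 2 \
     + rng.normal(0, 500, n)

# Step 1: estimate the period effect on the common support (below the cutoff)
# by partialling age out of both the outcome and the pretest indicator.
b = untreated & (age < cutoff)
y_res = y0[b] - llr(age[b], y0[b], age[b])
p_res = pre[b] - llr(age[b], pre[b], age[b])
theta_hat = float(p_res @ y_res / (p_res @ p_res))

# Step 2: fit g(.) nonparametrically on the full untreated sample after
# removing the estimated period effect, enabling extrapolation past the cutoff.
adj = y0[untreated] - theta_hat * pre[untreated]
g70 = llr(age[untreated], adj, np.array([70.0]))[0]
print(round(theta_hat), round(g70))
```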

The Validity of the Pretest RDD Assumption

A key issue for the analysis of the pretest RDD is the empirical validity of the assump-
tion that the period effect is age-invariant, or that—equivalently—the cross-sectional
age-expenditure profile in our sample would not have changed over a one-year time
horizon in the absence of a treatment effect. We conducted two simple tests of this
assumption. First, we computed the pre–post change in Medicaid expenditures for
each person in the control group from each state. Figure 1 shows scatter plots of
these person-specific time-period effects against age for the three control groups.
Because the plots are based only on the control group, the change scores represent
pure time effects that have nothing to do with the treatment. The scatter plots reveal
no evidence of an age-biased pattern of time-period effects in any state. To test the
time-invariant age-effect assumption more formally and using a modeling frame-
work that is closer to the one we use in our analysis, we regressed control group
Medicaid expenditures on a cubic function of age, a postperiod indicator variable,
and interactions between the age terms and the postindicator separately for each
state. The estimated coefficients are reported in Table 3. The coefficients on the
interaction terms are not statistically different from zero in any of the states, implying
that the functional relationship between age and Medicaid expenditures did not change
between the two periods under analysis. Figure 1 and Table 3 suggest that, in our
data, the central assumption in the pretest RDD is a reasonable one. Although these


[Figure 1 appears here: three scatter-plot panels (Arkansas, Florida, New Jersey)
plotting the change in Medicaid expenditures (−50,000 to 100,000) against baseline
age (20 to 100).]

Note: The graph plots within-person expenditure changes against age in the three experimental control
groups. The graphs are consistent with the assumption of a small period effect that affected each age
group in the same way.

Figure 1. One-Year Changes in Expenditures by Age in the Control Groups.

empirical results are encouraging for our work, we think researchers are best ad-
vised to evaluate the credibility of the pretest RDD assumptions on conceptual and
theoretical grounds rather than a program of statistical testing. In the application
at hand, we think the assumption is plausible because it is simply unlikely that the
cross-sectional relationship between age and medical expenditures changes much
over a one-year time horizon.
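A minimal version of this regression-based check can be run on simulated control-group data in which the null (an age-invariant period effect) holds by construction. All names and values here are hypothetical, and a real analysis would use a statistics package with proper inference:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated control group: cubic age profile plus a period effect that, by
# construction, does not vary with age (the null hypothesis of the test).
n = 1500
age = rng.uniform(20, 90, n)
post = rng.integers(0, 2, n).astype(float)
y = 9000 + 150 * age - 1.2 * age ** 2 + 0.01 * age ** 3 \
    + 1200 * post + rng.normal(0, 4000, n)

# Standardize age so the cubic design matrix is well conditioned
a = (age - age.mean()) / age.std()
X = np.column_stack([np.ones(n), a, a ** 2, a ** 3,
                     post, post * a, post * a ** 2, post * a ** 3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical OLS standard errors and t-statistics for the interaction terms
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)
t_inter = beta[5:] / np.sqrt(np.diag(cov)[5:])
print(np.round(t_inter, 2))  # small t-stats: no detectable age-varying period effect
```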

Although statistical tests and graphical evidence are always welcome, it is impor-
tant to note that in real-world applications researchers will only be able to conduct
such an analysis using data from the region of the assignment variable that falls
below the cutoff. For instance, if the cutoff was set at age 50, then a researcher
would only be able to inspect the change scores for people under age 50. If the
assumption that the period effect did not vary with age seemed reasonable below
the cutoff, the researcher would still be forced to accept the additional assumption
that the invariance assumption continued to hold above the cutoff.

Procedures for Choosing Smoothing Parameters

Each of the methods described above depends on assumptions about the degree of
smoothing to allow across the different age groups. In the local linear regressions,
smoothing is controlled by a kernel function and a bandwidth parameter. In the
partially linear model, a separate bandwidth is required for the two preliminary
regressions and for the ultimate residualized regression model. And in the polyno-
mial series regressions, the amount of smoothing is determined by the degree of the
polynomial function. The point of these flexible modeling approaches is to allow the


Table 3. A regression-based test of whether the age–expenditure relationship in the control
group differed in the pretest and posttest samples.

Variable Statistic Arkansas New Jersey Florida

Age Coefficient 175.54 1,297.3 147.22
SE 592.36 722.52 491.32
P 0.77 0.07 0.76

Age squared Coefficient −5.86 −23.06 −4.56
SE 9.77 13.14 9.13
P 0.55 0.08 0.62

Age cubed Coefficient 0.04 0.12 0.03
SE 0.05 0.07 0.05
P 0.47 0.11 0.6

Post Coefficient 1,300.71 −8,718.18 11,410.03
SE 6,336.86 9,516.4 5,005.54
P 0.84 0.36 0.02

Age × post Coefficient −106.59 417.28 −373.14
SE 353.41 580.57 331.67
P 0.76 0.47 0.26

Age squared × post Coefficient 3.46 −6 5.38
SE 6.05 10.63 6.39
P 0.57 0.57 0.4

Age cubed × post Coefficient −0.03 0.03 −0.03
SE 0.03 0.06 0.04
P 0.42 0.61 0.5

Intercept Coefficient 9,849.19 −700.2 15,545.24
SE 11,173.13 12,119.55 7,775.39
P 0.38 0.95 0.05

N 2,008 1,738 1,812
R2 0.074 0.018 0.052

data to determine the functional form specification; however, some arbitrariness is
inevitably associated with selecting these smoothing parameters, and so a model
selection protocol needs to be specified in advance of data analysis.

To this end, we selected bandwidth parameters by using least-squares cross-
validation to evaluate a grid of candidate bandwidths ranging from 1 to 90 years
in width. We then inspected the function produced by using the bandwidth that
minimized the cross-validation statistic.4 When visual inspection revealed that the
bandwidth chosen by cross-validation led to a function that appeared jagged and
undersmoothed, we increased the bandwidth to produce a more regular function.
Details about the selected bandwidth for each of the research designs are in the
Appendix.5
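The cross-validation step can be sketched as follows, with a simple leave-one-out loop over a bandwidth grid; the smoother, data, and grid values are our own illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

def ll_predict(x_tr, y_tr, x0, h):
    """Local linear prediction at x0 (Gaussian kernel, bandwidth h)."""
    w = np.exp(-0.5 * ((x_tr - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x_tr), x_tr - x0])
    XtW = X.T * w
    # tiny ridge term guards against a near-singular weighted design
    return np.linalg.solve(XtW @ X + 1e-8 * np.eye(2), XtW @ y_tr)[0]

# Hypothetical expenditure data with a smooth age profile
n = 300
x = rng.uniform(20, 90, n)
y = 10_000 + 30 * x + 0.8 * (x - 55) ** 2 + rng.normal(0, 800, n)

# Least-squares cross-validation: for each candidate bandwidth, predict each
# observation from a fit that leaves it out, then score the squared errors.
grid = [1, 2, 5, 10, 20, 40, 90]
cv = []
for h in grid:
    errs = [(y[i] - ll_predict(np.delete(x, i), np.delete(y, i), x[i], h)) ** 2
            for i in range(n)]
    cv.append(float(np.mean(errs)))
h_star = grid[int(np.argmin(cv))]
print(h_star)  # bandwidth minimizing out-of-sample squared error
```

This is the mean squared out-of-sample prediction error described in footnote 4; the paper then pairs the data-driven minimum with a visual check for undersmoothing.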

To choose a polynomial functional form, we used least-squares cross-validation
to evaluate a set of candidate models that included linear, quadratic, cubic, and
quartic polynomial functions, and also models that fully interacted the polynomial
terms with a treatment-group indicator variable. In the within-study comparisons,

4 The cross-validation statistic we worked with is the mean squared out of sample prediction error
formed by predicting the value of each observation when it is left out of the estimation.
5 All appendices are available at the end of this article as it appears in JPAM online. Go to the pub-
lisher’s Web site and use the search engine to locate the article at http://www3.interscience.wiley.com/
cgi-bin/jhome/34787.


we always worked with the polynomial models that minimized the cross-validation
function. That is, we conducted the cross-validation for each of the candidate spec-
ifications and chose the specification that produced the smallest mean square error
of out-of-sample predictions. The specific functional forms used in each part of the
within-study comparison are reported in the Appendix.6

The need to choose smoothing parameters is one area of RDD that seems always
to leave room for investigator manipulation. We worked hard to define a selection
procedure that was separate from the within-study comparison component. Impor-
tantly, we did not assess the performance of the alternative estimators until after
settling on the bandwidths and polynomial series lengths. It is also worth noting
that the bandwidth dependent methods (which supplemented the data-driven cross-
validation procedure with more subjective assessments of undersmoothing) led to
treatment effect estimates that were very similar to the effects estimated by the poly-
nomial series methods (which relied exclusively on the data-driven cross-validation
procedure). The consistency between the two sets of results gives us some confi-
dence that the bandwidth selection procedure was not an important determinant of
our results. We also conducted a small sensitivity analysis to better understand the
extent to which our results were dependent on the chosen bandwidths. We reesti-
mated the RCT benchmark estimates that were used throughout the analysis using
bandwidths that were half the size of our preferred bandwidth in each state. Across
the 18 RCT benchmark parameters of interest, the average difference in point esti-
mates between the preferred bandwidth models and the half-sized bandwidths was
$11, and the average difference in standard errors was −$90. We pursued a similar
exercise with the pretest RDD by reducing the bandwidth used in the residualized
regression stage of the partially linear model to half of our preferred size. Here, we
found that average difference in point estimates between the results produced using
our preferred bandwidth and the half-sized bandwidths was −$72 and the average
difference in estimated standard errors was $21. This small sensitivity analysis sug-
gests that our main results are unlikely to depend heavily on bandwidth parameters
within a reasonable range of what we ultimately considered optimal.

Estimating Standard Errors

To ensure comparability across the different designs, we used a nonparametric boot-
strap to estimate standard errors for all treatment effect estimates. We always used
500 bootstrap replications. Point estimates were recalculated for each replicate, and
the standard deviation of the point estimates across the 500 replicates was used as
the bootstrap estimate of the standard error. Bandwidths, polynomial functional
forms, and relative frequency weights for computing above the cutoff averages were
fixed across bootstrap replicates. In the pretest RDD designs, we resampled individ-
ual participants rather than individual observations to account for within-person
dependencies in the error structure.
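The person-level resampling scheme might be sketched like this; the data and the trivial stand-in estimator are hypothetical, and the point is only the clustered resampling:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical person-level panel: one pretest and one posttest expenditure
# per person, with a shared person effect creating within-person dependence.
n_people = 400
person = rng.normal(0, 2000, n_people)
y_pre = 15_000 + person + rng.normal(0, 3000, n_people)
y_post = 16_000 + person + rng.normal(0, 3000, n_people)

def estimate(pre, post):
    """Stand-in for the full estimation pipeline (here: mean pre-post change)."""
    return float(np.mean(post - pre))

# Nonparametric bootstrap: resample PEOPLE, not person-period observations,
# so each replicate keeps both of a sampled person's observations together.
B = 500
reps = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n_people, n_people)
    reps[b] = estimate(y_pre[idx], y_post[idx])
se_boot = float(reps.std(ddof=1))
print(round(estimate(y_pre, y_post)), round(se_boot, 1))
```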

Measuring Performance

It is straightforward to compare the point estimates and standard errors from the
posttest RDD and pretest RDD to those of the RCTs, but to draw conclusions across
the different treatment effect parameters in each design, we also examined two stan-
dardized performance measures. The first is a measure of the standardized bias of

6 All appendices are available at the end of this article as it appears in JPAM online. Go to the pub-
lisher’s Web site and use the search engine to locate the article at http://www3.interscience.wiley.com/
cgi-bin/jhome/34787.


the quasi-experimental point estimate. Let πq be the point estimate of a given param-
eter produced by quasi-experimental estimator, q. And let πRCT be the point estimate
of the same parameter produced by the RCT. Finally, let σRCT be the standard devia-
tion of posttest Medicaid expenditures observed in the RCT. The standardized bias
measure that we worked with is SBq = (πq − πRCT)/σRCT. Essentially, SBq measures
the magnitude of the bias in a particular quasi-experimental estimate in standard
deviation units. We computed this measure for each parameter estimated with each
cutoff age, state, and research design. It provides a uniform account of the size of
the bias across different causal parameters and research designs, but it does not in-
corporate any information about the statistical precision of the different estimates.

The second performance measure combines bias and variance estimates in a
mean squared error framework. To compute the mean square error statistic, we
centered the point estimates from each bootstrap replicate around the experimen-
tal benchmark. Then, we squared these deviations and computed the average of
the squared deviations across the 500 bootstrap replicates. Formally, the statistic
we work with is MSE(πq) = (1/B) Σ(b=1 to B) (πq(b) − πRCT)², where b = 1 . . . B indexes the bootstrap
replicates. To keep the scale of the statistic interpretable in dollar terms, we
report the square root of this statistic, the root mean square error (RMSE), in the results. The
basic idea is that any particular estimate of the treatment effect will differ from the
RCT benchmark because of both bias and statistical precision. The RMSE statistics
measure the size of the typical error associated with a given estimation technique
by using the RCT to form a benchmark and the bootstrap replications to evaluate
variability. Estimators with smaller levels of error provide answers that are closer
to the truth on average than estimators with higher levels of error, and the RMSE
statistic formalizes this logic. As a performance metric, the RMSE statistic gives
equal weight to improvements in correspondence of the quasi-experimental esti-
mates with the RCT benchmarks that come from changes in both the bias and the
variance of the estimator. Of course, some researchers may value improvements in
correspondence that come from bias reduction differently from improvements that
come from variance reduction. In presenting the results, we take care to present
estimates of standardized bias, standard errors, and RMSE for each research design
so that readers can form their own conclusions about performance.
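The two performance measures reduce to a few lines of arithmetic; the inputs below are made-up numbers purely to show the computation:

```python
import numpy as np

# Made-up inputs: a quasi-experimental point estimate, its bootstrap
# replicates, the RCT benchmark, and the RCT outcome standard deviation.
pi_q = 2_000.0
pi_rct = 1_500.0
sigma_rct = 12_000.0
boot_reps = np.array([1_900., 2_400., 1_600., 2_100., 2_000., 1_700.])

# Standardized bias: deviation from the benchmark in outcome SD units
sb = (pi_q - pi_rct) / sigma_rct

# RMSE: bootstrap replicates centered on the RCT benchmark, capturing both
# bias and sampling variability on the dollar scale
rmse = float(np.sqrt(np.mean((boot_reps - pi_rct) ** 2)))
print(round(sb, 4), round(rmse))
```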

RESULTS

Causal Benchmarks from the RCT

Figures 2 to 4 plot estimates of average Medicaid expenditures by age in the treat-
ment and control groups from each state across all ages. Each graph includes es-
timates based on the polynomial series estimator and the local linear regressions,
and the two estimation approaches yield very similar results. It is also clear that the
expenditure-age profile varies across the states: The relationship is linear in Florida,
highly nonlinear in New Jersey, and somewhat nonlinear in Arkansas.

Age-specific treatment effects can be constructed by forming point-wise differ-
ences between the treatment and control group regression functions. Table 4 reports
RCT estimates of treatment effects for each of the three age cutoffs and also for the
total group above each age cutoff. We treat the RCT point estimates as unbiased
benchmarks, so the standardized bias metric is not reported in Table 4. Note that the
RMSE estimates allowed for the possibility that the average of the bootstrap point
estimates differed from the original point estimates due to finite sample bias. This
means that the RMSE statistics are not theoretically identical to the standard error
of the effect estimates. However, as might be expected, there was very little evidence
of finite sample bias in the estimates from the RCT data, so the RMSE statistics


[Figure 2 appears here: posttest Medicaid expenditures (5,000 to 25,000) plotted
against baseline age (20 to 100) for Arkansas, with four series: control LLR,
treated LLR, control polynomial, and treated polynomial.]

Note: The graph plots local linear regression and polynomial series regression estimates of the average
posttest Medicaid expenditures by age for RCT treatment and control participants in Arkansas. These
estimates form the basis for the causal benchmarks used in the within-study comparisons.

Figure 2. Benchmark Estimates from the RCT in Arkansas.

[Figure 3 appears here: posttest Medicaid expenditures (16,000 to 24,000) plotted
against baseline age (20 to 100) for New Jersey, with four series: control LLR,
treated LLR, control polynomial, and treated polynomial.]

Note: The graph plots local linear regression and polynomial series regression estimates of the average
posttest Medicaid expenditures by age for RCT treatment and control participants in New Jersey. These
estimates form the basis for the causal benchmarks used in the within-study comparisons.

Figure 3. Benchmark Estimates from the RCT in New Jersey.


[Figure 4 appears here: posttest Medicaid expenditures (10,000 to 30,000) plotted
against baseline age (20 to 100) for Florida, with four series: control LLR,
treated LLR, control polynomial, and treated polynomial.]

Note: The graph plots local linear regression and polynomial series regression estimates of the average
posttest Medicaid expenditures by age for RCT treatment and control participants in Florida. These
estimates form the basis for the causal benchmarks used in the within-study comparisons.

Figure 4. Benchmark Estimates from the RCT in Florida.

for the RCT benchmarks were essentially identical to the bootstrap SEs. To avoid
reporting a redundant column of results, we report the RMSE and SE statistics for
the RCT benchmarks in a single column in the tables throughout the paper.

Three things stand out in Table 4. First, the polynomial and local linear estimates
are very similar in all three states. Second, the age-specific RCT treatment effects
have large RMSE scores—an expected finding since the Cash and Counseling RCT
was not designed to estimate treatment effects in one-year age brackets. And third,
the estimates of average effects across all participants older than the cutoff are more
stable than estimates at the cutoff, particularly for the younger age cutoffs where
most observations fall above the cutoff. For the age 70 cutoff, the RMSE statistics
are almost the same at the cutoff as above it.

Treatment Effects at the Cutoff

Table 5 compares the performance of the three research designs at the cutoff in
terms of standardized bias, standard error, and RMSE as described earlier.7 Both
the posttest RDD and the pretest RDD performed quite well in terms of standardized
bias at the cutoff. For instance, 12 of the 18 posttest RDD estimates had absolute
standardized bias of less than 0.2 standard deviations. In comparison, 13 of the 18
pretest RDD estimates were biased by less than 0.2 standard deviations. Table 5 suggests that
both the posttest RDD and the pretest RDD performed better in terms of bias in
within-study comparisons with older age cutoffs where the sample size was more
balanced and more concentrated near the cutoff.

7 The standardized bias measure is suppressed for the RCT because it serves as the causal benchmark
so that the standardized bias is always zero.


Table 4. Benchmark treatment effect estimates at and above the cutoff for each design type
based on the RCT.

Panel A: RCT benchmarks for designs with a cutoff at age 35

                                         ATE at age 35          ATE above age 35
State       Estimation method         Point est.  RMSE/SE     Point est.  RMSE/SE

Arkansas    Polynomial series            2,980      1,334        1,703       258
New Jersey                               3,202      1,469          622       718
Florida                                  3,329      1,590          788       729
Arkansas    Local linear regression      2,738      1,331        1,772       256
New Jersey                               3,529      1,594          755       724
Florida                                  3,456      1,139          741       562

Panel B: RCT benchmarks for designs with a cutoff at age 50

                                         ATE at age 50          ATE above age 50
State       Estimation method         Point est.  RMSE/SE     Point est.  RMSE/SE

Arkansas    Polynomial series            1,467        616        1,671       250
New Jersey                                 822      1,155          387       728
Florida                                  2,034        978          357       795
Arkansas    Local linear regression      1,547        887        1,752       239
New Jersey                                 611      1,235          529       727
Florida                                  2,191        774          224       604

Panel C: RCT benchmarks for designs with a cutoff at age 70

                                         ATE at age 70          ATE above age 70
State       Estimation method         Point est.  RMSE/SE     Point est.  RMSE/SE

Arkansas    Polynomial series            1,134        410        1,872       249
New Jersey                                 −84        902          562       786
Florida                                    739        732            0       901
Arkansas    Local linear regression      1,294        335        1,934       238
New Jersey                                 466        788          679       843
Florida                                    625        569         −180       677

To facilitate comparisons across the designs, the first panel of Figure 5 plots
the standardized bias of the pretest and posttest RDD estimates for each of the 18
within-study comparisons. In the graph, the dashed 45◦ line marks
equality of bias in the two designs. The circles above the line are within-study
comparisons in which the pretest RDD was more biased than the posttest RDD. The
circles below the line are comparisons in which the posttest RDD had lower bias.
The results are distributed around the 45◦ line, which shows that in some cases
the pretest RDD reduced bias slightly and in other cases it increased bias slightly.
One important point, however, is that most of the points in the graph are in the
bottom left corner, which makes it clear that both the posttest RDD and the pretest
RDD produce estimates of the treatment effect at the cutoff with very little bias.
Since both the RCT and non-RCT parameters are estimated with error and exact
point correspondences are therefore very unlikely, these results confirm conclusions
from other within-study comparisons that the usual posttest-only RDD provides
estimates that are quite close to the results from a comparable RCT (Cook, Shadish,


Table 5. Performance of the posttest and pretest RDD at the cutoff.

Panel A: Age 35 cutoff

                                      RCT          Posttest RDD               Pretest RDD
State       Estimation method       RMSE/SE    SE     SD bias   RMSE      SE     SD bias   RMSE

Arkansas    Polynomial series         2,980   2,583     0.49   4,573    2,949    −0.07    3,019
New Jersey                            3,202   2,338     0.24   5,391    2,120     0.21    4,962
Florida                               3,329   3,715    −0.12   4,388    3,313    −0.22    5,351
Arkansas    Local linear regression   2,738   3,563     0.45   4,804    2,114     0.01    2,150
New Jersey                            3,529   5,010     0.2    6,629    1,634     0.15    1,546
Florida                               3,456   4,159    −0.06   4,299    3,226    −0.15    3,338

Panel B: Age 50 cutoff

                                      RCT          Posttest RDD               Pretest RDD
State       Estimation method       RMSE/SE    SE     SD bias   RMSE      SE     SD bias   RMSE

Arkansas    Polynomial series           616   1,267     0.04   1,293      941     0.99    7,563
New Jersey                            1,155   3,676    −0.1    4,355    2,292     0.11    3,226
Florida                                 978   3,359     0.25   6,035    2,743     0.22    5,073
Arkansas    Local linear regression     887   2,300     0.32   3,340    2,018     0.51    1,889
New Jersey                            1,235   3,689    −0.03   3,778    2,662     0.05    2,695
Florida                                 774   4,705     0.06   4,863    3,943    −0.04    3,746

Panel C: Age 70 cutoff

                                      RCT          Posttest RDD               Pretest RDD
State       Estimation method       RMSE/SE    SE     SD bias   RMSE      SE     SD bias   RMSE

Arkansas    Polynomial series           410     894    −0.03     911      536     0.06      721
New Jersey                              902   2,322     0.04   2,455    1,937     0.04    2,093
Florida                                 732   1,785    −0.07   2,357    1,387    −0.17    3,703
Arkansas    Local linear regression     335     857    −0.06     976      521     0.07      532
New Jersey                              788   2,377     0.01   2,378    1,645     0.08    1,676
Florida                                 569   2,642     0.08   2,968    2,223    −0.02    2,196

& Wong, 2008; Green et al., 2009; Shadish et al., 2011). They also suggest that efforts
to supplement the posttest RDD with data from a pretest time period do not lead
to much in the way of additional bias.

As for standard error comparisons, Table 5 shows that the standard errors of both
the pretest and posttest RDD estimates are larger than those of the RCT. In theory,
the age-specific treatment effects from the RCT are estimated more precisely
partly because the way we constructed the RDD reduced the sample size relative
to the RCT, and partly because the RDD assignment rules induce a correlation
between the age variable and the treatment variable (Schochet, 2009; Goldberger, 1972a,
1972b). Table 5 also shows that the pretest RDD estimates have smaller standard
errors than the posttest RDD in all but one within-study comparison. The efficiency
gains from the pretest RDD can arise because of the larger sample sizes and also
because the pretest RDD assignment rule reduces the correlation between the age
variable and the treatment variable so that variance inflation from multicollinearity
may be less in the pretest RDD than in the posttest RDD. All of the designs are more precise


[Figure 5 appears here: two scatter plots comparing the pretest RDD (y-axis) against
the posttest RDD (x-axis) across the 18 within-study comparisons, each with a
dashed 45◦ line of equal performance. The left panel plots absolute standardized
bias at the cutoff (0 to 1); the right panel plots RMSE at the cutoff (0 to 8,000).]

Note: The graphs plot measures of the performance of the pretest and posttest RDD estimators across
the different within-study comparisons in our analysis. Standardized bias is shown in the left panel and
RMSE statistics are in the right panel. The y-axis measures the performance of the pretest RDD strategy.
The x-axis reports performance in the corresponding posttest RDD. The dashed 45◦ line marks points
at which the two designs have equal performance. Points that fall above the 45◦ line represent
within-study comparisons in which the posttest RDD outperformed the pretest RDD.

Figure 5. Comparative Performance of the Pretest and Posttest RDD at the Cutoff.

in within-study comparisons with older age cutoffs because the observations are
more densely distributed near the cutoff.

One reaction to the idea of supplementing the RDD with pretest data is that
researchers may face a trade-off between the efficiency and extrapolation gains from
the new data and the possibility that bias will arise because the stronger identifying
assumptions may fail to hold. The mean squared error criterion provides a way of
gauging the net effect of changes in the bias and variance of estimates that arise
from different estimation strategies. The RMSE statistics that are reported in Table 5
measure the size of the average error in dollars that is associated with the posttest
RDD and pretest RDD estimates of the treatment effect at the cutoff. In the RCT, the
RMSE ranges from $335 at the oldest age cutoffs to around $3,500 at the youngest
age cutoff. The RMSE of the posttest RDD is larger than the corresponding RCT
estimate in all 18 within-study comparisons. In contrast, the pretest RDD actually
has a slightly lower RMSE in the three within-study comparisons in which the
cutoff was set at age 35, and the local linear regression approaches were used for
estimation. In general, all three designs had less error when the cutoff was fixed at
older ages.

The right-hand panel of Figure 5 compares the RMSE statistic from the pretest
RDD and posttest RDD in each of the within-study comparisons. As before, the 45◦
line marks points at which the two designs have the same average error. The circles
above the line indicate comparisons where the pretest RDD had more error on av-
erage than the posttest RDD, and the circles below the line are from comparisons in
which the pretest RDD more reliably replicated the RCT benchmark than the posttest
RDD. In the majority of within-study comparisons, supplementing the posttest RDD


Table 6. Performance of the pretest RDD at extrapolation above the cutoff.

Panel A: Age 35 cutoff

                                        RCT            Pretest RDD
State       Estimation method         RMSE/SE     SE     SD bias   RMSE

Arkansas    Polynomial series            258    1,312     0.19    1,859
New Jersey                               718    1,394     0.2     4,351
Florida                                  729      841    −0.11    2,344
Arkansas    Local linear regression      256    1,096     0.07    1,201
New Jersey                               724    1,181     0.19    4,056
Florida                                  562      704    −0.09    1,961

Panel B: Age 50 cutoff

                                        RCT            Pretest RDD
State       Estimation method         RMSE/SE     SE     SD bias   RMSE

Arkansas    Polynomial series            250      877     0.11    1,203
New Jersey                               728    1,176     0.09    2,349
Florida                                  795      738    −0.09    1,920
Arkansas    Local linear regression      239      709     0.05      838
New Jersey                               727    1,002     0.13    2,890
Florida                                  604      704    −0.08    1,755

Panel C: Age 70 cutoff

                                        RCT            Pretest RDD
State       Estimation method         RMSE/SE     SE     SD bias   RMSE

Arkansas    Polynomial series            249      469     0.09      822
New Jersey                               786      933     0.14    3,116
Florida                                  901      891    −0.05    1,461
Arkansas    Local linear regression      238      449     0.04      544
New Jersey                               843      886     0.12    2,644
Florida                                  677      640    −0.04      992

with data from the pretest time period improved the correspondence between the
quasi-experiment and the RCT in terms of estimating the treatment effect at the
cutoff. Overall, our within-study comparisons provide considerable support for ef-
forts to supplement the standard posttest RDD with data from a pretest time period.
Adding the pretest led to only small changes in bias that would be of little substan-
tive interest. And the results from the RMSE statistics suggest that the reductions in
variance swamp the changes in bias in most cases so that adding the pretest leads
to better correspondence with the RCT.
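The standardized bias and RMSE bookkeeping used throughout these comparisons can be sketched as follows. This is a minimal illustration with invented numbers, not the authors' actual data or code; the function names and dollar figures are hypothetical.

```python
import math

def standardized_bias(estimate, benchmark, outcome_sd):
    """Bias of a quasi-experimental estimate relative to the RCT
    benchmark, expressed in standard-deviation units of the outcome."""
    return (estimate - benchmark) / outcome_sd

def rmse(bias, se):
    """Root mean squared error of an estimator: combines systematic
    error (bias) and sampling variability (standard error)."""
    return math.sqrt(bias ** 2 + se ** 2)

# Invented within-study comparison: a pretest RDD with a small bias but
# a smaller SE can still beat a posttest RDD with negligible bias.
rct_effect = 1000.0                                   # RCT benchmark (dollars)
posttest_rmse = rmse(1010.0 - rct_effect, se=900.0)   # tiny bias, large SE
pretest_rmse = rmse(1150.0 - rct_effect, se=500.0)    # some bias, smaller SE
std_bias = standardized_bias(1150.0, rct_effect, outcome_sd=1500.0)  # 0.1
```

Here the pretest RDD has the larger bias but the smaller RMSE (roughly 522 versus 900), mirroring the point that reductions in variance can swamp small changes in bias.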

Extrapolation Beyond the Cutoff

Adding the pretest does more than improve correspondence with the RCT at the
cutoff: It also provides a way to extrapolate beyond the cutoff subpopulation. Table 6
reports the performance of the estimates of the average treatment effect among all
subjects above the cutoff based on the pretest RDD extrapolations. The results are
again presented in terms of standardized bias, standard errors, and RMSE statistics
from each of the within-study comparisons. Since no theorists argue that causal
estimates beyond the cutoff are warranted for the posttest RDD, discussion is limited
below to comparison of results from the pretest RDD and the RCT.

All of the extrapolations had standardized bias of less than 0.2 standard deviations.
Indeed, the median standardized bias of the estimates across all 18 within-study
comparisons was only 0.09 standard deviations. Even within the narrow range of
standardized bias estimates we observed, the extrapolations beyond the cutoff were
usually less biased in the comparisons with older age cutoffs. This makes sense
because less extrapolation is required when the cutoff is fixed at an older age. The
main finding, though, is how consistently low the pretest RDD bias estimates are
above the cutoff.

The standard errors of the extrapolated treatment effects are larger in the pretest
RDD than in the RCT, particularly for the youngest age cutoff. For instance, on
average, the standard errors from the pretest RDD are about 2.5 times larger than
the standard errors from the RCT when the cutoff is set at age 35. But the average
ratio is only 1.9 when the cutoff is set at 50 and only 1.3 when the cutoff is set at 70.
The RMSE of the extrapolations based on the pretest RDD follow a similar pattern.

Evaluating the performance of the pretest RDD at extrapolations beyond the
cutoff is difficult because there is no uniform measure of what constitutes good
performance. One line of thinking goes as follows. There is a growing consensus
among applied researchers that the posttest RDD provides high-quality estimates
of treatment effects at the cutoff. If we accept that the performance of the RDD is
a reasonable standard of good performance, then we can compare the properties
of the pretest RDD extrapolations to the properties of the posttest RDD estimates
at the cutoff. Across the 18 within-study comparisons presented here, the average
standardized bias at the cutoff in the posttest RDD estimates was about 0.095 stan-
dard deviations. In contrast, the average standardized bias of the extrapolations
away from the cutoff based on the pretest RDD was only 0.05 standard deviations.
If the posttest RDD is lauded as an estimator with an acceptably low level of bias,
then these extrapolations beyond the cutoff seem to meet an even higher standard
of fidelity to the RCT results. Similarly, the average RMSE was about $3,655 across
all of the posttest RDD estimates at the cutoff, and it was only $2,017 across all of
the pretest RDD extrapolations beyond the cutoff. These arguments provide some
support for our claim that the extrapolations beyond the cutoff are high quality
relative to the standards that researchers apply to quasi-experimental research.

DISCUSSION

The results from the within-study comparisons show some specific ways in which
the pretest RDD can shore up key weaknesses of the standard RDD. First, adding
the pretest data led to more statistically precise estimates of the treatment effect
at the cutoff without incurring a substantial penalty in terms of bias. Second, the
pretest design improves the justification for extrapolations away from the cutoff
by providing information about the relationship between the untreated outcome
and the assignment variable across the full range of the assignment variable. And
finally, supplementing the basic RDD with a pretest measure of the outcome led to
estimates of the average treatment effect above the cutoff that were very similar to
those produced by the RCT.

The key risk in supplementing the posttest RDD with data from a pretest time
period is the possibility of additional bias that might arise if the fixed period effect
assumption fails to hold. This does not seem to have been an important problem
in these within-study comparisons. Indeed, when both bias and variance are
considered together in the RMSE statistic, the pretest RDD performed much better than
the posttest RDD at the cutoff, and the extrapolations beyond the cutoff met an even
higher standard of performance. Of course, our within-study comparisons also sup-
port the general superiority of the RCT in terms of both bias and statistical precision.

The present study has some inevitable external validity limitations. The paper is
framed in terms of a sharp RDD. Extending our basic approach to the fuzzy case
requires additional assumptions, both about the joint distribution of heterogeneous
treatment effects and compliance status and about the stability of that joint
distribution over time and across levels of the assignment variable. These additional
assumptions may or may not be credible in particular instances of applied work. We
leave to future work the application of within-study comparisons to fuzzy designs.
Another limitation comes from the fact that only one data set is used here, and we
have no guarantee that similar results would be achieved with other data sets with
different characteristics than ours. This issue is an important concern with
within-study comparisons and with case studies more generally. The weakness recedes
to some extent when the literature is considered as a whole, and when one attaches
value to the insights into analytical choices and modeling assumptions that can be
gained simply by engaging in the often complicated effort of evaluating the
performance of a particular research design in a particular setting.

It is important not to confuse our use of pretest assessments in RDD with other
legitimate ways that a pretest may be used. We treated the pretest as a comparison
data set rather than as covariates. Using the pretest as a covariate in a regression
framework will increase statistical power (Lee & Lemieux, 2010), but will not facil-
itate the extrapolation beyond the cutoff that we have emphasized here. The pretest
RDD estimator we used also shares some important features with the difference-
in-differences (DID) design. But the two are not identical. Our approach exploits
the smoothness assumptions that are central to RDD and uses pretest information
about functional form to justify extrapolating beyond the cutoff.
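The pretest RDD logic under the fixed period effect assumption can be sketched as follows. This is an illustrative simulation, not the authors' estimation code: the linear fits, the age range, and all numbers are assumptions chosen only to show the mechanics (fit the untreated function on the pretest, estimate a period shift from untreated posttest cases, extrapolate the counterfactual above the cutoff).

```python
import numpy as np

def pretest_rdd_ate_above_cutoff(age_pre, y_pre, age_post, y_post, cutoff, deg=1):
    """Sketch of a pretest RDD extrapolation.

    Fit the untreated outcome/assignment-variable relationship on the
    pretest sample (observed over the full age range), estimate a fixed
    period effect from the untreated (below-cutoff) posttest cases, and
    use the shifted pretest function as the counterfactual for treated
    cases above the cutoff."""
    g = np.polynomial.Polynomial.fit(age_pre, y_pre, deg)  # untreated function
    below = age_post < cutoff
    delta = np.mean(y_post[below] - g(age_post[below]))    # fixed period effect
    above = ~below
    return np.mean(y_post[above] - (g(age_post[above]) + delta))

# Simulated example: true treatment effect of 40 above an age-50 cutoff.
rng = np.random.default_rng(0)
age_pre = rng.uniform(18, 90, 2000)
y_pre = 50 + 3 * age_pre + rng.normal(0, 5, 2000)
age_post = rng.uniform(18, 90, 2000)
y_post = (60 + 3 * age_post + 40.0 * (age_post >= 50)
          + rng.normal(0, 5, 2000))
est = pretest_rdd_ate_above_cutoff(age_pre, y_pre, age_post, y_post, cutoff=50)
```

In this simulation `est` recovers the true effect of 40 to within sampling error; with a nonlinear untreated function, `deg` would have to be raised, which is where smoothing-parameter choices of the kind discussed in the Appendix come in.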

One of the basic findings of our study is that supplementing the RDD with data
from a pretest time period can improve the precision of the estimates without
substantial costs in terms of bias. There are at least two mechanisms through which
the pretest RDD achieves these efficiency gains. The first is an increase in the number
of observations available for analysis. These gains may seem trivial, but they should
not be ignored since pretest assessments are relatively common and, as we have
seen, can improve statistical power and causal credibility without much threat of
bias. In contrast, data generated according to a standard RDD assignment rule are
relatively rare, and researchers may find it difficult to collect additional RDD data
in the pursuit of improved statistical power.

A second way that the pretest RDD could improve statistical power is by altering
the design effect associated with the RDD (Schochet, 2009). For example, combin-
ing data from a pretest and posttest time period may change the variability in the
treatment variable across the pooled sample, and it may also alter the degree of
collinearity between the treatment variable and the assignment variable. In empiri-
cal settings where the pooled data set increases variability in treatment and reduces
collinearity, the pretest RDD will produce efficiency gains that are independent of
the sample size.

Pretest measures of the outcome are not the only design elements (Corrin &
Cook, 1998) that improve functional form estimation, statistical power, and causal
generalization. Repeated cross-sectional samples are also possible. Lohr’s study
reported in Cook and Campbell (1979) concerned how the introduction of Medicaid
affected the number of doctor visits in a nationally representative sample. Household
income was the assignment variable, an income threshold adjusted for family size
was the cutoff, and the number of doctor visits in the year after the introduction
of Medicaid was the posttest. The supplemental RDD element was a representative
sample of families and their doctor visits from the year before Medicaid became
available. Lohr did not perform all the analyses presented in the Cook and Campbell
(1979) paper, but he demonstrated that functional forms were very similar in the
untreated part of the assignment variable in both the pretest and posttest samples,
suggesting that this cohort-based design supplement may also perform well beyond
the cutoff.

Other RDD supplements that could be considered include contemporaneous but
nonequivalent comparison groups of people who were not offered the treatment.
Depending on context, nonequivalent control groups might come from a different
geographical area or from institutions where the treatment was not available
(another city, state, school, or workplace), and they might even be matched on
pretreatment covariates to make the regression functions more comparable. In the
early stages of this study, we explored pairing the RCT control group from one state
with the nonequivalent comparison group for another state, but soon decided this
strategy was not viable because, as Figure 4 makes clear, the regression functions are
very different by state. Recent work by Dong and Lewbel (2011) considers extrapola-
tion beyond the cutoff using information about the local derivative of the regression
function at the cutoff, and work by Angrist and Rokkanen (2012) combines this idea
with a conditional independence assumption that facilitates extrapolation. The spirit
of these approaches to supplementing the RDD with additional design elements is
similar to the approach we consider in this paper.

Nonequivalent dependent variables offer a third kind of supplement to the basic
posttest RDD. These are variables that should be affected by the most plausible alter-
native interpretations operating at the cutoff, but that are not related to treatment.
An example comes from Ludwig and Miller (2007), who conducted a long-term
evaluation of Head Start, in which help with writing application proposals was
originally given to the 300 poorest U.S. counties. They showed that spending on
other poverty programs did not change differentially at this cutoff, making it
possible to use those programs as comparison RDD functions, since there was little
if any reason to believe that their outcomes should change at the 300th poorest
county.

COADY WING is Assistant Professor of Health Policy and Administration, University
of Illinois at Chicago, 1603 West Taylor Street, 754 SPHPI, Chicago, IL 60612-4394.

THOMAS D. COOK, Joan and Sarepta Harrison Chair of Ethics and Justice and
Professor of Sociology, Psychology, and Education and Social Policy, Institute for
Policy Research, Northwestern University, 2040 Sheridan Road, Evanston, IL 60208.

ACKNOWLEDGMENTS

Several people deserve our thanks. The editor and three anonymous reviewers provided
thoughtful comments and suggestions. Vivian Wong, Peter Steiner, Kelly Hallberg, Will
Shadish, and Dan Black provided helpful feedback on an earlier draft. In addition, work
on this project was facilitated by IES Grant R305D100033.

REFERENCES

Angrist, J. D., & Rokkanen, M. (2012). Wanna get away? RD identification away from the
cutoff. NBER Working Paper 18662. Cambridge, MA: National Bureau of Economic Re-
search.

Bifulco, R. (2012). Can non-experimental estimates replicate estimates based on random
assignment in evaluations of school choice? A within-study comparison. Journal of Policy
Analysis and Management, 31, 729–751.

Brown, R. S., & Dale, S. B. (2007). The research design and methodological issues for the
Cash and Counseling evaluation. Health Services Research, 42, 414–445.

Carlson, B. L., Foster, L., Dale, S. B., & Brown, R. (2007). Effects of Cash and Counseling on
personal care and well-being. Health Services Research, 42, 467–487.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis for field
settings. Chicago, IL: Rand McNally.

Cook, T. D., & Steiner, P. M. (2010). Case matching and the reduction of selection bias in
quasi-experiments: The relative importance of covariate choice, unreliable measurement,
and mode of data analysis. Psychological Methods, 15, 56–68.

Cook, T. D., & Wong, V. C. (2008). Empirical tests of the validity of the regression discontinuity
design. Annals of Economics and Statistics, 91/92, 127–150.

Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which experi-
ments and observational studies often produce comparable causal estimates: New find-
ings from within-study comparisons. Journal of Policy Analysis and Management, 27,
724–750.

Corrin, W., & Cook, T. (1998). Design elements of quasi-experimentation. Advances in Edu-
cational Productivity, 7, 35–57.

Dale, S. B., & Brown, R. S. (2007). How does Cash and Counseling affect costs? Health
Services Research, 42, 488–509.

Dong, Y., & Lewbel, A. (2011). Regression discontinuity marginal threshold treatment effects.
Working Paper. Boston, MA: Boston College.

Doty, P., Mahoney, K. J., & Simon-Rusinowitz, L. (2007). Designing the Cash and Counseling
demonstration and evaluation. Health Services Research, 42, 378–396.

Fraker, T., & Maynard, R. (1987). Evaluating comparison group designs with employment-
related programs. Journal of Human Resources, 22, 194–227.

Goldberger, A. S. (1972a). Selection bias in evaluating treatment effects: Some formal illus-
trations. Unpublished manuscript.

Goldberger, A. S. (1972b). Selection bias in evaluating treatment effects: The case of interac-
tion. Unpublished manuscript.

Green, D. P., Leong, T. Y., Kern, H. L., Gerber, A. S., & Larimer, C. W. (2009). Testing the
accuracy of regression discontinuity analysis using experimental benchmarks. Political
Analysis, 17, 400–417.

Hahn, J., Todd, P., & van der Klaauw, W. (2001). Identification and estimation of treatment
effects with a regression-discontinuity design. Econometrica, 69, 201–209.

LaLonde, R. (1986). Evaluating the econometric evaluations of training with experimental
data. American Economic Review, 76, 604–620.

Lee, D. S. (2008). Randomized experiments from non-random selection in U.S. House elec-
tions. Journal of Econometrics, 142, 675–697.

Lee, D. S., & Card, D. (2008). Regression discontinuity inference with specification error.
Journal of Econometrics, 142, 655–674.

Lee, D. S., & Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of
Economic Literature, 48, 281–355.

Ludwig, J., & Miller, D. L. (2007). Does Head Start improve children’s life chances? Ev-
idence from a regression discontinuity design. Quarterly Journal of Economics, 122,
159–208.

Manski, C. (2013). Public policy in an uncertain world. Cambridge, MA: Harvard University
Press.

Porter, J. (2003). Estimation in the regression discontinuity model. Mimeo. Madison, WI:
Department of Economics, University of Wisconsin.

Robinson, P. (1988). Root-N-consistent semi-parametric regression. Econometrica, 56, 931–
954.

Schochet, P. Z. (2009). Statistical power for regression discontinuity designs in education
evaluations. Journal of Educational and Behavioral Statistics, 34, 238–266.

Shadish, W., Galindo, R., Wong, V., Steiner, P., & Cook, T. (2011). A randomized ex-
periment comparing random and cut-off-based assignment. Psychological Methods, 16,
179–191.

Thistlethwaite, D. L., & Campbell, D. T. (1960). Regression-discontinuity analysis: An
alternative to the ex-post facto experiment. Journal of Educational Psychology, 51, 309–317.

Wilde, E. T., & Hollister, R. (2007). How close is close enough? Evaluating propensity score
matching using data from a class size reduction experiment. Journal of Policy Analysis and
Management, 26, 455–477.

Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data, 2nd ed.
Cambridge, MA: MIT Press.

APPENDIX: SELECTION OF BANDWIDTHS AND POLYNOMIAL SERIES LENGTHS

We adopted the following procedures for selecting smoothing parameters:

1. Use cross-validation to evaluate a grid of candidate bandwidths from 1 to 90
   years. For the polynomial models, evaluate a grid of four possible functions
   each for the treatment and control data: linear, quadratic, cubic, and quartic.
2. Select the candidate with the smallest mean integrated squared error.
3. Visually inspect the fitted values from each of the models chosen in step 2
   and make adjustments to resolve concerns about undersmoothing.
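The grid-search logic of steps 1 and 2 can be sketched as follows. The leave-one-out criterion and the uniform kernel below are generic stand-ins for the authors' mean integrated squared error computation, and the function names are hypothetical.

```python
import numpy as np

def loo_cv_error(x, y, bandwidth):
    """Leave-one-out cross-validation error for a local linear smoother
    with a uniform kernel of the given bandwidth."""
    errs = []
    for i in range(len(x)):
        mask = np.abs(x - x[i]) <= bandwidth
        mask[i] = False                        # hold out observation i
        if mask.sum() < 2:                     # need two points for a line
            continue
        slope, intercept = np.polyfit(x[mask], y[mask], 1)
        errs.append((y[i] - (intercept + slope * x[i])) ** 2)
    return float(np.mean(errs))

def select_bandwidth(x, y, grid=range(1, 91)):
    """Steps 1-2: evaluate candidate bandwidths (1 to 90 years) and keep
    the one with the smallest cross-validated squared error."""
    return min(grid, key=lambda h: loo_cv_error(x, y, h))
```

Step 3, the visual inspection for undersmoothing, has no mechanical analogue; it is the judgment step that turns the cross-validation choices into the "preferred" parameters reported in Tables A1 through A3.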

Table A1. Smoothing parameters for the RCT benchmarks.

                            Cross-validation           Preferred
Parameter    State        Treatment   Control     Treatment   Control
Bandwidth    Arkansas         11         11           20         20
             New Jersey        9         13           25         25
             Florida          90         90           90         90
Polynomial   Arkansas     Interacted quadratic
             New Jersey   Interacted quadratic
             Florida      Interacted linear

Table A2. Smoothing parameters for the posttest RDD.

Panel A: Cross-validation parameters
Parameter    State        Cut = 35            Cut = 50            Cut = 70
Bandwidth    Arkansas     90 above/19 below   90 above/2 below    90 above/70 below
             New Jersey   14 above/90 below   90 above/15 below   13 above/9 below
             Florida      90 above/17 below   90 above/4 below    90 above/11 below
Polynomial   Arkansas     Quadratic           Linear              Linear
             New Jersey   Linear              Quartic             Quartic
             Florida      Quartic             Cubic               Interacted linear

Panel B: Preferred parameters (used in the analysis)
Parameter    State        Cut = 35            Cut = 50            Cut = 70
Bandwidth    Arkansas     90 above/19 below   90 above/20 below   90 above/70 below
             New Jersey   14 above/90 below   90 above/15 below   19 above/13 below
             Florida      90 above/17 below   90 above/11 below   90 above/11 below
Polynomial   Arkansas     Quadratic           Linear              Linear
             New Jersey   Linear              Quartic             Quartic
             Florida      Quartic             Cubic               Interacted linear

Notes: The partially linear model approach to the pretest RDD required choosing three bandwidths for
each research design under test. We used the cross-validation bandwidth of 90 years for the pretest
indicator bandwidth for all of the models. Table A3 shows the bandwidths that were selected using the
cross-validation and the preferred bandwidths used in the Medicaid expenditures equation and in the
residualized equation used to form the treatment effect estimates.

Table A3. Bandwidths in the pretest RDD based on the partially linear model.

Panel A: Cross-validation bandwidths
                           Medicaid equation              Residualized equation
Parameter   State       Cut = 35  Cut = 50  Cut = 70   Cut = 35  Cut = 50  Cut = 70
Bandwidth   Arkansas       90        90        90         13        90        11
            New Jersey      1         1         1          9         7        13
            Florida         2        90        90         66        66        67

Panel B: Preferred bandwidths
                           Medicaid equation              Residualized equation
Parameter   State       Cut = 35  Cut = 50  Cut = 70   Cut = 35  Cut = 50  Cut = 70
Bandwidth   Arkansas       90        90        90         13        90        11
            New Jersey     10        10        10         15         7        13
            Florida        20        90        90         66        66        67

Notes: We used cross-validation to select the length of the polynomial series for each model. In Arkansas,
for the untreated samples, we used a quartic model in the age 35 design, a linear model for the age 50
design, and a quartic model for the age 70 design. For the treated samples in Arkansas, we used a quartic
model for the age 35 design, a cubic model for the age 50 design, and a linear model for the age 70 design.
In New Jersey, for the untreated samples, we used a quartic for all three designs. For the treated samples
in New Jersey, we used a linear model for the age 35 design, a quadratic model for the age 50 design,
and a quartic model for the age 70 design. In Florida, for the untreated samples, we used a linear model
in the age 35 design, a quadratic model for the age 50 design, and a linear model for the age 70 design.
For the treated samples in Florida, we used a cubic model for the age 35 design, a quadratic model for
the age 50 design, and a linear model for the age 70 design.

Table A1 shows the cross-validation and preferred bandwidths and functional
forms from our analysis for the RCT benchmarks. Table A2 shows the parameters
for the posttest RDD.

Copyright of Journal of Policy Analysis & Management is the property of John Wiley &
Sons, Inc. and its content may not be copied or emailed to multiple sites or posted to a listserv
without the copyright holder’s express written permission. However, users may print,
download, or email articles for individual use.

Designing Fractal Line Pied-de-poules: A Case Study in Algorithmic Design
Mediating between Culture and Fractal Mathematics

Loe M. G. Feijs

Department of Industrial Design, Eindhoven University of Technology, NETHERLANDS
[email protected]

Abstract

Millions of people own and wear pied-de-poule (houndstooth) garments. The
pattern has an intriguing basic figure and a typical set of symmetries. The origin
of the pattern lies in a specific type of weaving. In this article, I apply compu-
tational techniques to modernize this ancient decorative pattern. In particular I
describe a way to enrich pied-de-poule with a fractal structure.

Although a first fractal line pied-de-poule was shown at Bridges 2015, a number
of fundamental questions still remained. The following questions are addressed
in this article: Does the original pied-de-poule appear as a limiting case when
the fractal structure is increasingly refined? Can we prove that the pattern is
regular in the sense that one formula describes all patterns? What is special
about pied-de-poule when it comes to making these fractals? Can the technique
be generalised?

The results and techniques in this article anticipate the future of fashion where
decorative patterns, including pied-de-poule, will be part of our global culture,
as they are now, but rendered in more refined ways, using new technologies.
These new technologies include digital manufacturing technologies such as laser-
cutting and 3D printing, but also computational and mathematical tools such
as Lindenmayer rules (originally devised to describe the algorithmic beauty of
plants).

Journal of Humanistic Mathematics Volume 10 Number 1 (January 2020)

Loe M. G. Feijs 241

1. Introduction

Regular and symmetric ornamental patterns are among the oldest forms of
design. Neolithic societies, such as the Linear Pottery culture in my own
region, and the Iron Age Vikings in Scandinavia, considered it worthwhile to
combine functional objects such as pottery and garments with the art of
decorative patterns. The pottery of Figure 1 (a) was found in Stein, The
Netherlands [21]. The fragment in Figure 1 (b) was excavated in Sittard, my
home town, and is dated to about 5000 BC. It was probably part of a little
statue bearing a textile decorative pattern [19].

Figure 1: Ancient pottery with decorative patterns found in Stein, The Nether-
lands (left). Neolithic decorative pattern found in Sittard, The Netherlands (right).

Sources: [21] and [19].

The Gerum cloak (Sweden) has been radiocarbon dated to 360–100 BC, the
pre-Roman Iron Age [14]. Figure 2 shows a fragment of the garment and a
modern reconstruction of the weaving pattern. The pattern is what we nowadays
call pied-de-poule or houndstooth (more about pied-de-poule in Section 2).

As technology became increasingly sophisticated, the decorative patterns
found application in other functional artefacts such as architecture (frieze
patterns), woven baskets, paintings, etc. Decorative patterns are among
the oldest components of human culture and are deserving of our continued
attention. In my view, continued attention means not only studying and
preserving old patterns but also looking in-depth and applying contempo-
rary technologies. In this context, mathematics and computer programming
are considered technologies, just like materials and production techniques.

Figure 2: Fragment of the Gerum cloak (360–100 BC) and a modern reconstruction.
Source: historiska.se/upptack-historien-start/gerumsmanteln/

In past centuries, technology has evolved enormously. In the domain of deco-
rative patterns, we have powerful tools such as printing, Jacquard, wallpaper
theory, group theory, tessellation theory and much more. Correspondingly,
mathematics and computation have merged into an even more powerful tech-
nology, creating fresh new tools such as laser cutters, 3D printers, computer-
controlled embroidery and computer Jacquard machines. This work fits in
the intersection of arts, math and technology. My collaborators and I pre-
sented several works aimed at revitalizing one specific decorative pattern,
viz. the pied-de-poule, also called houndstooth, already shown in Figure 2.

The work presented in this paper aligns with the goal of a project series
to revitalize and refresh pied-de-poule. Specifically, I want to add more
depth to the fractal line pied-de-poule presented in [9]. Although the fractal
pattern [9], its textile implementation and the garments based on it were
well-received at the Bridges Mathematical Art Exhibition held in Baltimore
in 2015, it was unclear how the construction works, and why. In this paper
I will dive deeper into the formal rules behind the zigzagged pied-de-poule,
and find out what is so special about it. The purpose of this work is to
discover why and how it works and how it can (or cannot) be generalised.
In the following section, I explain more about pied-de-poule, ending in an
overview of the plan of the work presented in this paper.

2. Pied-de-poule

Pied-de-poule, also called houndstooth, is a textile pattern which appears
from a specific form of weaving with black and white threads. Pied-de-poule
has an extensive history, with the oldest known occurrence being the Gerum
cloak (Sweden), which has been radiocarbon dated to 360–100 BC, the pre-
Roman Iron Age [14]. Pied-de-poule was introduced into fashion by the Prince of
Wales (later Edward VIII) in the 1930s and into haute couture by Dior in the
1950s.

Currently, pied-de-poule is frequently used in haute couture, prêt-à-porter
and mass-produced fashion. Although the classic pattern is old, the same
design is recycled repeatedly in different contexts, cuts and combinations.
Pied-de-poule is very much alive. As part of our research on generative
design, pied-de-poule is a recurring theme.

There is a family of pied-de-poule patterns [5], one for each integer N > 0.
The case N = 1 is ambiguous in the sense that it is both a block pattern
and a pied-de-poule pattern. Moreover there is a pattern which arises as a
limit case when N → ∞, although this cannot be woven; it can be printed
or laser-cut [5], however. In Figure 3 we show the successive pied-de-poule
patterns for N = 1, 2, 3 and 4. For more details of the computer programs
used to generate Figure 3 we refer to [5]. In essence they are grid based,
counting row and column numbers.
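The grid-based programs behind Figure 3 are not reproduced in this article (see [5]), but their underlying weaving logic can be sketched as follows. The 2/2-twill phase convention chosen here is an assumption for illustration and may differ from the convention used in [5].

```python
def pied_de_poule(N, rows, cols):
    """Raster for a pied-de-poule-like pattern of type N (1 = dark).

    Warp (column) and weft (row) threads carry 2N dark then 2N light
    colours, so thread colours have period 4N; at each crossing a 2/2
    twill, shifting one thread per row, decides whether the warp or the
    weft thread is visible."""
    def dark(thread):
        return 1 if thread % (4 * N) < 2 * N else 0

    grid = []
    for y in range(rows):
        row = []
        for x in range(cols):
            warp_on_top = (x - y) % 4 < 2      # 2/2 twill weave
            row.append(dark(x) if warp_on_top else dark(y))
        grid.append(row)
    return grid

# Print the N = 2 pattern as ASCII art; the raster is periodic with
# period 4N = 8 in both directions.
for line in pied_de_poule(2, 16, 32):
    print("".join("#" if v else "." for v in line))
```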

Figure 3: Successive pied-de-poule patterns (N = 1, 2, 3, 4).

As an important example of pied-de-poule innovation in fashion, Figure 4 shows
a jacket designed by Dior in 2012. This design was remarkable: near the shoulder,
the pattern appears as a classic pied-de-poule, but as it descends, the pattern
gradually changes and the individual "tiles" become separated.

Figure 4: Pied-de-poule pattern in design by Dior in 2012 (with kind permission
of Dior Paris).

In [6], a first fractal pied-de-poule was described, which was a kind of Cantor-
set approach, recursively leaving out blocks from the classic pattern. Then
in [9] another fractal was described, a line fractal based on recursive zigzag-
ging and a specific idea about pen-up and pen-down inspired by turtle graph-
ics. The design by Dior (Figure 4) inspired us for this zigzagging, as we noted
that each basic figure was, in fact, a kind of zigzag line. We already knew
that inside a classic pied de poule pattern, the black basic tiles are connected
and thus form chains. Zooming out, such a chain could be considered a kind
of line. The zigzags could be chained, and at the same time, the line drawing
could recursively be done zigzag-wise. As described in [9]:

If we would be allowed to use pen-up and pen-down turtle
graphics commands, then we could draw all the essential di-
agonals and connect them by special line segments and arcs.
The special line segments and arcs would be outside of the classic
figure, but we could draw them with pen-up and thus they would
not be harmful. Or perhaps we could draw them with a very thin
pen, and they would be “almost” harmless.1

The elaboration of this idea is in Figure 5 for N = 1, 2, and N = 3. The bold
red lines are “pen-down”, the black segments and the blue arcs are “pen-up”.
The same can be done for any N > 0. Instead of working with pen-down and
pen-up commands, we choose to let the drawing function either work recursively
(drawing pied-de-poules all along) or draw non-recursive thin lines.

Figure 5: Drawing the diagonals of a classic pied-de-poule with outer loops drawn
with a thinner pen for N = 1 (left), N = 2 (center) and N = 3 (right).
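The pen-up/pen-down idea can be illustrated with a minimal turtle-style recorder. This is a generic sketch, not the code used to draw Figure 5; the class and the zigzag below are invented for illustration.

```python
import math

class RecordingTurtle:
    """Minimal turtle graphics: records the line segments drawn while
    the pen is down, so pen-up moves leave no visible trace."""
    def __init__(self):
        self.x = self.y = 0.0
        self.heading = 0.0          # degrees, 0 = east
        self.pen = True
        self.segments = []          # list of ((x0, y0), (x1, y1)) pairs

    def pen_up(self):
        self.pen = False

    def pen_down(self):
        self.pen = True

    def turn(self, degrees):
        self.heading += degrees

    def forward(self, dist):
        rad = math.radians(self.heading)
        nx = self.x + dist * math.cos(rad)
        ny = self.y + dist * math.sin(rad)
        if self.pen:
            self.segments.append(((self.x, self.y), (nx, ny)))
        self.x, self.y = nx, ny

# Draw four "essential diagonals" with the pen down, connected by short
# pen-up jumps that leave no trace in the recorded figure.
t = RecordingTurtle()
for i in range(4):
    t.pen_down()
    t.turn(45 if i % 2 == 0 else -45)
    t.forward(10)
    t.pen_up()
    t.forward(2)                    # connector: moved over, not drawn
```

Replacing the pen-up segments with a very thin pen, or with a non-recursive drawing mode, is exactly the choice described in the text above.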

To make sure the figures tessellate correctly, we have to do two diagonal
pied-de-poule figures in each cell. So they shrink by a factor of √2/8 (for
N = 1), by √2/16 (for N = 2) and by √2/24 (for N = 3). In general they
shrink by a factor √2/(8N). The effect of recursion is demonstrated in Figure 6
for N = 3.
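As a quick numerical check, the shrink factor √2/(8N) can be tabulated for small N (the snippet below is our own illustration, not code from the paper):

```python
import math

# Shrink factor sqrt(2)/(8N) of the diagonal pied-de-poule figures (our check).
def shrink(N: int) -> float:
    return math.sqrt(2) / (8 * N)

for N in (1, 2, 3):
    print(N, shrink(N))
```

The three printed values are exactly √2/8, √2/16 and √2/24, halving and thirding the N = 1 factor.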

The fractal pattern described in [9] was claimed to be a line fractal satisfying
the following requirements: pied-de-poule-like, recursively tessellated,
parameterised (the figure for recursion level n is a tessellation of figures for
level n − 1), generic (the same for all n and N), and continuous (no jumps).

1I make the notion of "almost harmless" more precise in Section 5.

246 Designing Fractal Line Pied-de-poules

Figure 6: Fractal line pied-de-poule approximation: solution for pied-de-poule
type N = 3 and recursion level n = 2.

The fourth claim, generic, means that the figures can be described by a
generic recipe with a minimum of ad-hoc tricks and which works the same
for each N and each n. Although the idea shown in Figure 5 appears to
be effective and generic, my collaborators and I were not able to present a
formal description of the fractal as a single formula. In this article, I fill part
of that gap, showing how to develop a generic description for the pattern.

To describe the pattern, we need a language; to this end we deploy
the language of Lindenmayer systems [20], an elegant formalism which has
been used for describing fractals, both fractals found in biology and designed
fractals. In Section 3, I introduce these Lindenmayer systems. Then
in Section 4, I describe the recursive tessellation of Section 2 as a Lindenmayer
system. In Section 5, I present the formal properties of the fractal.
Section 6 gives practical implementation details, not only about coding the
fractal in the language of the computer but also about practical aspects of
contemporary production machines. In the last section, Section 7, I explore
whether the technique, first applied to pied-de-poule, can be generalised to
other tessellations. By way of example, I try this for a tessellation by the
great master of tessellation, the famous Dutch graphic artist Maurits Escher
(1898-1972). I summarise my findings in Section 8.

I should like to inform the reader that Sections 3–7 are relatively technical.
This is for two reasons. The first reason is to explain the concepts and state-
ments very precisely. When I say that the “zigzagging of pied-de-poule is
a generic recipe”, I want this to be a precise statement, not a vague claim.
The second reason is that I can envisage a future in which computational
rules, math, new technologies and art come together. This is very promis-
ing, but demanding in terms of digital skills and algorithmic technicalities;
Sections 3–7 give a preview of this aspect of the envisaged future.

3. Lindenmayer Systems

Lindenmayer systems [20] are often used to describe the growth of fractal
plants. At the core of this formalism there is a substitution approach;
for example, a forward move F can be replaced by F-F++F-F. As a formal
rule we can write: F → F-F++F-F. The idea is to apply the rule
repeatedly (to all F simultaneously). Starting from F we get F-F++F-F, then
F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F, and so on. Interpreting the symbols
as turtle graphics commands, one may for example assign F the meaning
of drawing forward, + to turn right 60°, and − to turn left 60°. Then this
Lindenmayer system describes the Koch fractal [17] shown in Figure 7.
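The simultaneous substitution step is easy to sketch in a few lines of Python (our own illustration; the function name is ours):

```python
def expand(axiom: str, rules: dict, n: int) -> str:
    """Apply the Lindenmayer rules n times, to all symbols simultaneously."""
    s = axiom
    for _ in range(n):
        s = "".join(rules.get(c, c) for c in s)
    return s

koch = {"F": "F-F++F-F"}
print(expand("F", koch, 1))  # F-F++F-F
print(expand("F", koch, 2))  # F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F
```

Feeding the expanded string to any turtle renderer, with F as forward and +/− as 60° turns, draws the Figure 7 approximation.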

As another illustration of a Lindenmayer system in action let us look at a
fractal inspired by warp-knitting, yet having plant-like qualities (Figure 8).
Warp-knitting is a special type of knitting which is well-suited for machine
production. For more details refer to [8], and for the garments see gallery.
bridges.org/exhibitions/2014-bridges-conference. The basic recipe
is to move forward while doing a few loop pairs (one loop pair means making
a loop to the left and then a loop to the right, see Figure 8). More precisely:
do 3 loop pairs for the first "forward", 2 for the next (it is shorter), then
3 again, and 4 for the last "forward". And then repeat in a glide-mirrored
fashion. The numbers were chosen after experimentation: 2 for the shortest
line, 3 inside the loops (where the corners would otherwise become messy)
and 4 for the last move. The recipe is related to the Lindenmayer rule F →
-F3-F2-F3-F4F3+F2+F3+F4+, where F2 abbreviates FF, F3 abbreviates FFF, and
so on, and where the four minus signs represent left turns of 30°, 105°, 105°
and 90° respectively; the plus signs represent right turns of 105°, 105°, 90°
and 30°.
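The Fk abbreviations can be expanded mechanically; the sketch below (ours, with the turn angles left out) confirms that one application of the rule produces 24 forward steps:

```python
import re

# Warp-knitting rule, with Fk abbreviating k repetitions of F (angles omitted).
rule = "-F3-F2-F3-F4F3+F2+F3+F4+"

# Replace each Fk by k copies of F.
expanded = re.sub(r"F(\d+)", lambda m: "F" * int(m.group(1)), rule)
print(expanded.count("F"))  # 24
```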

To formalise the fractal line pied-de-poule of [9] we deploy rules using two
main formal symbols: a plain F and a bold F. The former is interpreted
as a turtle graphics forward step; the latter can either expand to a full
pied-de-poule-like zigzag line or just act as a forward step. The idea that
the turtle writes either thin lines or thick lines (implemented by recursion)
is reflected in the typography of these two symbols.

4. Towards a Generic Formula for Fractal Line Pied-de-poule

When trying the Lindenmayer formalism for the fractal line pied-de-poule,
we stumbled upon some hurdles which we had to overcome. Our fractal is
more complicated than the Koch fractal or the warp-knitting fractal of the
previous example. It was not a priori obvious that the formalism was powerful

Figure 7: Approximation of the Koch fractal with n = 3 recursion levels.

Figure 8: Approximations of the warp-knitting fractal for N = 0, 1, 2, 3 and
4.

enough for the complexity at hand. One particular task was to describe the
half-circles, which could be done, although the approach is atypical (usually,
one would not take a Lindenmayer approach for circles). The next challenge
was to describe the phenomenon that each tile has to be described twice:
once when travelling up with the turtle, and once when travelling down (in
reversed manner). As a third challenge, we found that we had to divide the
pied-de-poule tile into four components ("body parts"), each of which had
to be formalised in a slightly different manner.

First, how to make the half-circles? We found it useful to adopt special versions
of the + and − symbols, giving them an extra angle parameter denoted
as ϕ. In particular we interpret +ϕ as turn right over ϕ and −ϕ as turn left
over ϕ. If the + or − has no subscript, we take ϕ to be π/4 by default,
that is, 45 degrees. Now we can define Rd, which describes a clockwise half-circle
with diameter d, and Ld, a counter-clockwise half-circle with diameter
d. These half turns can be approximated using k steps as follows:

Rd → (+ϕ Fd′ +ϕ)^k

Ld → (−ϕ Fd′ −ϕ)^k

where d′ = d · sin(π/2k) and ϕ = π/2k. For example, taking k = 10, each
+ϕ means turning 9 degrees. By increasing k this Rd becomes a very good
approximation of a half-circle. Note that these R and L are not the basic
turtle graphics right and left turn commands; the latter are denoted by +
and − respectively.
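This chord construction can be verified numerically. The sketch below is our own (the function name is ours): it walks a turtle through (+ϕ Fd′ +ϕ)^k and checks that it lands at the far end of the half-circle:

```python
import math

def right_half_circle(d: float, k: int):
    """Follow R_d = (+phi F_d' +phi)^k with d' = d*sin(pi/2k) and phi = pi/2k."""
    phi = math.pi / (2 * k)
    step = d * math.sin(phi)
    x, y, heading = 0.0, 0.0, 0.0   # start at the origin, heading along +x
    for _ in range(k):
        heading -= phi              # +_phi : turn right over phi
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        heading -= phi              # second half of the turn
    return x, y, heading

x, y, h = right_half_circle(1.0, 10)
print(math.hypot(x, y + 1.0) < 1e-9)  # True: endpoint is (0, -1)
```

The final heading is −π: after the half-circle the turtle points the opposite way, as expected.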

The symbol F is interpreted as moving forward over a certain distance, say
L. We need two distinct symbols for forward: F, interpreted as the
usual forward command of turtle graphics, and the bold F, a formal symbol
during the Lindenmayer substitutions, yet interpreted as F when taking an
approximating snapshot after a certain number of simultaneous substitutions
(in practice we make recursive Processing programs, and then this number
is the recursion depth n). As before, we use exponentiation notation for
repeated symbols; for example F4 means FFFF.

We define what it means to execute a path in reverse manner, using negative
exponent notation: (c1 · · · ck)^−1 means ck^−1 · · · c1^−1, (−ϕ)^−1 is +ϕ, and (+ϕ)^−1
is −ϕ. Similarly Ld^−1 means Rd, Rd^−1 means Ld, and finally F^−1 is just F.
For each Lindenmayer rule F → c1 · · · ck we tacitly add F^−1 → (c1 · · · ck)^−1
(treating F^−1 as a symbol).
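As an illustration (the code and names are our own), reversal is a table of command inverses plus a reversed traversal:

```python
# Inverses of individual commands: F is its own inverse, R and L swap,
# and + and - (turn right / turn left) swap.
INV = {"+": "-", "-": "+", "R": "L", "L": "R", "F": "F"}

def reverse_path(cmds):
    """(c1 ... ck)^-1 = ck^-1 ... c1^-1"""
    return [INV[c] for c in reversed(cmds)]

print(reverse_path(["-", "F", "R", "F", "+"]))  # ['-', 'F', 'L', 'F', '+']
```

Applying reverse_path twice returns the original list, as an inverse should.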

First we focus on the leftmost zigzag of Figure 5, which is the case N = 1.
When using recursion to make a single pied-de-poule figure so that the
distance between the begin and end points equals L, we need an equation:

FL = − Fs^4 Fs^3 Rs Fs^−4 Fs Ls Fs^4 Rs Fs Fs^−4 Ls Fs^3 +

where s = L√2/8 (the general rule is s = L√2/(8N)). In the Lindenmayer
approach we omit the size parameters and just let the term expand by n-fold
rule application (recursion level n). It is implicitly understood that the
half-circles have a diameter which equals the stepsize of the adjacent F steps.
We give the formulas for N = 1, N = 2 and N = 3 now.

(N = 1) F → −F4 F3 R F−4 F L F4 R F F−4 L F3 +

(N = 2) F → −F8 F R F−8 L F5 F8 R F F−8 F L F8 F R F−8 F3 L F8 R F F−8 L F7 +

(N = 3) F → −F12 F R F−12 L F F12 F7 R F−12 F L F12 R F F−12 F L F12 F R F−12 L F
F12 R F5 F−12 F L F12 R F F−12 L F11 +

Formulas like these are easy to read as zigzags. Each F4 is a "zig", and each
F−4 is a "zag". For the orientation adopted throughout all the drawings, such
as Figure 9, a zig goes up and a zag goes down. Everything else builds the outer
loops that connect the zigs and the zags.

Now we sketch the main tasks at hand when developing a formula for arbitrary
N. First, the main skeleton of the formula is a "−" followed by 4N
copies of F^4N or F^−4N, alternating, followed by one final "+". Between the
zigs and the zags we need extra turtle commands of the non-recursive type,
the details being slightly different for each of the transitions, for example
when temporarily leaving one body part, or when moving between adjacent
body parts. The main statement here is that the generic formula exists. The
details are outside the scope of the present paper (they are tedious, but not
really difficult); they can be found in [13].
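The skeleton claim is easy to check on the N = 1 rule. Below is our own plain-ASCII transcription of that rule (F4 standing for the fourth power of F, F-4 for the power −4); the recursive runs indeed alternate and there are 4N = 4 of them:

```python
# N = 1 rule from Section 4, transcribed token by token (our ASCII notation).
rule_n1 = "- F4 F3 R F-4 F L F4 R F F-4 L F3 +"
tokens = rule_n1.split()

# The zig (F4) and zag (F-4) runs form the skeleton of the formula.
zigzags = [t for t in tokens if t in ("F4", "F-4")]
print(zigzags)  # ['F4', 'F-4', 'F4', 'F-4']
```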

Figure 9: Main geometric parts of the classic pied-de-poule basic tile.

5. Formal Properties

Intuitively we can say that the outer loops are a minor thing, but can we
prove it in a formal sense? We shall present two theorems doing precisely
that. Certain technicalities are outside the scope of this journal, but they can
be inspected on GitHub; see [13]. First we need some preliminaries. We
write PDP as an abbreviation of "classic pied-de-poule". We say that a set
P ⊆ R² is a PDP of type N if it has been constructed according to the
methods mentioned in Section 1 and detailed in [5]. Such a P is the union
of 8N² non-overlapping square regions of width d for some d ∈ R. We call
d the grid size. We say that the size of a PDP P, denoted by size(P), is the
width of the smallest square box which is aligned with the grid and which
encloses P. It equals (5N − 1)d. We write flPDP for "fractal line pied-de-poule
approximation" and we say that a set F ⊆ R² is an flPDP of type
N and recursion level n if it has been constructed according to the methods
explained in Sections 2 and 4.

Let X ⊆ R² be an arbitrary set; then we define the ε-fattening of X, denoted
⌈X⌉_ε, to be a set like X, but with a band of size ε added all around it.
For each PDP P of type N, let Fn(P) be the flPDP which runs through the
diagonals of P and has recursion level n.

Theorem. Let P1, P2, P3, . . . be a sequence of PDPs of type 1, 2, 3, . . . with
fixed size(PN) = s for all integers N > 0. The PN thus have shrinking
grid sizes s/4, s/9, s/14, . . . . Then for each recursion level n ∈ N there
is a sequence of corresponding flPDPs Fn(P1), Fn(P2), Fn(P3), . . . and
a sequence of real numbers ε1, ε2, ε3, . . . such that for all N > 0 we have
Fn(PN) ⊆ ⌈PN⌉_εN and

lim_{N→∞} εN / size(PN) = 0.

Proof. See [13].

The idea of the theorem is presented in Figure 10. The outer loops get
closer and closer to the edge of the tile; in the visual Gestalt of the tile they
become almost invisible and can be neglected.
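To make this concrete: with size(PN) fixed at s, the grid size is s/(5N − 1). Under the additional assumption (ours, consistent with Figure 10) that the outer loops protrude by at most a constant number c of grid cells, the ratio εN/size(PN) is bounded by c/(5N − 1), which vanishes:

```python
# Model assumption (ours): eps_N <= c grid cells, grid size = s/(5N - 1),
# so eps_N / size(P_N) <= c / (5N - 1).
def eps_ratio_bound(N: int, c: float = 2.0) -> float:
    return c / (5 * N - 1)

print(eps_ratio_bound(1), eps_ratio_bound(10), eps_ratio_bound(1000))
```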

Figure 10: Illustration of the first theorem. By taking successive pied-de-poule
types, that is, successive tiles from the pied-de-poule family, the zigzagging
becomes more refined, so that the auxiliary outer loops protrude less and less.

So we can neglect the outer loops for large pied-de-poule type N, but what
about fixed N? The next theorem says that the outer loops are negligible
anyhow, as they have infinitesimal thickness (by which we mean that in the
limit case, the thickness tends to zero). If we make a practical drawing of
an flPDP F, the mathematical line has to become visible. Whenever we use
a pen of a certain stroke width, we are in fact making an ε-fattening ⌈F⌉_ε
where ε is half the pen stroke width. How wide such a pen stroke can be is

naturally limited by the condition that adjacent strokes still have sufficient
white space in between – even inside the smallest recursive figures embedded
in F. Without loss of generality we interpret ‘sufficient white space’ to mean
that the white space is equally wide as the pen strokes themselves.
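A rough numerical illustration of this constraint, under the modelling assumption (ours, not the paper's proof) that the smallest embedded figures, and hence the admissible half stroke width εn, shrink by the factor √2/(8N) of Section 2 at each recursion level:

```python
import math

# Model (ours): eps_n / size(P) proportional to (sqrt(2)/(8N))^n.
def eps_over_size(N: int, n: int) -> float:
    return (math.sqrt(2) / (8 * N)) ** n

print(eps_over_size(3, 1), eps_over_size(3, 5))  # geometric decay toward 0
```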

Theorem. Let F0, F1, F2, . . . be a sequence of flPDPs of increasing recursion
level n = 0, 1, 2, . . . , such that the Fn all run through the diagonals
of a single PDP P of given type N and given size. Let ε0, ε1, ε2, . . . be the
sequence of values in R (half stroke widths) such that for each n ∈ N the
band of points between the adjacent parallel diagonal strokes of the smallest
recursive figures embedded in Fn is equally wide as the diagonal strokes of
⌈Fn⌉_εn themselves, viz. 2εn. Then

lim_{n→∞} εn / size(P) = 0.

Proof. See [13].

The idea of the second theorem is presented in Figure 11.

Figure 11: Illustration of the second theorem. The pied-de-poule type does not
change, but the recursion level increases. Then the stroke width ε becomes
arbitrarily small; in the limit case, it goes to 0.

6. Implementation Details

The following Java function shows how we constructed a left-turning half-
circle in Oogway. Oogway [4] is a turtle graphics library created by Jun Hu,
aimed at creative programming and tessellations. The function is named
LARC meaning left-arc (half-circle).

void LARC(float diam){
  // number of segments: at least 18, more for larger diameters
  int steps = max(18, ceil(sqrt(diam)));
  float phi = 180.0 / steps;                   // total turn is steps * phi = 180 degrees
  float segment = diam * sin(radians(phi/2));  // chord length d' = d * sin(pi/2k)
  for (int i = 0; i < steps; i++){
    LEFT(phi / 2);
    FORWARD(segment);
    LEFT(phi / 2);
  }
}

We can translate the Lindenmayer rules into Oogway commands, which are
mixed with regular Processing (= Java) statements. For example, for N = 3
the rule F → −F12 F R F−12 L F . . . + is rendered in Java as follows:

void FORPIED(float LEN, int budget){
  int N = 3;
  float grid = LEN / 12;
  float step = grid / sqrt(2);
  if (budget == 0)
    FORWARD(LEN);
  else {
    LEFT(45);
    for (int i = 0; i < 12; i++)
      FORPIED(step, budget-1);
    FORWARD(step);
    RARC(step);
    for (int i = 0; i < 12; i++)
      FORDIEP(step, budget-1);
    LARC(step);
    FORWARD(step);
    //etcetera
    RIGHT(45);
  }
}

The command FORWARD is the basic Oogway turtle graphics command. The
function FORPIED codes F and therefore is the pied-de-poule variation of
going forward (PIED being shorthand for pied-de-poule). There is a
similar function FORDIEP which is the reversed version (DIEP being the word
PIED reversed; in Dutch, "ie" is a single vowel). One call of the
function FORDIEP codes F^−1, and then of course

for (int i = 0; i < 12; i++) FORDIEP(step, budget-1);

codes F^−12.
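Counting only the recursive calls (the non-recursive connector commands hidden in the "etcetera" are ignored), one FORPIED call with positive budget spawns 4N runs of 4N recursive calls each, i.e. (4N)² sub-figures. A small sketch of ours counts the leaf-level forward steps:

```python
# Each call draws one plain forward step (budget 0) or recurses into
# (4N)^2 sub-figures: 4N zig/zag runs of 4N calls each (connectors ignored).
def leaf_steps(N: int, budget: int) -> int:
    if budget == 0:
        return 1
    return (4 * N) ** 2 * leaf_steps(N, budget - 1)

print(leaf_steps(3, 1))  # 144
print(leaf_steps(3, 2))  # 20736
```

The count grows geometrically with the recursion budget, which is why a deeply recursive figure becomes a very long line.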

For the implementation of the garments described in [9] (Figure 13), we did
not generate a recursive structure for fixed N, but we made a mixed figure
where the pied-de-poule is of the N = 3 type at the highest recursion level,
N = 2 for the smaller pied-de-poules, and N = 1 at the smallest level. This
is shown in Figure 12.

Also, in Figure 12 it can be seen how we avoided the effect that adjacent figures
touch each other. If we ran the turtle graphics commands obtained
by straightforward coding of the Lindenmayer rules, we would find that adjacent
pied-de-poule figures touch each other. We tweak the turtle path a little bit,
so the effect is hardly visible, and then we can claim that the entire path is a
single line which does not touch or cross itself. This tweaking is implemented
by adding extra statements inside the code of FORPIED and FORDIEP; the
appearance of the total figure is not affected.

Today, a significant change is happening in the world of fashion manufacturing
equipment. This change is one of the reasons why we expect a
new wave of innovation in fashion. We chose a production method which is
consistent with this development. Novel manufacturing methods are data-driven
and depend less and less on manual machine set-up procedures.
Examples include 3D printing, computer-embroidery, laser cutting, Jacquard
weaving and computer-printing. These new manufacturing methods will support
ultra-personalisation and aesthetic innovation. Examples of aesthetic
innovation can be found in [22] and [18]. Jacquard weaving was invented
a long time ago and it is still expensive, but digitisation may lead to a
renewed interest. In our case, we experimented with computer-embroidery
and also with computer-laser engraving; for implementing our fractal pied-de-poule
we have chosen laser engraving. The laser produces extremely thin
carving lines, which make the fractal appear very subtle and beautiful, both
at a short distance and far away. Although our Trotec Speedy300 laser cutter
can move at more than 3 m/s, it takes hours to engrave a large fractal

Figure 12: Mixed figure where the pied-de-poule is of the N = 3 type at the
highest recursion level, N = 2 for the smaller pied-de-poules, and N = 1 at the
lowest level.

(the figure is a very long, densely compressed line). Further optimisation is
possible; for example, Bézier curves for the semi-circles would be better, as
the machine can interpret these faster.

Figure 13 shows one of the attractive fractal line pied-de-poule garments
we created and exhibited in Baltimore at the Bridges Mathematical Art
Exhibition in 2015 [10].

7. Generalisations

Under certain conditions, the idea of zigzagging a figure can be generalised to
other shapes. If the shape is part of a tessellation pattern and if the shape
can be zigzagged, then it can be turned into a recursively tessellated line
fractal. To zigzag a shape, one needs an entry point and an exit point, which
are most conveniently chosen to be network points of the tessellation. If the
shape is convex, then it can always easily be zigzagged; otherwise, additional
tricks are needed. One trick is to adjust the angle, which so far was 45°.

Figure 13: Fractal line pied-de-poule garment by Marina Toeters and Loe Feijs
at the Bridges Mathematical Art Exhibition in Baltimore in 2015 (photo by the
author).

In the worst case one needs other, less pure types of zigzagging, such as re-entrant
loops. Most interesting artistic tessellations, such as Escher's, are
indeed made with non-convex figures. We illustrate this by creating
a recursively tessellated line fractal based on one of Escher's birds, E128
(Figure 14).

Using the taxonomy of tessellations developed by Heesch and Kienzle in
the 1960s, we note that this particular bird configuration has Heesch type
TTTT [16], which means that each tile has four edges, pairwise related by

Figure 14: M.C. Escher's "Symmetry Drawing E128" ©2019 The M.C. Escher
Company-The Netherlands. All rights reserved. www.mcescher.com.

translations (unlike pied-de-poule, which has Heesch type TTTTTT). In each
network point, four edges come together. Escher’s sketch has explicitly in-
dicated network points, and we choose two of them which are diagonally
opposed. The four network points are arranged in a square, but clearly, the
bird extends beyond the square (Figure 15).

The bird is not convex, but it (almost) fits in a rectangular box which
goes through the entry and exit points, which means that the zigzagging can
be done by lines parallel to the edges of the rectangle without missing too

Figure 15: Fitting the bird in a rectangular box. The box is helpful for choosing
the main direction of zigzagging. The black dots are the network points. The
arrows indicate the entry and exit points.

much of the bird. In this case, we work with an angle of 56°. In Figure 16,
the process of interactively choosing a proper zigzag pattern is shown. It
is done with a locally made software tool which enables editing a simple
Lindenmayer language and simultaneously interpreting it, with an arbitrary
bitmap as background (here a rotated E128).

A first version of the zigzagging result is shown for three different recursion
levels in Figure 17 (the three leftmost birds). It has the following Linden-
mayer rule: F → −FR F−1L F2F3F3FR F−8L F9FR F−9L F9R F−4FF−4L
F5F4R F−13L F14F4FR F−16L FF7+. The rightmost bird in Figure 17 shows
an additional feature: it has a re-entrant loop in the tail.

After refinement of the details, we obtained the tessellation of Figure 18,
which has three recursion levels. The additional feature of making re-entrant
loops, abandoning pure zigzagging, was used for the bird's tail and foot. It
gives more creative freedom, but now the line self-intersects. The generated
vector graphics image is extremely detailed, and the challenge to materialize
it is still ahead of us (the line of Figure 18 has more than two million
"forward" steps). From this generalisation, we conclude that:

• It gives us another perspective on the pied-de-poule case. The
zigzagging of the pied-de-poule went smoothly only because of the mathematical
properties and the precise rectilinear outline of the basic pied-de-poule
figure. We could take advantage of the typicalities of the
TTTTTT Heesch type. In fact, the basic pied-de-poule figure turns
Figure 16: Zigzagging Escher's bird using a dedicated interactive tool in Processing.
Left is the interaction screen, which shows both the turtle graphics program
and the result, overlaid on a background bitmap. Right is the tool's code window.

out perfectly fit for 45° zigzagging. The first theorem of Section 5 can
only be formulated for pied-de-poules, not for the birds (they do not
come as a regular family).

• For different Heesch types, different solutions can be found, as demonstrated
by the choice of diagonal entry points. Yet the process is somewhat
ad hoc, and the complementary white space in Figure 18 creates a bird
which is less elegant than Escher's basic figure. This can partly be overcome
by using more zigzag lines and the re-entrant loop feature. Certain
media, such as high-resolution laser systems, are better suited to
materialise the result than others (e.g. embroidery).

8. Concluding Remarks

The combination of turtle graphics and the concept of recursive zigzagging is
a technique which we can apply to any figure, not just pied-de-poule. But
because of the special symmetries embedded in pied-de-poule, it became a
fascinating exercise to prepare the program and to analyze the properties.
The mathematics of Sections 3–5 is essential to support the precise formulation
of the properties of the fractal line pied-de-poule. The negative exponent
notation for reversed turtle movements is, to the best of our knowledge, new.

Figure 17: The bird with successive recursion levels, adopting a pure zigzagging
approach. The rightmost bird shows an additional feature: it has a re-entrant loop
in the tail. This solves the problem that the bird would not fit in the rectangle.

We consider it worthwhile to focus serious attention on the pied-de-poule
pattern, which is a great asset of European culture (and, mediated by fashion,
now of global culture). In this paper, we used the somewhat technical
Lindenmayer rules so that we could see in sufficient detail what is so special
about the zigzagged pied-de-poule.

The work was somewhat technical, but that is an essential part of the envisioned
fashion future. As limitations of (mass) production machines tend
to disappear, a new creative space opens up. In this new creative space,
computational rules, math, new technologies and art come together. These
allow for more personalization and sophistication, but are demanding in terms
of digital skills and algorithmic technicalities. We would like to finish with
a quote from Karl Lagerfeld: "Fashion is about two things: continuity and
the opposite. That's why you have to keep moving." This project contributes
to moving towards radically new fractal decorative patterns, with continuity
coming from the ancient, almost archetypal pied-de-poule.

Related work: The work is related to ethnomodeling (ethnomathematics and
ethnocomputing) as described in, for example, [1] and [2]. In ethnomodeling,
as in this paper, cultural artefacts are dissected using mathematics and then
applied creatively in new ways, using computational tools. In the Bridges
community, this is a recurring theme; see, for example, Gerdes' descriptions of
African basketry [15]. One of the goals described by Babbitt [1] is to educate
and empower young people from under-represented ethnic groups, deploying
the mathematics in cultural artefacts. Although I sympathize with the idea,

Figure 18: Tessellation obtained by recursively zigzagging the bird with three
levels of recursion. It is simultaneously a fractal structure and a tessellation
structure. It consists of a single continuous zigzag line. Left is the entire bird;
right we zoom in on the four sub-birds at the highest point of the tail.

in my own work, the ethnic aspect has played a lesser role. Besides pied-de-poule,
I worked with cultural themes related to The Netherlands: (Escher-style)
tessellations [3, 4] and (Mondrian-style) non-figurative art [7, 12]. Pied-de-poule
seems mostly a Western-world theme (Section 2). Tessellations and
fractals are often used to raise awareness of and pleasure in mathematics for
children (see for example www.mathartfun.com). In my teaching, together
with colleagues Christoph Bartneck, Jun Hu, and Mathias Funk, we tried to
bring mathematical principles to the attention of our (university-level) design
students. Many design students like to make things, which has determined
the pedagogical approach of our course Golden Ratio at TU/e for the past
ten years [3, 4].

Acknowledgements: I would like to thank the M.C. Escher Company
for their kind permission to use M.C. Escher's "Symmetry Drawing E128"
©2019, The M.C. Escher Company-The Netherlands (Figure 14). All rights
reserved, www.mcescher.com. I thank Jun Hu, Melanie Swallow and Marina
Toeters for their support and cooperation. I am grateful to the anonymous
reviewers of the Journal of Humanistic Mathematics for their helpful suggestions
and comments.

References

[1] B. Babbitt, D. Lyles, and R. Eglash. From Ethnomathematics to Ethno-
computing: indigenous algorithms in traditional context and contempo-
rary simulation. In: S. Mukhopadhyay and W.M. Roth (Eds.) Alterna-
tive Forms of Knowing (in) Mathematics. Rotterdam: Sense Publishers
(2012), pages 205–220.

[2] R. Eglash. African Fractals: Modern Computing and Indigenous Design.
New Brunswick: Rutgers University Press (1999).

[3] L.M.G. Feijs, and C. Bartneck, (2009) Teaching Geometrical Principles
to Design Students. Digital Culture & Education, Volume 1, Number 2,
pages 104–115.

[4] L.M.G. Feijs and J. Hu, (2013). Turtles for tessellations. In:
G.W. Hart and R. Sarhangi (Eds.) Proceedings of Bridges 2013,
pages 241–248. Tessellation Publishing, Phoenix AZ. (available at
http://archive.bridgesmathart.org/2013/bridges2013-241.html,
last accessed on January 27, 2020).

[5] L.M.G. Feijs. Geometry and Computation of Hound-
stooth (Pied-de-poule), In: R. Bosch, D. McKenna and
R. Sarhangi(Eds.), Proceedings of Bridges 2012, pages 299–
306. Tessellation Publishing, Phoenix AZ. (available at
http://archive.bridgesmathart.org/2012/bridges2012-299.html,
last accessed on January 27, 2020).

[6] L.M.G. Feijs and M.J. Toeters. Constructing and Applying
the Fractal Pied de Poule (Houndstooth). In: G.W. Hart
and R. Sarhangi (Eds.) Proceedings of Bridges 2013, pages
429–432. Tessellation Publishing, Phoenix AZ. (available at
http://archive.bridgesmathart.org/2013/bridges2013-429.html,
last accessed on January 27, 2020).

[7] L.M.G. Feijs. Divisions of the Plane by Computer: Another Way of
Looking at Mondrian’s Nonfigurative Compositions. Leonardo, Volume
37, Number. 3, June 2004, pages 217–222, MIT Press.

[8] L.M.G. Feijs, M.J. Toeters, J. Hu, and J. Liu. (2014). Design
of a nature-like fractal celebrating warp-knitting. In: G. Green-
field, G.W. Hart and R. Sarhangi (Eds). Proceedings of Bridges
2014. Tessellation Publishing, Phoenix AZ. pages 369–372. (available at
http://archive.bridgesmathart.org/2014/bridges2014-369.html,
last accessed on January 27, 2020).

[9] L.M.G. Feijs and M.J. Toeters. A Novel Line Fractal pied-de-poule
(Houndstooth). In: K. Delp, C.S. Kaplan, D. McKenna and R. Sarhangi
(Eds.) Proceedings of Bridges 2015, pages 223–230. (available at
http://archive.bridgesmathart.org/2015/bridges2015-223.html,
last accessed on January 27, 2020).

[10] M.J. Toeters and L.M.G. Feijs. Fractal Pied de Poule (hound-
stooth) Spring/Summer ’15. In: R. Fathauer and N. Selikhoff
(Eds.), Bridges Baltimore Art Exhibition Catalog, Tessellation
Publishing, Phoenix AZ (2013). ISBN 978-1-938664-16-8 (avail-
able at http://gallery.bridgesmathart.org/exhibitions/
2015-bridges-conference/feijs, last accessed on January 27,
2020).

[11] L.M.G. Feijs and M.J. Toeters. Pied-de-pulse: sphere packing and pied-
de-poule (houndstooth). In: E. Torrence, B. Torrence, C. Sequin, D.
McKenna, K. Fenyvesi, R. Sarhangi (Eds.) Proceedings of Bridges 2016,
Tessellation Publishing, Phoenix AZ. pages 415–418. (available at http:
//archive.bridgesmathart.org/2016/bridges2016-415.html, last
accessed on January 27, 2020).

[12] L.M.G. Feijs (2019) A program for Victory Boogie Woo-
gie, Journal of Mathematics and the Arts, Volume 13, DOI:
10.1080/17513472.2018.1555687

[13] L.M.G. Feijs. Formal Aspects of Designing Fractal Line Pied-de-poules:
The Formula, Its Properties, and The Proof of Its Properties. Online:
github.com/LoeFeijs/FractalLinePiedDePouleFormalisationDetails/
(version /master/FORMALISATION 08.pdf, retrieved November 4, 2019).

[14] K.M. Frei (2009). News on the geographical origin of the Gerum
cloak’s raw material. Fornvännen Journal Of Swedish Antiquarian
Research, Volume 104, Number 4, pages 313–315.

[15] P. Gerdes. African Basketry: Interweaving Art and Mathematics in
Mozambique. In: R. Sarhangi and C.H. Séquin (Eds.) Proceedings of
Bridges 2011. Tessellation Publishing, Phoenix AZ. pages 9–16.

[16] H. Heesch and O. Kienzle (1963). Flächenschluss; System der Formen
lückenlos aneinanderschliessender Flachteile. Berlin: Springer.

[17] Koch, H., Une méthode géométrique élémentaire pour l’étude de
certaines questions de la théorie des courbes plane, Acta Mathematica,
Volume 30, Number 1, (1906) pages 145–174.

[18] D. McCallum. Glitching the Fabric: Strategies of new media art applied
to the codes of knitting and weaving. PhD thesis, University of Gothenburg
(2018).

[19] P.J.R. Modderman. Die Bandkeramische Siedlung von Sittard. In:
Palaeohistoria, Volume VI–VII, Groningen, 1958/1959, pages 33–120.

[20] P. Prusinkiewicz, A. Lindenmayer. The algorithmic beauty of plants,
Springer Verlag (1990).

[21] Stichting Erfgoed Stein. Bandkeramiek. http:
//www.stichtingerfgoedstein.nl/archeologie/44-bandkeramiek,
last accessed on January 27, 2020.

[22] L. Tenthof van Noorden, L. Feijs, M. Toeters, J. Hu and J. Liu. This
Fits Me and Warp Knit Fractal. In: R. Fathauer and N. Selikhoff
(Eds.), Bridges Seoul Art Exhibition Catalog, Tessellation Publishing,
Phoenix AZ (2013). (available at http://gallery.bridgesmathart.
org/exhibitions/2014-bridges-conference/feijs, last accessed
on January 27, 2020).

