STI: PopStats - Methodology
Your Data Building Blocks for Accurately Researching Populations in Today's Markets
In 2001, STI: PopStats changed the market research industry's conventional view
of population estimates by launching with several industry firsts including:
- Leveraging a new source of population data: the U.S. Postal Service's ZIP+4® records
- Creating an innovative "bottom-up" methodology
- Updating estimates quarterly versus annually
- Adding new data variables in response to clients' specific requests
- Providing a "building blocks" philosophy that unleashes unlimited research potential
- Extending a responsive customer service approach for questions and enhancements
With the launch of PopStats, for the first time, companies viewed population
estimates as a highly accurate, dependable, and essential component of market
research, instead of data that added only marginal value to their research. The
data has become even more valuable with the addition of over 1,200 population
and demographic variables, including neighborhood segmentation and workplace
estimates.
No matter what research goals are driving a company forward, from locating new
high-growth areas before competitors, to consolidating store networks, to
accelerating growth plans, PopStats brings immense value to critical business
decisions. What's more, companies also gain confidence in their research,
knowing they have the most accurate and current insight, including market
growth and decline, demographic variations, seasonal population fluctuations,
mortgage risk differences, income changes, and much more.
Seven years after PopStats' introduction, a watershed moment occurred at the
annual ICSC (International Council of Shopping Centers) Conference in August
2008. First, an independent study by ICSC found that Synergos Technologies Inc.
(STI) is one of two data providers most preferred by retailers. Also, at an
ICSC best practices session, three out of four retail panelists cited STI as
their demographic data provider of choice.
Today, market researchers in a wide range of industries, including retail,
healthcare, real estate development, telecommunications, and economic
development, rely on PopStats to gain in-depth and dependable insight on where
people live and work across the U.S. With PopStats fueling their data engines,
companies are enjoying the confidence to make more informed and profitable
business decisions regarding markets, locations, and consumers.
PopStats' Innovation Overcomes Traditional Population Data Challenges
PopStats' unique approach to population estimates includes four primary
innovations over traditional population data methodologies.
- ZIP + 4 and 2000 Census vs. Census-Only Source Data. STI was the first data
provider to realize the value of ZIP + 4 postal data and to envision a way to
leverage this household-level source data. ZIP + 4 targets areas as small as a
specific group of houses (typically four to 12) or a building. It is possible
to literally see structures come online as they are built and occupied. The
ZIP + 4 level leads to more accurate population estimates for five reasons: (1) it
is extremely detailed, (2) it contains over 28 million records, (3) it includes
all major population centers, (4) it can be manipulated statistically, and (5)
it is easily consolidated into any geography.
- Bottom-Up vs. Top-Down Methodology. The decennial U.S. Census is based on
a traditional top-down construction, which takes macro-level data and
extrapolates it down to a micro-level: from U.S., to state, to county, to
tract, to block-group. The Census Bureau's national-to-local direction was
copied by demographers to generate population estimates at the block-group
level. However, there are significant problems with this direction. Foremost,
macro-level data is unsuitable for use at a micro-level, like block groups,
which are greatly influenced by singular local events, such as a new apartment
complex or a building demolition. To compensate, demographers developed
population-spreading techniques that broad-stroke areas of growth and decline at
the sub-county level. This is an improvement, but still retains limitations.
Namely, it can mask block-group level growth or decline. PopStats delivers a
more accurate population count on a micro level by starting at the ZIP + 4
level, then moving up ("bottom-up") to the block group, tract, county, then
state levels.
- Quarterly vs. Annual Updates. Traditional population data is updated only
once every 12 months (typically in May or June). As a result, the data is
chronically out-of-date. This puts researchers at a serious disadvantage. STI
created the industry's first population data to be updated on a quarterly basis:
every January, April, July, and October. Today we provide the industry's
leading quarterly updated population data.
- Expanding vs. Static Variables. Unlike many population data providers, STI
continually expands into new data territory. PopStats launched in 2001 with 21
variables and by 2009 had over 1,200. New variables include mortgage risk, home
values, employment, five- and ten-year forecasts, and much more. New variables
are based on a combination of STI innovation and client data requests. For
example, a leading grocery store chain requested seasonal data, a QSR (quick
serve restaurant) client requested transient (i.e., hotel, motel, and RV park)
population counts, and a drug store chain requested Puerto Rico population
counts. Every new data variable is available to all PopStats clients.
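The bottom-up direction described in the list above (ZIP + 4 to block group, tract, county, then state) can be illustrated with a short sketch. This is not STI's proprietary model. The record layout, the sample counts, and the assumption that each ZIP + 4 record is already tagged with a 12-digit Census block-group code (2-digit state, 3-digit county, 6-digit tract, 1-digit block group) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical ZIP + 4 household estimates, each tagged with the Census
# block-group code that contains it. All values are illustrative.
zip4_households = [
    {"zip4": "78701-0001", "block_group": "484530011001", "households": 8},
    {"zip4": "78701-0002", "block_group": "484530011001", "households": 12},
    {"zip4": "78701-0003", "block_group": "484530011002", "households": 5},
]

def roll_up(records, key_len):
    """Sum households by the first key_len characters of the block-group code."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["block_group"][:key_len]] += rec["households"]
    return dict(totals)

# Each higher geography is a prefix of the block-group code, so the same
# ZIP + 4 records consolidate upward without any top-down allocation.
block_groups = roll_up(zip4_households, 12)
tracts = roll_up(zip4_households, 11)
counties = roll_up(zip4_households, 5)
states = roll_up(zip4_households, 2)
```

Because every total is built from the same ZIP + 4 records, a local change such as a new apartment complex shows up in its own block group first and then carries through to each larger geography.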
PopStats' Revolutionary Population Estimating Methodology
The PopStats model is a collection of models that calculate the quarterly
population estimates. The methodology consists of the following three steps.
STEP 1 Estimate Households. STI's research has shown that a unique and
quantifiable relationship exists between USPS (United States Postal Service)
data and U.S. Census Bureau household counts. Due to this relationship, STI
can model population shifts quickly and accurately using a proprietary technique
leveraging the correlation between the two. The process is initiated by
base-lining the ZIP + 4 data and its associated statistics as they existed in
April 2000. Then, as new ZIP + 4 data arrives (new data and statistics are
delivered monthly), we model and derive a growth factor for every ZIP + 4 in
the country. Our proprietary model applies this growth factor, along with
other pertinent factors, to generate a current estimate.
To limit bias from errors in the raw data, the PopStats methodology includes
automated processes for handling anomalies, including ZIP + 4 inaccuracies,
data smoothing issues,
conversions (lofts), and overrides.
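As a rough illustration of the growth-factor idea in Step 1, the sketch below uses a plain ratio of current to baseline delivery counts. The actual PopStats model is proprietary and uses additional factors, so the function names, the simple ratio, and the cap used to damp anomalies are all assumptions made for this example.

```python
def growth_factor(baseline_deliveries, current_deliveries, cap=5.0):
    """Ratio of current to April 2000 delivery counts for one ZIP + 4.

    The cap is an arbitrary illustrative guard against raw-data errors;
    it is not a documented PopStats parameter.
    """
    if baseline_deliveries <= 0:
        # No baseline to compare against. A real model would need a
        # separate rule for ZIP + 4s created after the baseline date.
        return 1.0
    return min(current_deliveries / baseline_deliveries, cap)

def estimate_households(census_households, baseline_deliveries, current_deliveries):
    """Scale the Census 2000 household count by the ZIP + 4 growth factor."""
    factor = growth_factor(baseline_deliveries, current_deliveries)
    return census_households * factor

# A ZIP + 4 that grew from 10 to 12 deliveries since the baseline.
estimate = estimate_households(
    census_households=10, baseline_deliveries=10, current_deliveries=12
)
```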
STEP 2 Estimate Household Populations. A variety of U.S. Census Bureau and
private studies have shown that the relationship of persons-to-households
remains relatively stable over time. STI takes the Census 2000
persons-per-household figure for each block group and adjusts it to reflect
any change in the county-level persons-per-household estimates generated by
the U.S. Census Bureau. These new figures are then applied to the estimated
households to derive an estimated household population.
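One plausible reading of Step 2 is sketched below: the block group's Census 2000 persons-per-household ratio is scaled by the change in the county ratio, then multiplied by the household estimate. The proportional adjustment, the function names, and the sample values are assumptions; the text does not specify the exact form of the adjustment.

```python
def adjusted_pph(bg_pph_2000, county_pph_2000, county_pph_current):
    """Scale the block group's Census 2000 ratio by the county-level change."""
    return bg_pph_2000 * (county_pph_current / county_pph_2000)

def household_population(households, pph):
    """Estimated household population for one block group."""
    return households * pph

# Illustrative values: the county ratio fell by 5 percent since 2000,
# so the block-group ratio is reduced by the same proportion.
pph = adjusted_pph(bg_pph_2000=2.50, county_pph_2000=2.60, county_pph_current=2.47)
population = household_population(households=120, pph=pph)
```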
STEP 3 Apply Controls. To further ensure accuracy and limit bias in our
estimates, STI uses a series of checks-and-balances to validate the results.
One of these checks is to compare our estimates to the U.S. Census Bureau's
annual population estimates, released every spring. If any major discrepancies
occur between the two numbers, the model applies a set of heuristics to
determine the most probable population figure. We also consult with multiple
state and federal agencies whose data is independently gathered and calculated.
In addition, selected cities throughout the U.S. are field-surveyed to further
validate our model's results.
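The comparison in Step 3 can be pictured as a screening pass that flags counties where the model and the Census Bureau estimate disagree by more than some tolerance. The sketch below covers only that flagging step. The 5 percent tolerance, the county codes, and the populations are hypothetical, and the heuristics STI applies to flagged areas are not described in the text, so none are shown.

```python
def flag_discrepancies(model_estimates, census_estimates, tolerance=0.05):
    """Return counties whose relative difference exceeds the tolerance."""
    flagged = {}
    for county, model_pop in model_estimates.items():
        census_pop = census_estimates.get(county)
        if not census_pop:
            continue  # no Census figure to compare against
        rel_diff = (model_pop - census_pop) / census_pop
        if abs(rel_diff) > tolerance:
            flagged[county] = rel_diff
    return flagged

# Hypothetical county populations keyed by county code.
model = {"48453": 1_020_000, "48491": 400_000}
census = {"48453": 1_000_000, "48491": 350_000}
flagged = flag_discrepancies(model, census)
```

In this example only the second county is flagged, since its model estimate is roughly 14 percent above the Census figure.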
Methodology for Key PopStats Data Variable "Break Outs"
Once the base population has been estimated, the PopStats model "breaks out"
several demographic estimates, such as age and sex, race and ethnicity, group
quarters, incomes, and housing values. Many more data variables are available
in the ever-expanding PopStats data product.
Age and Gender. Age and gender are determined through a traditional cohort
survival analysis. This sub-model to the main model looks at each age
distribution within a race category and applies the appropriate birth and
survival rates as determined by the NCHS (National Center for Health
Statistics). These results are then balanced back to the base population using
an iterative approach. In addition, information from the NCES (National Center
for Education Statistics) is applied to validate the age distribution of
school-age children. U.S. Census estimates are used to validate all other age
ranges.
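A minimal sketch of cohort survival followed by balancing is shown below, assuming equal-width age groups advanced by one group per step. For brevity the number of births is passed in directly rather than derived from NCHS birth rates, and a single proportional scaling stands in for the iterative balancing mentioned above. All rates and counts are made up for illustration.

```python
def age_forward(pop_by_group, survival, births):
    """Advance each age group one step; the last group is open-ended."""
    aged = [births]  # the youngest group is filled by births
    for i in range(len(pop_by_group) - 1):
        aged.append(pop_by_group[i] * survival[i])
    # Survivors of the open-ended last group remain in it.
    aged[-1] += pop_by_group[-1] * survival[-1]
    return aged

def balance_to_total(pop_by_group, control_total):
    """Scale all age groups so they sum to the base population."""
    scale = control_total / sum(pop_by_group)
    return [p * scale for p in pop_by_group]

# Three illustrative age groups with hypothetical survival rates.
aged = age_forward([100, 120, 80], survival=[0.99, 0.98, 0.90], births=90)
balanced = balance_to_total(aged, control_total=400)
```

The balancing step keeps the age distribution produced by the cohort model while forcing the total to agree with the base population estimate.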
Race (Ethnicity). Race is calculated using a ratio analysis of April 2000
observed counts and annual U.S. Census estimates. In areas of high growth we
use race information gathered by the FFIEC (Federal Financial Institutions
Examination Council), which collects information from financial institutions
concerning loans and race. It is a reasonable source for understanding race
percentages in high-growth areas. As a final check for race, our model also
compares our figures against NCES race data for elementary school children.
Income Estimates. Income estimates are based on a two-step process. First,
household incomes at the county level are estimated using a blend of information
from the IRS's Survey of Income, the U.S. Census Bureau's ACS (American
Community Survey) income estimates, and personal income estimates from the BEA
(Bureau of Economic Analysis). Once the county estimate is derived, we estimate
income at the block-group level. This is done in two parts. First, we separate
existing households from new-growth households, because our research has found
that in high-growth areas existing households are not a good indicator of the
income of new households entering the area. For existing households, we use a
typical income growth approach that resembles the growth of county income. Then
we add to that a separate income growth for new households modeled on the
FFIEC's mortgage data transactions.
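The two-step income process can be sketched as follows. This example assumes the three county sources have already been converted to a comparable mean household income, uses equal blend weights, and combines existing and new households with a household-weighted mean. None of these choices are documented in the text; they are placeholders for illustration.

```python
def blend_county_income(irs, acs, bea, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted blend of three county income estimates (weights hypothetical)."""
    w_irs, w_acs, w_bea = weights
    return w_irs * irs + w_acs * acs + w_bea * bea

def block_group_income(base_income, county_growth, existing_hh, new_hh, new_hh_income):
    """Mean income across existing and new-growth households."""
    total_hh = existing_hh + new_hh
    if total_hh == 0:
        return 0.0
    # Existing households follow county income growth.
    existing_income = base_income * county_growth
    # New households carry their own income level, for example one
    # modeled from mortgage data.
    return (existing_income * existing_hh + new_hh_income * new_hh) / total_hh

county_income = blend_county_income(irs=58_000, acs=60_000, bea=62_000)
bg_income = block_group_income(
    base_income=50_000, county_growth=1.2,
    existing_hh=80, new_hh=20, new_hh_income=90_000,
)
```

Separating the two groups matters most in high-growth block groups, where the new households can shift the average well away from what county growth alone would suggest.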
Group Quarters. Group quarters are a collection of unrelated people where no
one individual can claim "head of household," such as college students and
military personnel. Generally speaking, group quarters data can be divided into
three categories: colleges, military bases, and institutions (i.e., state
homes, hospitals, and prisons). We estimate each category individually, then
combine them for a total estimate. College student dormitory information is
derived from the NCES annual college survey. Military group quarters are
determined based on a direct data feed received from the DOD (Department of
Defense) Manpower Data Center. Institutionalized persons are estimated using
historical trends from the U.S. Census.
Housing Values. Housing values are determined in a fashion similar to income
estimates. Housing and associated values that existed as of April 2000 are
updated using data from the OFHEO (Office of Federal Housing Enterprise
Oversight). Our model performs a detailed analysis of same-home selling prices
over time. We apply the resulting growth factors to
existing April 2000 owner-occupied homes. New home values (homes built after
April 2000) are determined by ratio analysis of the FFIEC's mortgage values and
actual selling prices.
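The same-home price analysis can be illustrated with a simplified growth factor: the geometric mean of later-to-earlier price ratios for homes that sold more than once. This is a deliberate simplification of a repeat-sales index, since it ignores how much time passed between the two sales. The sale pairs and the April 2000 value are hypothetical.

```python
import math

def repeat_sales_growth(pairs):
    """Geometric mean of later/earlier price ratios for same-home sales."""
    log_ratios = [math.log(later / earlier) for earlier, later in pairs]
    return math.exp(sum(log_ratios) / len(log_ratios))

# Each pair is (earlier sale price, later sale price) for the same home.
pairs = [(100_000, 121_000), (200_000, 242_000)]
growth = repeat_sales_growth(pairs)

# Apply the growth factor to a home valued at 150,000 in April 2000.
updated_value = 150_000 * growth
```

Using only repeat sales of the same home keeps changes in the mix of homes sold from distorting the growth factor.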
Data Sources for STI: PopStats:
United States Census Bureau
United States Postal Service (USPS)
United States Department of Defense (DMDC)
National Center for Education Statistics (NCES)
National Center for Health Statistics (NCHS)
Federal Financial Institutions Examination Council (FFIEC)
Internal Revenue Service (IRS)
Bureau of Economic Analysis (BEA)
Bureau of Labor Statistics (BLS)
Office of Federal Housing Enterprise Oversight (OFHEO)
Department of Defense (DOD)