Data Enrichment for CRM

Pavel Brusilovsky

Background and Objectives

  • The solution to any CRM problem relies on a customer database.
  • Data enrichment is the process of supplementing an internal customer database with information from diverse external sources in order to enhance CRM solutions.
  • Data and attributes that are typically used for enrichment:
    • Personal, Geographical, Postal, Demographic, Psychographic, Socio-economic data, World event information
  • The objective of the presentation is to discuss:
    • the essence of data enrichment
    • different external sources of information for data enrichment
    • types of supplemental data that can be treated as a valuable addition to internal data for CRM
    • enhancement of diverse CRM solutions due to data enrichment

Brief characteristics of supplemental data

  • There are hundreds of different sources of data that can be used for data enrichment to enhance the solution of your CRM problems
  • Supplemental data for data enrichment can be:
    • economy-wide and industry/sector specific
    • customer/brand/product specific
    • geography/territory specific
      • Nation, states, counties, ZIP Codes, Census tracts/block groups, etc.
    • time specific, etc.
  • Some supplemental data are free, but others are very expensive
  • It is good to know both demographics and psychographics of your customers in order to advertise and sell your product effectively.
  • In order to find a solution for other CRM problems, some additional information might be useful:
    • Personal information
    • Household level data
    • Neighborhood/Community level data
    • ZIP level data, etc.
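
The enrichment step itself can be illustrated with a minimal sketch, assuming the internal customer records carry a ZIP code and the supplemental data is keyed by ZIP code. All names and values below are hypothetical and serve only to show the join; a real project would use purchased or public data files.

```python
# Hypothetical internal customer records (illustrative values only).
customers = [
    {"customer_id": 1, "zip": "19103", "annual_spend": 1200.0},
    {"customer_id": 2, "zip": "08540", "annual_spend": 450.0},
    {"customer_id": 3, "zip": "99999", "annual_spend": 300.0},  # no external match
]

# Hypothetical external ZIP-level attributes (made-up numbers).
zip_attributes = {
    "19103": {"median_household_income": 70000, "urbanization": "urban"},
    "08540": {"median_household_income": 120000, "urbanization": "suburban"},
}


def enrich(customer_rows, attributes_by_zip):
    """Left join: keep every customer, add external attributes when the ZIP matches."""
    enriched = []
    for row in customer_rows:
        extra = attributes_by_zip.get(row["zip"], {})
        # The 'enriched' flag records whether any supplemental data was found.
        enriched.append({**row, **extra, "enriched": bool(extra)})
    return enriched


enriched_customers = enrich(customers, zip_attributes)
for row in enriched_customers:
    print(row)
```

Keeping unmatched customers (a left join) rather than dropping them makes it easy to see how much of the customer base the external source actually covers.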


  • Demographics are the average or typical characteristics of the people who buy your products or services.
  • Demographics include age, income, education, marital status, type of occupation, region of country, household size, etc.
  • Demographics can also include the age of children, the status of home ownership, median home value, and whether one's home is located in an urban or a rural location.
  • For some CRM problems it is useful to know the population's makeup in terms of age, gender, income level, occupation, education, and family circumstances: married with children, single, or retired.


  • Psychographics is equivalent to AIO variables (Activities, Interests, and Opinions)
  • Psychographics describe personality, values, attitudes, interests, and lifestyles
  • Psychographics can be treated as a proxy for the concept of "culture"

  • Examples of psychographic questions:
    • What do customers like about your product?
    • What do customers like about your competitor's product?
    • What made them decide to buy your product?
    • Did they know which brand they were buying before they purchased it?
    • What advertising messages had they seen prior to buying?
    • How much disposable or discretionary income is available for this type of purchase?

Data Enrichment Sources

  • This information/data, combined with your internal database, can:
    • Provide valuable insight into your brand health
    • Help you make better advertisement decisions
    • Help with the development of more effective marketing strategy, etc.

Survey of American Consumers: Mediamark Research & Intelligence (MRI)

  • Mediamark Research has a single goal: to provide the sharpest picture possible of American consumers (who they are, what they buy, what they think) and how to reach them
  • MRI's annual Survey of the American Consumer collects information on media choices, product usage, demographics, lifestyle and attitudes of adult consumers:
    • Measurement of the usage of nearly 6,000 product and service brands across 550 categories
    • Readership of hundreds of magazines and newspapers, Internet usage, TV viewership at the program level
    • National and local radio listening
    • Yellow Pages usage and Out-of-Home exposure

  • The Survey's multi-dimensional database provides marketers with a high resolution view of all major media audiences
  • It is the most comprehensive and reliable source of multi-media audience data available
  • It is the primary source of audience data for the U.S. consumer magazine industry

  • The MRI survey is based on face-to-face interviews of 26,000 consumers in their homes.
  • It's not the easiest way to conduct research, but according to MRI, the pay-off in reliability, credibility and completeness of results is worth the added cost and effort.

Usage of MRI's Survey of the American Consumer Data

The goal of the Survey of the American Consumer is to provide a high resolution single-source view of the entire consumer marketplace. MRI's Survey database can be effectively used for diverse analytical, planning and reporting functions, such as:
  • Pinpointing target markets
  • Identifying new buying trends
  • Developing and rolling out new products
  • Repositioning existing brands and products
  • More confidently building innovative marketing plans
  • Gaining insight into consumer motivations
  • Learning more about users of competitive products
  • Gauging competitive brand loyalty
  • Analyzing market demand across segments
  • Determining market potential of niche targets
  • Analyzing brand volume data to locate profitable consumer segments
  • Increasing media buy efficiencies, etc.

Experian Simmons National Consumer Studies


  • The Experian Simmons National Consumer Studies provide year-round single-source measurement of major media (English-language and Spanish-language), products/brands, services, and in-depth demographic, lifestyle and psychographic characteristics.
  • Consumer Behavior Studies collect diverse up-to-date information on American consumers, in particular what magazines they read, what television programs they watch, what products they buy, and even how they feel about certain issues
  • The study uses a two-phase data collection approach, with Phase 1 consisting of a telephone placement interview to determine the household survey participation eligibility and Phase 2 involving the mailing of self-administered survey booklets to eligible household members
  • Consumer Behavior Studies report:
    • over 60,000 variables in over 8,000 product categories and 450 brands from over 25,000 annually surveyed American consumers
    • major media usage behavior
    • consumer buying behavior, consumer demographics and psychographics
  • Consumer Behavior Studies are projectable to the national U.S. adult population with rigorous quality control

Usage of Experian Simmons National Consumer Studies Data

  • The goal of National Consumer Studies is to provide high quality research and single-source measurement of the brand preferences, lifestyle, attitudes and media usage behaviors of the American consumer.
  • National Consumer Studies data can be effectively used to:
    • profile the usage and preference of thousands of brands and services for your geography of interest (available levels of geography: Total U.S., State, MSA, County, ZIP Code, and Census Tract)
    • provide client insights into Americans' use of new media, such as mobile phones, social networking, instant messaging, blogging, gaming, social tagging/bookmarking, online video/audio and dozens of other new and emerging media channels alongside traditional media like TV, magazines, radio, etc.
    • quickly and easily create successful marketing campaigns
    • guide strategic planning, measure performance and drive new business development on the national and local level
    • identify the most effective advertising medium
    • compare the consumer behavior of Hispanics/Latinos to the overall population
    • determine the target audience that will be most likely to make a purchase, and much more

Community Tapestry Segmentation from ESRI


ESRI Community Data provides an accurate and detailed description of America's neighborhoods:
  • This information is crucial for the solution of many CRM problems.
  • Encompasses a variety of datasets that help companies and organizations analyze markets, profile customers, evaluate competitors, and more.
  • Classification of neighborhoods is based on their socioeconomic and demographic composition:
    • 65 segments
    • 12 Life Mode summary groups
    • 11 Urbanization summary groups

Community Data

  • Community Data includes
    • Community Tapestry market segmentation data
    • Current-year estimates and five-year projections of demographic data
    • Consumer expenditure, market potential, shopping center, business, traffic, and census data

  • Use Community Data to
    • Analyze and describe your community or trade area
    • Identify new sites
    • Evaluate competitor sites
    • Profile customers and constituents
    • Forecast demand for products and services, etc.

Description of summary groups



Description of Income Range of Life Mode


Example: Last 5 Neighborhood Segments


Example: First 5 Neighborhood Segments


Market Potential Data

  • Measures the likely demand for a product or service in a county, ZIP Code, or any other defined trade area
  • Businesses and other organizations use Market Potential data to make decisions about where to offer products and services.
  • The database projects the expected number of consumers and provides a Market Potential Index (MPI).
  • An MPI compares the demand for a specific product or service for a trade area to the U.S. national demand for that product or service. The index is tabulated to represent a value of 100 as the average demand. A value of more than 100 represents higher demand, and a value of less than 100 represents lower demand.
  • With Market Potential data you can solve the following CRM problems:
    • Optimize your merchandise mix
    • Invest marketing dollars more effectively
    • Develop successful advertising and marketing plans
    • Decide which expansions are most profitable
    • Understand, predict, and influence consumer behavior by providing insight into areas with the highest growth potential
    • Make informed decisions about products and services based on the latest trends and consumer demand.

Market Potential Data Structure

Market Potential Reports provide you with information about the number of adults or households expected to consume a particular product or service.

Reports include a Market Potential Index (MPI) that measures the relative likelihood of adults or households in a specific area to exhibit certain consumer behavior compared to the U.S. average.
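
A minimal sketch of how an index of this kind could be computed, assuming the MPI is the ratio of the local consumption rate to the national consumption rate, scaled so that 100 represents the U.S. average. The function name, its arguments, and the figures are hypothetical; the vendor's actual methodology may differ.

```python
def market_potential_index(local_consumers, local_base,
                           national_consumers, national_base):
    """Return an index where 100 means local demand equals the national average.

    local_consumers / local_base       -> share of adults (or households) in the
                                          trade area expected to use the product
    national_consumers / national_base -> the same share for the whole U.S.
    """
    if local_base <= 0 or national_base <= 0 or national_consumers <= 0:
        raise ValueError("bases and national consumer count must be positive")
    local_rate = local_consumers / local_base
    national_rate = national_consumers / national_base
    return 100.0 * local_rate / national_rate


# Made-up example: 15% of local adults vs. 10% of U.S. adults use the product,
# so the index comes out at roughly 150 (higher than average demand).
mpi = market_potential_index(1_500, 10_000, 20_000_000, 200_000_000)
print(round(mpi, 1))
```

Read this way, an index of roughly 150 would mean that adults in the trade area are about one and a half times as likely as the average U.S. adult to use the product.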

Health and Beauty Market Potential

  • Identifies market demand for health and beauty products and services among adults and households
  • A market potential index (MPI) measures the relative demand by consumers in a specified area for items such as vitamins, personal care products, and doctor visits compared to the U.S. average
  • Key variables used in this report include:
    • 2007 and projected 2012 Population, Population 18+, Households, and Median Household Income
    • 2007 number of adults or households expected to consume health and beauty goods and services. Some examples include the number of individuals who:
      • Used prescription drugs (by reason)
      • Spent $100 at beauty parlors in last 6 months
      • Visited a doctor in last 12 months (by type)
      • Used a complexion care product in last 6 months
      • Used a hair coloring product (at home) in last 6 months
      • Used hand and body cream/lotion/oil in last 6 months
      • Used headache/pain reliever (nonprescription) in last 6 months
      • Used a vitamin/dietary supplement in last 6 months

Health and Beauty Market Potential (report fragment)


ESRI Community Data: Summary

  • Neighborhood data from ESRI is valuable information for the majority of CRM problems
  • The information (and in particular, market potential data) can be used within the CRM paradigm for the improvement of:
    • customer profiling and customer segmentation
    • sales force restructuring
    • targeting
    • product marketing
    • optimization of promotion and evaluation of its effectiveness
    • potential calculation
    • sales analysis
    • sales forecasting, etc.

  • Data enrichment by combining internal data with ESRI America's neighborhood data can be treated as an important step in the creation of your company's Knowledge Repository

US Census Bureau

  • The US Census Bureau is the most important source of demographic, socio-economic, and housing data.
  • The US Census Bureau conducts:
    • Decennial Census: taken every 10 years to collect information about the people and housing of the United States
    • Economic Census: profiles the U.S. economy every 5 years
    • American Community Survey: an ongoing survey that provides data about your community every year
    • Population Estimates Program: produces population numbers between censuses
    • Annual Economic Surveys: data from the Annual Survey of Manufactures, County Business Patterns and Non-Employer Statistics

US Census Bureau: Annual American Community Survey

  • The American Community Survey (ACS) is a new nationwide yearly Census Bureau survey of three million housing units drawn from every county in the nation.
  • The ACS collects information such as age, race, income, commute time to work, home value, veteran status, and other important data.
  • One-year population, demographic and housing unit estimates are available annually for geographic areas with a population of 65,000 or more for the nation, all states and the District of Columbia, all congressional districts, approximately 800 counties, and 500 metropolitan and micropolitan statistical areas.
  • Although the ACS produces population, demographic and housing unit estimates, it is the US Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.

Content of the ACS

  • The information collected by the American Community Survey can be grouped into four main types of characteristics (social, economic, housing, and demographic):
    • Social characteristics include information on education, marital status, fertility, grandparent caregivers, veterans, disability status, place of birth, citizenship status, year of entry, language spoken at home, ancestry and tribal affiliation.
    • Economic characteristics include variables on income, benefits, employment status, occupation, industry, commuting to work, and place of work. This data can be used to assess the economic well-being of individuals and households.
    • Housing characteristics include information on tenure, occupancy and structure, house value, taxes and insurance, utilities, and mortgage or monthly rent.
    • Basic demographic characteristics include information on sex, age, race and Hispanic origin.
  • The ACS collects survey information continuously, nearly every day of the year, and then aggregates the results over a specific period of time: one year, three years, or five years.
  • Data is available for areas with estimated populations of 20,000 or greater.

US Census Bureau: Annual Economic Surveys

  • In addition to conducting the Economic Censuses every five years, the U.S. Census Bureau conducts more than 100 economic surveys covering annual, quarterly, and monthly time periods for various sectors of the economy.
  • Economic surveys measure a wide variety of economic activities, from capital expenditures for food manufacturing companies to annual auto dealership sales. In particular:
    • Annual Survey of Manufactures (ASM) provides sample estimates of statistics for commercial manufacturing establishments with paid employees.
    • County Business Patterns (CBP) and ZIP Code Business Patterns (ZBP) provide economic data by industry at various geographic levels (U.S., state, county, metro area and ZIP Code).
    • Non-Employer Statistics (NES) provides national and regional data by industry for businesses without paid employees.

North American Industry Classification System (NAICS)

  • Economic statistics are published by the US Census Bureau in terms of establishments.
  • Establishments are classified according to the North American Industry Classification System (NAICS). The NAICS is a unique system for classifying business establishments.
  • Adopted in 1997 to replace the old Standard Industrial Classification (SIC) system, it is the industry classification system used by the statistical agencies of the United States.
  • NAICS specific information can be a valuable supplement for a variety of CRM problems
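
NAICS codes are hierarchical: the first two digits identify the sector, and each additional digit (up to six) narrows the classification. A minimal sketch, assuming establishment or customer records carry a NAICS code as a string, of rolling a code up to its coarser levels so that internal data can be matched to statistics published at different levels of detail. The helper name and the example code are illustrative only.

```python
# Level names by code length, following the standard NAICS hierarchy.
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "NAICS industry",
    6: "national industry",
}


def naics_rollups(code):
    """Return every hierarchy level contained in a 2- to 6-digit NAICS code."""
    if not (code.isdigit() and 2 <= len(code) <= 6):
        raise ValueError(f"not a valid NAICS code: {code!r}")
    # Each level is simply a prefix of the full code.
    return {NAICS_LEVELS[n]: code[:n] for n in range(2, len(code) + 1)}


# Example with an arbitrary six-digit code.
levels = naics_rollups("445110")
for name, prefix in levels.items():
    print(f"{name}: {prefix}")
```

Because each level is a prefix of the full code, enrichment at a coarser level (for example, sector-level statistics) only requires truncating the code before the join.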

Business Intelligence Solutions is a firm that intelligently performs data enrichment and uses new data for problem solving

  • Data enrichment (combining supplemental data with internal databases) makes it possible to formulate and solve new CRM problems, significantly enhance the solution of traditional CRM problems, and ultimately improve the company's bottom line.
  • Business Intelligence Solutions can help you with the enrichment of your customer database, and will employ these data in various CRM problems by applying state-of-the-art data mining / predictive analytics / GIS methods and tools