Why did USDA conduct the FoodAPS survey?

USDA’s National Household Food Acquisition and Purchase Survey (FoodAPS) was designed to fill a critical data gap and support research to inform policymaking on key national priorities, such as health and obesity, food insecurity, and food and nutrition assistance policy.

The bulk of economic research in these areas previously had to rely on individual-level dietary recall data, consumer expenditure surveys, or retail-store purchase data. However, these surveys do not provide a complete picture of the way in which food prices, local food environments, and participation in USDA’s food and nutrition assistance programs affect the amount and types of foods that households acquire or the extent to which low-income households rely on alternative sources, such as food pantries or relatives, to supplement their food purchases.

While dietary recall data provide detailed information on what individuals consume and the nutritional quality of those foods, they provide no information on food prices or food access. Consumer expenditure surveys typically lack detail on the form in which foods are purchased, and there is little information on the food items that are purchased from restaurants, fast food places, and other eating places (food away from home). Retail purchase data, which provide detailed information about food items and their prices, tend to underrepresent lower-income households. These data also lack information on food-away-from-home purchases and those acquired for free.

To provide a complete picture of the foods that consumers buy or acquire and the factors that influence their food choices, FoodAPS collected information about quantities and expenditures for the foods and beverages acquired by all household members over a one-week period from grocery stores and other food retailers (food at home); from restaurants, fast food places, and other eating places (food away from home); and from other sources such as schools, community food pantries, and gardens.

The first round of FoodAPS (FoodAPS-1) oversampled lower-income households, particularly those participating in USDA’s Supplemental Nutrition Assistance Program (SNAP), and FoodAPS-1 was linked to administrative data on SNAP participation to improve the quality of collected data.

FoodAPS-1 also collected detailed household and individual characteristics, and linked these data to a rich set of food environment characteristics that are relevant to consumer food choices. The data are nationally representative, as well as representative of SNAP participants and nonparticipants.

What benefits has FoodAPS-1 provided beyond its main objectives?

In addition to its unique and comprehensive content, FoodAPS-1 was innovative in its data collection strategies, providing lessons for other surveys:

  • Hand-held scanners were used to collect item-level information, reducing respondent burden. The Bureau of Labor Statistics' Consumer Expenditure Survey has been investigating the use of scanners.
  • SNAP administrative records and quality control data were used to sample SNAP households more efficiently and to improve data quality, reducing costs and response burden on American households.
  • The field-collected data were enriched by linking product characteristics and nutrient content of food items and details about households' food environment and community characteristics. Because of this linkage, FoodAPS-1 obtained more complete and accurate data without burdening respondents.

How does FoodAPS improve USDA and Federal programs? And, how can FoodAPS improve program efficiency and effectiveness?

Agencies are expected to use evidence-based approaches to design, implement, evaluate, and improve Federal programs and policies. FoodAPS provides insights into the challenges facing low-income consumers, including those participating in USDA food and nutrition assistance programs. This information can improve Federal food and nutrition assistance programs and other programs designed to assist low-income Americans.

FoodAPS’s detailed information on food assistance program participation; types and amounts of foods purchased; prices paid; the influence of the surrounding environment and of household characteristics; and nutritional knowledge and attitudes will allow investigation of critical policy issues such as:

  • How local food prices influence the cost of healthy diets and food assistance benefit adequacy;
  • How local food environments influence the ability to purchase an economical healthful diet;
  • How nutritional knowledge and attitudes influence the nutritional quality of food purchases and acquisitions.

FoodAPS research will also be useful to other Federal programs that provide assistance to low-income Americans:

  • Information on where low-income Americans shop for food and how they travel to shop for food will inform Federal efforts to understand the connection between geography and access to healthy, affordable foods.
  • Information on how much low-income Americans spend on food, along with the additional socioeconomic information collected, can be useful to program and policy officials in the design of other income support programs, such as housing and heating assistance.
  • Information on product characteristics and nutrient content of foods purchased and acquired will be useful for targeting nutrition education efforts to food assistance program participants and efforts to reduce health disparities.

What policy questions are being addressed by FoodAPS research?

FoodAPS was designed to allow researchers to address important policy questions about food insecurity, food and nutrition assistance policy, the food retail environment, and dietary health and obesity. FoodAPS-1 data are being used to address the following policy-related questions by ERS researchers or through external research funded by ERS:

  • Are SNAP benefits adequate for purchasing a healthy diet, and how does price variation across geographic areas affect the adequacy of SNAP benefits?
  • How does variation in prices of specific foods affect the healthfulness of foods acquired by SNAP and other low-income households?
  • How does the issuance of SNAP benefits on a monthly basis affect food spending and shopping among SNAP households?
  • Do SNAP households cut back on food spending and meals as their monthly benefits are fully used?
  • How does access to large grocery stores and other aspects of the food retail environment, including prices, affect food security, food spending, and diet quality?
  • What factors affect households’ decisions about where to shop for food and their preferences for food at home versus food away from home?
  • Are WIC (Special Supplemental Nutrition Program for Women, Infants, and Children) households less price sensitive when they use WIC benefits, as compared with when they use their own money or SNAP benefits?

What is referenced by the term FoodAPS-2? How does it differ from the terms FoodAPS and FoodAPS-1?

FoodAPS-2 refers to the second round of data collection in the National Household Food Acquisition and Purchase Survey. Several enhancements planned for FoodAPS-2 are currently being tested. FoodAPS-1 refers to the first round of the survey, conducted in 2012. In general, the term FoodAPS also refers to the original survey design and data collection in 2012.

Is there a need for a second round of FoodAPS (FoodAPS-2)?

Innovations in food production, processing, and marketing, along with consumer concerns about healthy choices, changing product offerings in the food-at-home and food-away-from-home markets, and changing food retail structure and delivery systems are contributing to an evolving food environment. At the same time, access to affordable healthy foods, childhood obesity, food security, and the effectiveness of Federal food and nutrition assistance programs remain top policy concerns.

FoodAPS-2 will provide updated, timely, and relevant information on key policy issues related to the changing food environment and consumer food choices. USDA plans to expand the focus of FoodAPS-2 to cover other population groups of importance to policy and program officials, such as those participating in the WIC program and child nutrition programs. In addition, FoodAPS-2 will enable researchers to link food acquisitions and food commodities in order to assess the farm production implications of alternative baskets of food choices.

Examples of questions related to the changing food landscape are as follows:

  • How will the 2015-16 Dietary Guidelines for Americans or USDA’s initiatives for increased use of farmers' markets and locally-grown food change food purchasing patterns?
  • How will the changing number and characteristics of SNAP participants affect overall demand for food and the food security of America’s children?

FoodAPS-1 has created a baseline against which the dynamics of the retail food sector can be measured.

What were the challenges in FoodAPS-1 and how will they be addressed in FoodAPS-2?

There were some challenges in conducting FoodAPS-1 in terms of collecting reliable and complete data, as well as in making those data available to researchers on a timely schedule and within budget. ERS staff who worked on the project drafted a paper describing some of the lessons learned from the survey collection and processing efforts.

The survey contractor conducted a systematic study of the lessons learned from the contractor’s perspective. This report, along with evaluation studies by an independent contractor, has contributed to ERS’s institutional knowledge and will be highly useful in designing FoodAPS-2 and in monitoring survey operations and progress when the survey is in the field.

Scheduling and budget issues are more tractable than issues of data quality. FoodAPS-2 will address the following data quality challenges:

  • Distinguish between days for which no food was acquired and days for which the respondent failed to report acquisitions.
  • Develop methods to more accurately identify all foods acquired by survey respondents.

What additional value will be achieved in conducting FoodAPS-2?

No single survey can address all policy-relevant data needs related to food-choice research. FoodAPS-1 represents a major leap forward in fulfilling many of those needs, but ERS expects to do even better in FoodAPS-2. ERS will continue to work with and listen to research peers, stakeholders, and FoodAPS-1 data users to understand what would make FoodAPS-2 better in promoting food policy research. ERS is planning to implement the following in FoodAPS-2:

  • FoodAPS-1 made a special effort to collect data from a large number of SNAP households and low-income households not participating in SNAP. A number of sampled households contained WIC participants, but the Food and Nutrition Service (FNS) would like a larger sample of WIC participants in FoodAPS-2. Significant efforts will be devoted to doing so in FoodAPS-2.
  • A large number of children are represented in FoodAPS-1, but the sample may not be nationally representative of all U.S. children by age group. Data from FoodAPS-2 will be nationally representative of children within multiple age categories.
  • ERS expects that the income and food data to be collected in FoodAPS-2 will be more accurate and complete than the data from FoodAPS-1. Lessons learned from prior experience should lead to improved results in the future.
  • Finally, most of the FoodAPS-1 data were collected between mid-April and late December of 2012. No data were collected from mid-January through mid-April. Because the food supply is seasonal and food acquisitions reflect this seasonality, ERS is investigating the possibility of expanding FoodAPS-2 to a year-long survey.

Data Access

Who owns the FoodAPS data?

The FoodAPS data were collected by the U.S. Department of Agriculture (USDA) under authority of U.S.C., Title 7, Section 2026(a)(1) and are owned by USDA. Mathematica Policy Research, a private research firm with experience conducting large-scale surveys, conducted the survey under contract to ERS. In the field, the survey was called the National Food Study for simplicity (see Informational brochure). The OMB clearance number for FoodAPS was 0536-0068.

Due to the sensitive nature and confidentiality of the FoodAPS data (the data were collected under the Confidential Information Protection and Statistical Efficiency Act, or CIPSEA), the full survey files may be accessed by external researchers only through the National Opinion Research Center (NORC) Data Enclave—a secure data enclave. See Data Access for more information.

Are free, public-use data available?

Yes. Free, public-use files were released in November 2016. The public-use files are stripped of data that pose a risk of disclosing confidential information. However, depending on the research project, the data may be just as viable and informative. See the Overview page to view and download the public-use files.

How can I access the restricted-use data?

Instructions for accessing the restricted-use data are available on the Data Access page.

How much does it cost to use the restricted-use data?

Researchers with approved project agreements will receive access to a web-based data enclave. The cost for researchers to access the data at NORC is $4,350 per year.

How can researchers who are approved to use the restricted data access the data enclave?

ERS allows users to access the restricted-use FoodAPS data from a remote location. ERS's security checklist requires that the workspace not be in a "high traffic" or public area; that the monitor not be readily visible from external-facing windows or doors; that the workspace be accessible only by authorized individuals; and that the workspace be locked down when not in use.

If an individual has an account with NORC, he or she should have received instructions for access. The new access is web-based, and thin clients (special laptops allowing access to the data enclave) are no longer needed. Please make sure that your system meets the following requirements for the best user experience:

Connectivity Speed: minimum of 2 MB/sec download.
Operating System: Microsoft Windows 7 (Home, Professional, Ultimate edition) or higher; or Mac OS X 10.8 or higher.
Browser: Internet Explorer 9 or higher; Safari 6.2 or higher; and add URL to trusted sites to ensure desktop launches automatically.
Citrix Receiver: Receiver version 14.3 or higher; install/update as needed.

How long does it take to receive approval and gain access to the restricted data?

It normally takes a week or two to receive approval, and it will take approximately three weeks for NORC to set up accounts and provide training on the data enclave.

How can I access summary tables and aggregate figures for the FoodAPS-1 data?

Users may access public-use files from the Overview page and create summaries on their own. ERS is also developing a Data Tool so that users can easily create crosstabulations and charts by selecting from a list of variables. The tool will be released within the coming months.

To access detailed aggregate figures, you will need to follow the process of gaining access to the restricted-use data files through USDA and NORC (see Data Access).

How can students gain access to the restricted FoodAPS data?

To access the restricted-use data set, students need to collaborate with a professor to submit a proposal because students cannot serve as project directors. Review of a Project Agreement will only take a week or two, so funding should be secured prior to submitting a proposal. However, there is no need to show that funding has been secured when the proposal is submitted.

As the cost of access can be prohibitive, especially for students, you may want to consider using the FoodAPS public-use files available on the Overview page.

How can a new researcher be added to a FoodAPS restricted data agreement?

To add a new researcher to a data agreement, please amend the project agreement and notify ERS via email with “PROPOSED CHANGE TO AGREEMENT” in the subject line. The message should explain the proposed change and any expected changes to analytic output that would be submitted for review.

The new researcher will need to complete the CIPSEA training and sign a pledge of confidentiality before being granted access to the restricted data.

Furthermore, all research results must be reviewed for disclosure risk and be approved by the ERS Confidentiality Officer or his/her designee before the results can be shared with anyone who has not signed an ERS confidentiality agreement specifically related to FoodAPS data.

For restricted-use data, the Project Agreement form states that the Project Leader cannot be a student. However, if a student will be conducting the research, who should be designated as the Project Leader? Will payment be for two accounts, even though only one person (the student) will need access to the data?

If applicable, the dissertation chair or the student’s academic adviser should be designated as the Project Leader. The NORC account will be set up for the student, and payment will be due for only the one account. However, the professor supervising the work is also required to be CIPSEA-certified.

What do I need to do if the scope of my restricted-data project has changed?

If the research team decides to make any changes to its approved Project Agreement, the original Project Agreement will need to be amended. This needs to be done so that when research materials are submitted for review, ERS will have an up-to-date agreement for reference. In addition, ERS does not want researchers to spend time on analyses that could later create a disclosure risk and thereby not be allowed to be published.

Proposed changes that are directly related to the approved research questions and pose no additional disclosure risk will not need a formal amendment to the Project Agreement, but they will be maintained on file to assist in ERS’s review of project output(s). Upon request, ERS will alert the research team if a formal amendment is needed. If the agreement does need amending, the team will need to submit a Request for Amended Project Agreement (see the Data Access page for more information).

What information can be removed from the NORC data enclave?

Before any type of output—including tables, charts, graphs, slide presentations, draft reports, and final reports—can be downloaded from the NORC Data Enclave, it will be reviewed by both NORC and ERS for disclosure risk and adherence to outputs specified in the Project Agreement. This review requirement includes preliminary or interim results meant to be shared outside of the data enclave with other CIPSEA-trained and approved colleagues on the research team.

If it is determined from a review that disclosure risk is present, the research team and the ERS reviewer may work together to reach an agreement on output that would be approved for download from the data enclave. If there is a disagreement between the research team and the ERS reviewer as to whether any proposed output poses a disclosure risk, the research team may request a review by the ERS Confidentiality Officer. The Confidentiality Officer’s decision regarding disclosure risk will be final.

Neither subsets of any FoodAPS data file nor sections of codebooks containing distributions of variable responses may be exported from the data enclave. Standard output from statistical packages often contains information not planned for publication (for example, minimum and maximum values of a variable) that could increase disclosure risk, and requests to export such output from the data enclave are usually rejected. Researchers may copy and paste relevant portions of a statistical output into a text file for export, or have their statistical package output Excel files for subsequent review and export.

Does ERS need to review a summary version of a previously reviewed and approved report using restricted data for publication in a journal?

Yes, any alterations to a previously-approved report must be reviewed again before publication.

Does a new project agreement need to be submitted for an additional study, if the researcher(s) already have access to the restricted-use data?

If the new project is a separate study, a new project agreement will need to be submitted. If you are making an adjustment to an already-approved study, OR if you are conducting new research in a related area, you may submit a project agreement amendment (see Data Access for detailed instructions).

Is there a procedure for merging an outside dataset with the data in the NORC Data Enclave?

Researchers may upload an external data set to their project folder in the NORC data enclave. For further instructions on how to do so, contact the NORC Data Enclave Manager.

What is included in the IRI data, and how can a researcher gain access?

The IRI data comprise multiple data sets. At a broad level, the IRI Consumer Network data include information about where households shopped, when they shopped, and what they purchased for food at home. Another data set, InfoScan, contains scanner data from retail food sales by store type and market area. A third data set, the product dictionaries, includes product descriptions with information such as brand and flavor for items with Universal Product Codes (UPC). Finally, the nutrition and claims data set includes nutrition facts and health-related claims (such as lower cholesterol) for UPC items. For more information, see:

Understanding IRI Household-Based and Store-Based Scanner Data

Use of proprietary data is governed by contractual agreements between USDA and data owners. In order to gain access to the proprietary IRI data, it is necessary for a researcher to collaborate with an ERS researcher on a USDA-sponsored project. USDA-sponsored projects include ERS grants, non-USDA funded collaborative agreements, ERS cooperative agreements, and direct collaboration with ERS researchers.

Usually, external collaborations are formed independently between ERS and external researchers based on similar research interests. If you have someone in mind you would like to work with, please contact them directly.

Would a grantee have access to the weekly point-of-sale data (including UPC and product description) from FoodAPS retailers through IRI?

If a USDA-sponsored research project requires access to a specific proprietary data source, the institution with which the research collaborator is affiliated must enter into a Third Party Agreement (TPA) with the data provider.

Can proprietary IRI data be linked to household food purchases?

Point-of-sale data from IRI can be matched to FoodAPS retailers and places that FoodAPS households visited, but the IRI data cannot be matched to specific purchasers or FoodAPS households. In addition, the IRI point-of-sale data have poor coverage of stores in some of the survey areas.

Sample Design, Survey Operations, and Protocols

What is the survey’s sample design?

FoodAPS is a nationally representative, stratified sample of 4,826 households surveyed between April 2012 and January 2013. Stratification was based on participation in the Supplemental Nutrition Assistance Program (SNAP) and poverty level. The four strata are as follows:

  1. Households receiving SNAP benefits;
  2. Non-SNAP households with income less than the poverty level guideline;
  3. Non-SNAP households with income at or above the poverty guideline and less than 185 percent of that level; and
  4. Non-SNAP households with income equal to or greater than 185 percent of the poverty guideline.
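
The four strata above can be expressed as a simple classification rule. The sketch below is illustrative only: the function name, its parameters, and the idea of passing in the applicable poverty guideline as a dollar amount are assumptions, not part of the FoodAPS documentation, and it does not reproduce how the survey actually determined SNAP status or income.

```python
def assign_stratum(receives_snap: bool, income: float, poverty_guideline: float) -> int:
    """Return the stratum number (1-4) from the list above.

    All names here are hypothetical. `poverty_guideline` is the dollar
    guideline that applies to the household (an assumed input).
    """
    if poverty_guideline <= 0:
        raise ValueError("poverty_guideline must be positive")
    if receives_snap:
        return 1  # households receiving SNAP benefits
    ratio = income / poverty_guideline
    if ratio < 1.0:
        return 2  # non-SNAP, income below the poverty guideline
    if ratio < 1.85:
        return 3  # non-SNAP, at or above the guideline but below 185 percent
    return 4      # non-SNAP, at or above 185 percent of the guideline


# Example: a non-SNAP household at 150 percent of a $20,000 guideline.
print(assign_stratum(False, 30000, 20000))
```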

Prior to sampling, 948 primary sampling units (PSUs) within the continental United States were defined as counties or groups of contiguous counties. In forming PSUs, metropolitan statistical area (MSA) boundaries were used (some MSAs were split into multiple PSUs, but in no case was part of one MSA joined to part of another MSA to form a PSU). One large PSU was sampled with certainty, and 49 non-certainty PSUs were selected using probability proportional to size (PPS) with implicit stratification based on the metropolitan status of the PSU and its FNS region. Eight secondary sampling units (SSUs) were formed within each of the 50 sampled PSUs; each SSU is a census block group (CBG), or a group of contiguous block groups if the CBG did not meet minimum size requirements.
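
For readers unfamiliar with PPS selection, the following sketch shows one common way to carry it out: systematic PPS sampling from a frame that has been sorted by the implicit stratification variables. This is an assumption about technique, not a description of the procedure actually used for FoodAPS; the function name, the frame layout, and the size measure are all hypothetical.

```python
import random


def pps_systematic(units, n, rng=None):
    """Select n units with probability proportional to size (systematic PPS).

    `units` is a list of (unit_id, size) pairs, assumed to be already sorted
    by the implicit stratification variables. Units large enough to be
    certainty selections must be removed beforehand.
    """
    rng = rng or random.Random()
    total = sum(size for _, size in units)
    interval = total / n
    if any(size > interval for _, size in units):
        raise ValueError("remove certainty units before PPS selection")
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]
    selected, cumulative, t = [], 0.0, 0
    for unit_id, size in units:
        cumulative += size
        # A unit is selected when a target falls inside its size range.
        while t < n and targets[t] < cumulative:
            selected.append(unit_id)
            t += 1
    return selected


# Hypothetical frame of 20 units with sizes 10..29, sorted in frame order.
frame = [(f"psu{i}", 10 + i) for i in range(20)]
print(pps_systematic(frame, 5, random.Random(0)))
```

Sorting the frame before a systematic pass spreads the sample across the sort variables without forming explicit strata, which is what "implicit stratification" usually means.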

SSUs were selected using PPS sampling as well. Within sampled SSUs, addresses for screening were selected from two primary sampling frames: (1) a list of addresses of all SNAP participants active in February 2012; and (2) a list of addresses in the Address Based Sampling (ABS) list, obtained from the United States Postal Service Delivery Sequence File, that were not on the SNAP list of addresses. In states for which SNAP administrative data could not be obtained, the ABS list was used as a single sample frame. Finally, field listing of addresses was done in a few rural areas, and the field listing was used as a single address frame.

Residential units were sampled in two phases. FoodAPS used this two-phase sampling approach for conducting the screener interview as a way to reduce the potential of non-response bias. In the first phase, attempts were made to screen all released sample addresses. If no contact was made after a pre-specified number of attempts at different hours of the day and week, the address was moved to a sample frame for Phase 2. Addresses in this frame were sampled for additional contact efforts toward the end of the survey period.

For more information, see the User's Guide.

How does one account for the sample design when using the data?

Each household and individual observation has a final sampling weight that makes the sample nationally representative of all non-institutionalized households in the contiguous United States.

The household weights were constructed in three stages. In the first stage, the weights accounted for the differences in the probability of selection across households and then were adjusted to account for unit nonresponse. The second stage of the weighting process involved post-stratifying the weights to replicate external estimates of the number of households in the United States and the distribution by specific demographic and economic characteristics using a raking process (iterative proportional fitting). The final stage of the weighting process involved trimming the weights to reduce the variability of the weights and the overall design effect.
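
The raking step in the second stage can be illustrated with a minimal sketch of iterative proportional fitting. The dimensions, category labels, and target totals below are made up for illustration, and the function is not the weighting code used for FoodAPS; the nonresponse adjustment and weight trimming stages are not shown.

```python
def rake(weights, categories, targets, iterations=50, tol=1e-8):
    """Adjust unit weights so weighted totals match known marginal totals.

    weights:    starting weights, one per household
    categories: {dimension: [label for each household]}
    targets:    {dimension: {label: target total}}
    All argument names and structures are hypothetical.
    """
    w = list(weights)
    for _ in range(iterations):
        max_shift = 0.0
        for dim, labels in categories.items():
            # Current weighted total in each category of this dimension.
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            # Scale each category so it hits its target total.
            factors = {lab: targets[dim][lab] / totals[lab] for lab in totals}
            w = [wi * factors[lab] for wi, lab in zip(w, labels)]
            max_shift = max(max_shift, max(abs(f - 1.0) for f in factors.values()))
        if max_shift < tol:
            break  # every margin already matches its targets
    return w


# Four hypothetical households raked to two sets of margins.
cats = {"region": ["A", "A", "B", "B"], "tenure": ["own", "rent", "own", "rent"]}
tgts = {"region": {"A": 60, "B": 40}, "tenure": {"own": 70, "rent": 30}}
print(rake([1, 1, 1, 1], cats, tgts))
```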

Each household is given a final sampling weight, HHWGT. The weights were constructed for the household, but can be applied to individual-level analyses as well. Software such as SUDAAN, Stata, and SAS can be used to estimate sampling errors by the Taylor series method (linearization), using HHWGT along with the stratum variable, TSSTRATA, and the PSU variable, TSPSU. The variables necessary for Taylor series variance estimation are attached to both the faps_household and faps_individual data files.
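
To show what Taylor series linearization does with the weight, stratum, and PSU variables, the sketch below estimates a weighted mean and its standard error in plain Python. It assumes a with-replacement approximation at the PSU level and no finite population correction, which is a common default in survey software but is an assumption here; the tuple layout and function name are hypothetical. For real analyses, the survey procedures in the packages named above and the examples in the User's Guide should be preferred.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(records):
    """records: iterable of (y, weight, stratum, psu) tuples, where weight,
    stratum, and psu would correspond to HHWGT, TSSTRATA, and TSPSU.
    Returns (weighted mean, linearized standard error)."""
    records = list(records)
    w_total = sum(w for _, w, _, _ in records)
    mean = sum(w * y for y, w, _, _ in records) / w_total

    # Sum the linearized values w * (y - mean) / w_total within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for y, w, stratum, psu in records:
        psu_totals[stratum][psu] += w * (y - mean) / w_total

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for stratum, psus in psu_totals.items():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} has fewer than two PSUs")
        z_bar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in psus.values())
    return mean, sqrt(variance)


# Tiny made-up example: two strata with two PSUs each.
data = [(10, 1, "s1", "a"), (20, 1, "s1", "b"), (30, 1, "s2", "c"), (40, 1, "s2", "d")]
print(weighted_mean_with_se(data))
```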

Section 6.1 of the User's Guide outlines the construction of the sampling weights and how users can apply these weights to their analyses. Appendix C includes detailed examples of variance estimation.

Who is the primary respondent, and how was he or she selected?

The primary respondent (PR) is the main food shopper or meal planner in the household. The PR was identified during the screener interview and was asked to complete two in-person interviews and three telephone interviews over the course of the survey.

Over what time period were FoodAPS-1 data collected?

The first initial interviews were conducted in mid-April, 2012, and the last initial interviews were conducted in mid-January, 2013. The distribution of initial interviews by month of data collection is as follows:

April 2012: 83
May 2012: 624
June 2012: 585
July 2012: 636
August 2012: 888
September 2012: 643
October 2012: 727
November 2012: 445
December 2012: 127
January 2013: 68
Total: 4,826

How were height and weight measured/recorded?

The primary respondent was asked to provide the height and weight of each individual in the household in either English or metric units. If respondents were reluctant to tell the interviewer weights, they were allowed to enter this information directly using the CAPI laptop. All heights and weights have been converted to English units of inches or pounds, respectively. Extreme values of height and weight (based on age-specific height and weight distributions) have been set to missing.

Are the names of places from which a household purchased food left missing for confidentiality reasons?

In the restricted-use data files at the NORC Data Enclave, names of places from which households purchased food are available for analysis, but these names cannot be used in any presentations or papers based on the data. (For example, one cannot say that a certain percentage of all purchases was made at Walmart.) Place names are excluded from the public-use data files, but a measure of place type (for example, supermarket, convenience store, coffee shop, or buffet restaurant) is available.

When working in the secure environment, users can gain access to the geographic coordinates of each geo-coded food store or public eating place that was reported by a household in the survey. However, not all PRs provided enough information to identify the specific location for their primary and alternate stores or the stores visited during the week. For example, 6.6 percent of primary stores (317 stores) could not be geo-coded because a unique and valid address could not be determined.

A project must have an analytic need for these retailer geo-codes in order to gain access to them. See the Data Access page to learn how to gain access to the FoodAPS data.

Can researchers find out which counties or States are included in the FoodAPS data?

To reduce the risk of disclosure of confidential FoodAPS information about survey respondents, ERS cannot disclose any specific information about the PSUs or SSUs in which the survey was fielded. A list of States included in the survey is available for projects needing this information (for example, to construct an external file of county-level data to upload to NORC and subsequently merge with the FoodAPS data). Researchers needing county or census block group information for an approved or proposed project may request access to the Geography Component data within the restricted-use NORC data enclave.

Note that the FoodAPS data were collected to be nationally representative of the continental U.S. and were not designed for State- or county-level analyses.

Data Contents

Where are the definitions/distinctions between terms such as household, family, gueststype1, guesttype2, food at home (FAH), food away from home (FAFH), and other terms?

For in-depth definitions and distinctions, please refer to the FoodAPS User's Guide, Household Codebook, and Individual Codebook.

Is there a way to distinguish urban/rural and metro/nonmetro? How many observations are in each area?

There are two variables to identify metro/nonmetro and rural/urban status. In the household data, the variable NONMETRO identifies whether a household resides outside a census core-based statistical area (CBSA), and the variable RURAL identifies whether a household resides in a rural census tract. The two indicators do not necessarily coincide: NONMETRO is based on whether the county in which the household lives is within a CBSA, while RURAL is based on the census tract in which the household lives. An unweighted crosstab of the variables is displayed in the table below. For more information, see the Household Codebook.

Table of NONMETRO variable by RURAL variable

NONMETRO (household does not      RURAL (FARA: rural tract)
reside in a CBSA)                 0          1          Total
0                                 3,408      992        4,400
1                                 107        319        426
Total                             3,515      1,311      4,826
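The counts in the table above can be used to quantify how often the two indicators disagree. The following is a minimal Python sketch that hard-codes the unweighted counts from the table; the dictionary layout and names are illustrative only and are not part of the FoodAPS files.

```python
# Unweighted household counts from the NONMETRO-by-RURAL table,
# keyed as (NONMETRO, RURAL). Layout and names are illustrative only.
counts = {
    (0, 0): 3408,
    (0, 1): 992,
    (1, 0): 107,
    (1, 1): 319,
}

total = sum(counts.values())

# Marginal totals for each indicator.
nonmetro_total = sum(n for (nonmetro, _), n in counts.items() if nonmetro == 1)
rural_total = sum(n for (_, rural), n in counts.items() if rural == 1)

# Households where the county-based and tract-based indicators disagree.
disagree = counts[(0, 1)] + counts[(1, 0)]

print(f"Households: {total}")
print(f"NONMETRO=1: {nonmetro_total}, RURAL=1: {rural_total}")
print(f"Indicators disagree: {disagree} ({disagree / total:.1%})")
```

Because these counts are unweighted, the resulting shares describe the sample and should not be read as population estimates.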

Do the data include individual Body Mass Index (BMI) measures?

Reported height and weight have been used to compute the Body Mass Index (BMIVALUE) for individuals age 2 or older. The variable BMISTATUS indicates whether the individual is of “normal weight,” “overweight,” or “obese,” according to Centers for Disease Control and Prevention (CDC) standards. When height was reported and the respondent either refused to provide an individual’s weight or did not know the weight, the value for BMISTATUS was determined by asking whether the individual’s weight exceeded or fell below weight thresholds for reported heights. The variable BMIFLAG indicates the records of individuals for which bounding thresholds were used to determine BMI status.
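To illustrate the arithmetic behind a BMI value and the idea of bounding weight thresholds for a reported height, here is a short Python sketch. It assumes height in inches and weight in pounds (the units stored in FoodAPS are not stated here) and uses the standard CDC adult cutoffs; children and teens are classified by CDC with age- and sex-specific percentiles, which this sketch does not attempt. The function names are hypothetical, and the thresholds actually used in the survey's bounding questions are documented in the codebooks, not reproduced here.

```python
def bmi_from_us_units(weight_lb: float, height_in: float) -> float:
    """Body Mass Index from pounds and inches (equivalent to kg / m^2)."""
    return 703.0 * weight_lb / height_in ** 2


def adult_bmi_status(bmi: float) -> str:
    """CDC adult categories; not valid for children, who use percentiles."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def weight_threshold_lb(bmi_cutoff: float, height_in: float) -> float:
    """Weight at which a person of this height reaches the given BMI cutoff."""
    return bmi_cutoff * height_in ** 2 / 703.0


# Example: an adult reported as 65 inches tall and 150 pounds.
bmi = bmi_from_us_units(150, 65)
print(round(bmi, 1), adult_bmi_status(bmi))

# Bounding idea: if weight is unknown, knowing only whether it is above or
# below these two weights is enough to assign an adult status category.
print(round(weight_threshold_lb(25.0, 65), 1), round(weight_threshold_lb(30.0, 65), 1))
```

The second pair of values shows why a yes/no answer about a weight threshold can stand in for an exact weight once height is known.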

Is there a question on shopping frequency?

No, the survey did not ask respondents about the usual number of times they went shopping per week.

How can individuals with disabilities be identified?

The PR was asked if they have difficulty using the phone because of a disability (PRDISPHONE), if they have difficulty writing because of a disability (PRDISWRITING), and if they have difficulty with memory/concentration/making decisions. The survey did not ask questions about specific disabilities outside of the questions mentioned above. See the Household Codebook for more information.

Depending on specific research needs, disability income may be identified by determining if the individual or household reported income from disability payments (INCRETDISIND, INCAMOUNT4). Additionally, if an individual was not working, a follow-up question was asked to determine why. Researchers may use the variable REASONNOWORK to determine disability status for those who are unemployed. Furthermore, the SCHLEVEL variable may be used to identify children with a disability who are not attending school. See the Individual Codebook for more information on disability income.

Why are there two variables indicating presence of income from a trust fund?

Trust fund payments could be recorded as a source of investment income or as a source of “other” income. Mathematica included this in two locations because: 1) they found another government survey that collected it this way; 2) they found some government surveys collecting it as investment income and others that collected it as “other” income; and 3) the field test indicated that respondents sometimes included it as “other” income.

How was SNAP participation confirmed?

To confirm respondents’ reports of SNAP participation, records of households that had given consent for data matching were matched against two sets of SNAP administrative data: State-level enrollment files for March through November 2012 and transaction records from the program’s electronic benefit transfer (EBT) ALERT database. ALERT data contain one record for each swipe of an EBT card and include information on: State, store ID, date/time, EBT account number, EBT card number, dollar amount of purchase, and balance remaining in the account. Although SNAP issuance dates (the dates on which SNAP benefits are transferred to recipients) are not in the ALERT database, they can often be closely approximated by identifying when the remaining balance increases between two consecutive transactions.
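The balance-based approximation of issuance dates can be sketched as follows. This is a minimal Python illustration on made-up records, not the procedure used for FoodAPS: the field names are hypothetical, and it assumes the recorded balance is the amount remaining after each purchase. It also generalizes the heuristic slightly by computing the implied deposit, so an issuance is detected even when a large purchase hides the increase in balance. Refunds or other credits would look the same as an issuance under this logic.

```python
from datetime import datetime

# Hypothetical ALERT-style records for one EBT account, sorted by time.
# Field names are illustrative; "balance" is assumed to be the balance
# remaining after the purchase.
transactions = [
    {"when": datetime(2012, 5, 28, 17, 5), "amount": 42.10, "balance": 12.40},
    {"when": datetime(2012, 6, 2, 9, 30), "amount": 12.00, "balance": 0.40},
    {"when": datetime(2012, 6, 6, 18, 45), "amount": 85.25, "balance": 215.15},
    {"when": datetime(2012, 6, 9, 11, 15), "amount": 30.00, "balance": 185.15},
]


def issuance_windows(records, tolerance=0.005):
    """Return (earliest, latest, implied_amount) for each apparent issuance."""
    windows = []
    for prev, curr in zip(records, records[1:]):
        # With no deposit, the new balance would equal the previous balance
        # minus the current purchase; any excess is an implied deposit.
        implied = curr["balance"] + curr["amount"] - prev["balance"]
        if implied > tolerance:
            windows.append((prev["when"], curr["when"], round(implied, 2)))
    return windows


windows = issuance_windows(transactions)
for earliest, latest, amount in windows:
    print(f"Issuance of about {amount:.2f} between {earliest} and {latest}")
```

The issuance date can only be bounded by the two transaction timestamps, so households that shop infrequently will have wider windows.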

For more information, see the Household Codebook.

How was SNAP eligibility of each household determined?

SNAP eligibility was estimated four times, using different assumptions about income and composition of the SNAP unit. See the Household Codebook for more information.

Does one of the four alternatives for SNAP eligibility closely match how SNAP eligibility would be defined with respect to an individual child?

The alternatives include different assumptions for the estimates. Users can review the Household Codebook for more information. Section of the Household Codebook details the performance of these estimates from a number of different metrics:

Among the 1,581 FoodAPS households with current SNAP participants (SNAPNOWHH=1), the estimations identify about 79 to 87 percent as having at least one SNAP-eligible unit (column b of Table 7). This means that about 13 to 21 percent of households with SNAP participants are estimated to not be eligible, depending on the treatment of income and identification of SNAP units within the household. Run 3, which allows for multiple SNAP units per household and does adjust reported net earnings, performs the best in minimizing such "false negatives".

Section 2.4.10 of the Household Codebook also includes information on known data anomalies that could be useful:

As noted earlier, the estimations treated foster children differently in the four runs. In runs 1 and 2, the 17 foster children in the sample were included as part of the SNAP unit, even though SNAP regulations exclude foster children from being part of a unit. In runs 3 and 4, foster children were not assigned to a SNAP unit.

It appears that there are households in the data set that are receiving SNAP but that are also above the SNAP Federal Poverty Level (FPL) requirement. Is this correct?

Yes, it is possible to see households with income above the gross income limit for SNAP. Some reasons for this include the following:

  1. Categorical eligibility eliminates the gross income test.
  2. There could be multiple SNAP units within the same household. The household unit in FoodAPS does not necessarily correspond to the unit that the Food and Nutrition Service (FNS) uses to determine eligibility for SNAP.
  3. Eligibility was determined at an earlier point in time, and there have been variations in monthly household income over time.

Does the Youth Food Survey include meals and snacks purchased at school?

Yes, the data include school purchases and acquisitions, including free school lunches and breakfasts. The PR completed books on behalf of children 11 years of age and younger.

Can a reimbursable school meal be identified?

No—school meals cannot be identified as reimbursable. There are cases of children in the same household reporting different costs of school meals, and even some children with different costs for school breakfasts and lunches. No attempt to edit this information has been made. It is possible that children who attend different schools have access to different levels of subsidies. For example, one child may attend a school that offers universal free breakfast and lunch, while another child in the same household attends a different school that does not offer free meals to all students, and the student is only eligible for reduced-price or full-priced meals. Moreover, ERS does not know how eligibility for school meals was determined.

How detailed is the PLACENAME variable?

In the restricted-use data at the NORC data enclave, the PLACENAME variable is specific. If a household reported a Walmart shopping event, one would observe “WALMART” under PLACENAME. However, place names are not disclosed in the public-use files.

Stores were also assigned a three-digit "place type" code which is used to classify places consistently across FAH and FAFH event places. The PLACETYPE variable is available in the public-use files.

What are the ten food security questions in the Household data based on? How is the composite score calculated?

The 10 food security questions are based on USDA’s 30-day Adult Food Security Scale. The ADLTFSCAT variable is calculated according to the methodology described in USDA’s Revised Guide to Measuring Household Food Security. Imputations for missing responses are based on a household’s responses to other valid items (pages 36-37 of the guide). Section 2.3.7 of the Household Codebook has further details about the food security questions and variables.
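As a rough illustration of how a raw score maps to a category, the Python sketch below counts affirmative responses and applies the raw-score ranges USDA publishes for the 10-item adult scale (0, 1 to 2, 3 to 5, and 6 to 10). It assumes each response has already been coded as affirmative (1) or not (0), and it does not implement the guide's imputation for missing items. The function name and category labels are illustrative; the actual coding of ADLTFSCAT is described in the Household Codebook.

```python
def adult_food_security_category(responses):
    """Classify a household from ten responses coded 1 (affirmative) or 0."""
    if len(responses) != 10 or any(r not in (0, 1) for r in responses):
        raise ValueError("expected ten complete responses coded 0 or 1")
    raw_score = sum(responses)
    if raw_score == 0:
        return raw_score, "high food security"
    if raw_score <= 2:
        return raw_score, "marginal food security"
    if raw_score <= 5:
        return raw_score, "low food security"
    return raw_score, "very low food security"


# Example: three affirmative responses out of ten.
print(adult_food_security_category([1, 1, 1, 0, 0, 0, 0, 0, 0, 0]))
```

Households with missing responses would need the imputation step from the guide before a function like this could be applied.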

How can researchers interpret the store type categories (especially the SNAP categories—for example, what is the difference between SS and SM; LG and MG and SG)?

FNS classifies stores based on information, such as sales, reported by stores. The exact classification guidelines are confidential. The list below displays the SNAP category value and value description.

Please note: a store’s SNAP authorization status may change over time, so the presence of a SNAP store code does not necessarily mean that the store was authorized to accept SNAP benefits during a household’s food data collection week. Similarly, absence of a SNAP store code does not mean that the store was not authorized to accept SNAP benefits. A store could have become SNAP-authorized after the STARS match file was created (December 2011), or lack of a valid address could have prevented a match. Additionally, places with the same name (such as a chain) may be classified as different place types because of variance in size.

Value Value description
BC: Non-Profit Cooperative
CO: Combination Grocery/Other
CS: Convenience Store
FM: Farmer's Market
LG: Large Grocery Store
MC: Military Commissary
ME: Specialty—Meat/Poultry
MG: Medium Grocery Store
SG: Small Grocery Store
SM: Supermarket
SS: Super Store

How were driving and walking distances and time determined when there are multiple routes to get to a location?

Distance measures were calculated once all geocoding of places was completed. Straight-line distances from each household to each place were calculated by a SAS function, while walking and driving distances and times were obtained from the Google Maps Application Programming Interface (API). When multiple routes were possible, the default provided by Google was selected.

There are 16 households where the driving distance to the primary store is shorter than the straight-line distance, and 9 households where this is the case for the alternate store. There are 11 households where the walking distance to the primary store is shorter than the straight-line distance (8 for the alternate store). In all of these cases, the difference is less than 0.01 miles. This may be due to the different methods employed to calculate the distances (SAS for straight-line vs. Google for driving and walking distances).
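For readers who want to reproduce a straight-line distance outside SAS, the sketch below uses the haversine formula on a spherical Earth. This is a stand-in technique, not the SAS function used for FoodAPS, and the Earth radius, coordinates, and function names are assumptions made for illustration. Different formulas or Earth models can yield slightly different straight-line distances, which is one way very small discrepancies with route distances may arise.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def route_shorter_than_straight_line(route_miles, straight_miles):
    """Flag records where a reported route distance is below the straight line."""
    return route_miles < straight_miles


# Hypothetical household and store coordinates.
straight = haversine_miles(41.8800, -87.6300, 41.8900, -87.6400)
print(f"straight-line distance: {straight:.3f} miles")
print(route_shorter_than_straight_line(route_miles=straight - 0.005, straight_miles=straight))
```

A check like the second function can be used to locate the small number of records described above before deciding how to treat them.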

Is there a variable indicating whether a food-away-from-home acquisition was purchased from a fast food restaurant?

No, there is no fast food indicator. The PLACETYPE variable may be a rough indicator, but “Mexican Restaurant” could include everything from a Taco Bell to a formal, sit-down Mexican restaurant.

Why do some places have addresses and others do not in the restricted-use data?

Some places are missing addresses because the address was not reported or observed.

Why does the sum of item-level expenditures (either FAH or FAFH) not match event-level expenditures?

The sum of item expenditures and the total expenditures in the event data may not match for several reasons. The TOTALPAID in the event data is the total amount of the purchase reported for the event and may include nonfood items. TOTALPAID also includes food (and possibly non-food) taxes and, if applicable, container deposit fees. If there are any imputed item prices for the event, the imprecision of the imputation process may cause the sum of item prices to not equal TOTALPAID. Finally, there are some items for which missing price data were not imputed, and it is possible that some purchased food items were not reported at all.

To estimate the expenditure on food, the closest estimate will be the sum of the item expenditures, knowing that this is an underestimate, on average, due to missing prices or food items.
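The comparison between summed item expenditures and the event total can be sketched as below. The records are toy data. EVENTID and TOTALPAID are the variable names mentioned in this FAQ, while "item_price" is a hypothetical stand-in for the item expenditure variable, whose actual name should be taken from the item codebooks.

```python
from collections import defaultdict

# Toy records. EVENTID and TOTALPAID are FoodAPS variable names; "item_price"
# is a hypothetical stand-in for the item expenditure variable.
events = [
    {"EVENTID": 1, "TOTALPAID": 25.40},
    {"EVENTID": 2, "TOTALPAID": 8.00},
]
items = [
    {"EVENTID": 1, "item_price": 10.00},
    {"EVENTID": 1, "item_price": 12.50},
    {"EVENTID": 2, "item_price": 5.50},
    {"EVENTID": 2, "item_price": None},  # price missing and not imputed
]

item_totals = defaultdict(float)
missing_prices = defaultdict(int)
for item in items:
    if item["item_price"] is None:
        missing_prices[item["EVENTID"]] += 1
    else:
        item_totals[item["EVENTID"]] += item["item_price"]

for event in events:
    eid = event["EVENTID"]
    # The gap may reflect taxes, deposits, nonfood items, or missing prices.
    gap = event["TOTALPAID"] - item_totals[eid]
    print(f"event {eid}: items={item_totals[eid]:.2f} "
          f"TOTALPAID={event['TOTALPAID']:.2f} gap={gap:.2f} "
          f"missing_prices={missing_prices[eid]}")
```

Tracking the count of missing prices alongside the gap helps separate events where the item sum is merely net of taxes and nonfood items from events where food spending itself is understated.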

What payment types are captured?

Information on method of payment was collected for transactions, so if a respondent used SNAP, WIC, or a different payment method to pay for an acquisition, the method will be recorded in the event-file data. Some events have multiple payment methods reported. For these events, it is generally not possible to match individual food items to payment type. The only exceptions would be when SNAP benefits or a WIC food instrument were used and some of the purchased items are not program eligible.

For information on which program variables were collected, see the following codebooks: FAH Event Codebook, FAFH Event Codebook, and Household Codebook.

Why are some events (not indicated as free) missing payment types?

Payment types are missing from some events if they were not reported or observed on the receipt.

Does the survey provide point-of-sale data for specific cities in the U.S.? Are there any additional geo-referenced data that would cover point-of-sale data for a specific city?

ERS cannot disclose any specific information about city representation. The FoodAPS data were collected to be nationally representative of the U.S. and were not designed for State- or city-level analyses. That is, the sample is too thin at the State or city level, and the sampling weights were not designed to provide city-level estimates.

Other data sources such as the IRI or Nielsen point-of-sale data include household and retail scanner data on household food purchases and retail food sales. The household data are derived from over 120,000 households who report what food products they purchased, when they shopped, and where they shopped. The retail scanner data cover a large portion of retail food sales in the United States and contain billions of transactions by store, outlet type (such as grocery or convenience store), and market area. These data are only available for researchers working on a USDA-sponsored project.

Regarding item-level data, is there one record per unique item or multiple records for two or more of the same item?

Each record in the item-level files does not necessarily represent a unique product. Because the receipt sometimes served as a guide to entering items into the database, the purchase of multiple units of the same item, such as two boxes of a specific cereal, may appear in the data two different ways. If the barcodes were scanned or the receipt recorded the purchase on two separate lines (one for each box), the FAH item data will include two line records, one for each box of cereal. However, if the barcodes were not scanned and the receipt recorded the two boxes on only one line, the data will include one record for the two boxes, but the quantity will be marked as "2". FAFH items are recorded as reported by respondents and, just as with FAH items, each record does not necessarily represent a unique food item. Items can be linked to the event record using EVENTID, which is unique across all FAH and FAFH events and all households.
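Because the same purchase can appear as two records or as one record with a quantity of 2, counting records is not the same as counting units. The Python sketch below shows the difference on toy data. EVENTID is the linking variable named above, while "description" and "quantity" are hypothetical field names used only for illustration.

```python
from collections import Counter

# Toy FAH item records. EVENTID is the FoodAPS linking variable;
# "description" and "quantity" are hypothetical field names.
items = [
    # Event 10: barcodes scanned, so each box of cereal is its own record.
    {"EVENTID": 10, "description": "cereal", "quantity": 1},
    {"EVENTID": 10, "description": "cereal", "quantity": 1},
    # Event 11: receipt listed both boxes on one line, so one record, quantity 2.
    {"EVENTID": 11, "description": "cereal", "quantity": 2},
]

records_per_event = Counter()
units_per_event = Counter()
for item in items:
    records_per_event[item["EVENTID"]] += 1
    units_per_event[item["EVENTID"]] += item["quantity"]

print(dict(records_per_event))
print(dict(units_per_event))
```

Summing the quantity field gives the same number of units for both events even though the record counts differ.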

In a hypothetical scenario, if a family purchased four burgers for dinner at Burger King, and the mother (also the primary respondent) purchased all of the food, would she write down that she purchased four burgers for dinner? Or would her husband and two teenage children who were also tracking their food acquisitions write that they each acquired one "FREE" burger for dinner from Burger King?

Respondents were asked to record each event once, so the primary respondent should have recorded the food event for all household attendees, while the other household members would not. However, it is possible that multiple household members would have recorded the same event and entered only their portion. The contractor, Mathematica, tried to remove such duplications, but there is always a chance that they missed some.

Is it possible to determine if someone purchased a bundled meal such as a kid's meal, as well as the individual components?

Yes, one can identify kid combo meals and the respective items obtained in that combo. However, the ability to link the items in a combo and to see all the items in a combo depends on how the household reported the food information, and if limited detail was reported, on whether the meal/combo could be assumed with near certainty. For example, if the household reported a "kid’s chicken nugget meal" at McDonald’s, but not the separate items, the contractor filled in the items that come standard with such a meal. If the household reported a kid's meal from a different kind of restaurant (Aunt Sally’s, etc.), and did not report the items, then ERS would not be able to fill in such data. Additionally, if multiple combos were purchased, the contractor may not have been able to link the individual items to specific combo meals.

For the most part, the various caveats or limitations mentioned here are a small share of the data.

How can food items that were purchased by weight (random weight items) be identified?

The variables VARWGTCOUNT and VARWGTLBS can be used to identify items purchased by weight. Items with UPCs do not have values for these variables.

Can barcodes in FoodAPS be matched with barcodes in other data sets?

The EAN is the “real” barcode that would typically match with external data, but the EAN variable is 13 digits, not 12. If FoodAPS barcodes are 12 digits, try dropping the first digit from EAN. Users should be able to match about 60 percent of FAH products with IRI point-of-sale (POS) products.
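Dropping the first digit of a 13-digit EAN yields a valid 12-digit UPC-A code only when that first digit is a padding zero. The Python sketch below applies that rule and, separately, computes the standard UPC-A check digit. The function names are hypothetical, the barcodes are treated as text so leading zeros are preserved, and whether both data sets store the check digit is an assumption that should be verified before relying on the check-digit helper.

```python
def upc_check_digit(first_eleven: str) -> int:
    """Standard UPC-A check digit for the first eleven digits of a code."""
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])   # positions 1, 3, ..., 11
    even_sum = sum(digits[1::2])  # positions 2, 4, ..., 10
    return (10 - (3 * odd_sum + even_sum) % 10) % 10


def ean13_to_upc12(ean: str):
    """Drop the leading digit of a 13-digit EAN when it is a padding zero.

    Returns None when the value is not 13 digits or does not start with 0,
    since such codes have no 12-digit UPC-A equivalent.
    """
    ean = ean.strip()
    if len(ean) != 13 or not ean.isdigit() or ean[0] != "0":
        return None
    return ean[1:]


upc = ean13_to_upc12("0036000291452")
print(upc)
print(upc_check_digit(upc[:11]) == int(upc[11]))
```

If a barcode variable has been read as a number rather than text, leading zeros may have been lost, so the length check above will reject it until the value is padded back to its original width.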

For perishables, users may be able to trim a PLU out of the BARCODE and BARCODE_ORIGINAL variables but may have to match by category/product/type.

Why are there some missing item descriptions, quantities, and prices?

Item descriptions were stripped from the data set because the data are proprietary data from IRI. Access to the item descriptions requires a Third Party Agreement (only available for USDA-sponsored projects). Some users may be able to use food category, food groups, or food classifications to substitute for the restricted-item description data.

In most cases, items have prices. If the price data are missing, either the price could not be identified or the item is part of a combo purchased away from home.

How are foods grouped?

Food items are grouped in three main variables. USDAFOODCAT1 identifies nine main food categories. Food items are also grouped into more detailed Thrifty Food Plan (TFP) groups. The TFP groups include 29 different categories.

Variables FOODCODE, USDAFOODCAT1, USDAFOODCAT2, and USDAFOODCAT4 can be used to identify food types.

Can FoodAPS places be matched to the Thrifty Food Plan (TFP) data? What match rate can be expected?

Yes, FoodAPS places can be linked to the TFP data using the linker file PlaceID_TempERSID.xlsx (available to researchers with access to the NORC Data Enclave). This file matches the PlaceID (from faps_places) to an ERS-created ID that is included in the basket prices data set. This allows a user to link IRI store-week prices to places visited by FoodAPS households. Note that the basket price data do not include all stores that FoodAPS households may have visited because the IRI stores from which they were derived do not cover all stores. Because of this, users should expect that about 30 percent of the non-restaurant food retailers visited by FoodAPS households have a match to the basket price data.
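The two-step link (place to ERS-created ID, then ID to basket prices) and the resulting match rate can be sketched as follows. This is a toy Python illustration with in-memory data, not code for the enclave files. PlaceID comes from the description above, while "TempERSID", "week", and "basket_cost" are hypothetical names.

```python
# Toy stand-ins for the three sources. PlaceID comes from the text above;
# "TempERSID", "week", and "basket_cost" are hypothetical names.
places = [{"PlaceID": 101}, {"PlaceID": 102}, {"PlaceID": 103}]
linker = {101: "E-7", 103: "E-9"}  # PlaceID -> ERS-created ID
basket_prices = {
    "E-7": [{"week": "2012-06-03", "basket_cost": 41.20}],
}

matched = []
for place in places:
    ers_id = linker.get(place["PlaceID"])   # step 1: place to ERS-created ID
    prices = basket_prices.get(ers_id, [])  # step 2: ID to store-week prices
    if prices:
        matched.append(place["PlaceID"])

match_rate = len(matched) / len(places)
print(f"matched places: {matched}")
print(f"match rate: {match_rate:.0%}")
```

Keeping unmatched places in the analysis file, rather than dropping them during the merge, makes it easier to report the match rate and to check whether matched and unmatched retailers differ.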

Which of the TFP basket cost measures should be used?

There are two basket cost measures, and it is up to the user to determine which is best for their research. For more information, see the Construction of Weekly Store-Level Food Basket Costs Documentation.

Why are there so many FLAG variables?

The Office of Management and Budget (OMB) requires that any data modifications be flagged. OMB’s Standards and Guidelines for Statistical Surveys specifies that:

Agencies must add codes to collected data to identify aspects of data quality from the collection (e.g., missing data) in order to allow users to appropriately analyze the data. Codes added to convert information collected as text into a form that permits immediate analysis must use standardized codes, when available, to enhance comparability.

Please review the guidelines for additional information.

Are there measures of access to stores that are not SNAP authorized?

There are several summary measures of access to all stores aggregated to the census block group and tract level, as well as at the county level. The Retail Environment Study Codebook details these measures, which are based on a combined list of stores from TDLinx and the SNAP-authorized store list, STARS. Additionally, users may obtain access to TDLinx store locations if they obtain a Third Party Agreement with ERS and Nielsen to use the TDLinx data for a USDA-sponsored project.