DSDR KB: Data Type Questions

  1. I'm interested in looking at birth weight as a measure of child well-being. I can only find this in the Child Development Supplement (CDS), but this is only available from 1997 on.
  2. The PSID is not based on a simple random sample. What variables should I use for complex sample survey variance estimation?
  3. Where can I find out the change in the distribution of religious denominations over time?
  4. I need to know if there is a good source to look up 2000 PUMA geography via maps or lists. I need to get an idea of how different metros and counties line up with PUMAs.
  5. I am looking for the total number of births (no rates), 1975-present, by state. I need for this to be in electronic form if possible.
  6. I am unsure how to merge geographic data to a file that I have. I noticed that the census tract numbers do not stand alone in the data file that I downloaded from the census and that the GEO variable is saved as a text variable so I am unable to merge my data sets using that variable. Would you be willing to show me how to link these two data sets if all I have is the census tract number?
  7. I am looking for estimates of the size of the 15-44 year old female population by state for as far back as I can get it. I am trying to construct fertility rates and have births back to 1930. Even if I cannot get them back to 1930, getting them back to the late 1940s would be very helpful.
  8. I am trying to find some mortality data for the state of Michigan and Montcalm County, Michigan from 1960 to present. We’d prefer age-adjusted figures over age-specific. Do you have any thoughts on how best to obtain these data? This is for a health disparities documentary and the data are needed immediately.
  9. Can you please give me an estimate of the 2006 size of the baby boomer cohort (born 1946 to 1964)?
  10. I downloaded a US Census extract recently and have been running some descriptive statistics. In terms of race, Brazilian immigrants mostly identify as white, "some other race" and "two or more races". I would like to find out how they were classified in the latter 2 categories. Is there any way I can do this? Also, if it is just a matter of creating an extract with specific variables (e.g. "OTHER", NUMRACE), would it be possible to add these variables to my existing dataset? Or would I need to do a completely new data extract with all of the variables?
  11. I need time series data on attitudes in New York City from 1999 to the present.
  12. I am doing an analysis with the PSID and I need a very simple variable – race. Where is the race variable?
  13. Is there a document or documents that compares occupation codings in the census over time . . . say 1960-2000?
  14. I would like to get a copy of the Census 1990 elderly households 8% sample dataset and codebook. Please let me know if it is available and the procedure for obtaining it.
  15. I want to compare the mobility of siblings in the PSID. How can I find siblings considering they no longer live in their family of origin?
  16. Do you know if the 1960 census asks the same ‘where did you live 5 years ago’ question that was asked in 1970-2000?
  17. Do you know where I can find a list of the variables that are available in the summary files for the 2000 census down to the block level? I should know the answer to this by now.
  18. Do you know if there is a way to construct a Hispanic measure in the 1960 PUMS? I assume by the 1970 census there is a Hispanic category. Is that right?
  19. The PSID changed their sample in 1997 - reduced the number of original families that are included in the survey. How does this affect the weights?
  20. I cannot find the 2000 version of the 1990 “School District Data Book.” I know that there is a 2000 version, since the National Center for Education Statistics has it on their website. The NCES interface is extraordinarily cumbersome to use.
  21. I need data from several years of the March Current Population Survey but notice that these data do not have set-up files associated with them.
  22. I need national data from Summary File 3 from the 2000 Census. I need data at the county, census tract, and block level (summary levels 40, 140, and 150). This looks like a lot of files to download for just these summary levels. Is there a better solution?
  23. I am working with the National Comorbidity Survey Replication (NCS-R) and I want to know the exact sampling method, universe eligibility, etc. for the following items: TB15L (self-report of tobacco as causation of emotional problems); SC21, SC22, SC23 (variables related to depression); DA31B_101 (religious preference); DA40 (age of mother when you were born); PEA52 (personality question, “I often feel empty inside”).
  24. Do you know if the summary files include information on the US protectorates in the Pacific (Guam, Marshall Islands, Samoa, Micronesia, etc.) and the Virgin Islands? I know the Census Bureau conducts a census in these locations.
  25. Is there a public use version of the ACS, similar to PUMS, with household data? Or just the interface where you get custom tables?
  26. Do you know anything about the National Longitudinal Mortality Study? I am interested in whether the cause of death (CAUSE) is available in the public use data.
  27. The 1970 natality detail file does not have FIPS codes to identify states and counties. It has NCHS codes. Is there a crosswalk between the two?
  28. What level of geography do natality detail files go down to?
  29. I am considering using Geolytics Neighborhood Change Database for some longitudinal analyses that I am doing with death certificate data geocoded at the county level. However, I am a little unclear if the NCDB has info for nonmetro counties. The documentation is a bit confusing on this point. Could you advise?
  30. I am using microdata from the 2000 Census of Population and Housing. I have found some families with a code of "0" on family income. Is this an error on the part of the Census Bureau? If not, what is the explanation?
  31. Why are there zero weights in the 1990 public use microdata file for the U.S. census?
  32. I am looking at the life history calendar for the Chitwan Valley (Nepal) study data. The total number of observations listed is 5,271, but there are only 1,469 observations in the life history data. What is the solution?
  33. ICPSR has the Early Head Start Research and Evaluation (EHSRE) study but it only has the public use file. Where is the restricted version of the data and what are the access conditions?
  34. I want to create a kml file using zip code data and boundaries. Can you point me in the right direction?
  35. I am generating some age-specific counts for various neighborhoods to compare with data in the Research Data Center (RDC). However, I am having trouble getting these counts to agree with what previous workers on this project have generated. Can you take a look at my numbers vs the project counts for our neighborhoods?
  36. I have found some estimates data on the Census Bureau website, but I have no idea how to look at it. Can you help me?
  37. I am interested in public use microdata from the Survey of Consumers. I would like it for as many years as possible. I do not know if the producers readily give this out but I know of researchers who have access to the data.
  38. The Pew Hispanic Center put out a report last November with race, age and citizenship status info from the September 2007 CPS. (See table below from report.) I didn’t know you could get race info from anything other than the March CPS. What do you know about race information available for other months?
  39. I am trying to get a sense of the American Community Survey (ACS). It is harder than I would have thought. Am I correct in thinking that they survey approximately 3 million households per year? Are the samples independent of each other - e.g., 2005 and 2006? Finally, they do not seem to ask about country of birth. This seems odd although you might be able to make an estimate with ancestry and citizenship.
  40. I am interested in creating some tables from the 2006 General Social Survey. It looks like ICPSR has the data. How easy is it to make a 4-way table – Political affiliation * Age * Education * Race/Ethnicity?
  41. Can you help me find a public-use codebook for Baccalaureate and Beyond (B&B) and the National Education Longitudinal Study (NELS)?
  42. I have some census characteristics that I want to put into GIS. However, I am having trouble doing so because the FIPS codes in my data are represented as numbers; e.g., 1 for Alabama instead of 01. Likewise, Autauga county is represented as 1 instead of 001. I need an ID variable for Autauga county, Alabama that looks like 01001. How can I convert numeric data back to text in Excel? Right now, state is in one column and county is in the next column.
  43. How is the residence of prisoners noted on death certificates - the prison or the original home address? What about other institutional populations (dorms, hospitals, etc.)?
  44. I am looking for the size of the 18-19 year old population in the US over time. This must be available from the Census Bureau but I am not finding anything.
  45. Why do the counts from SF1 and SF3 differ?
  46. I am using the IPUMS data and for 1970 the only file available is the 1970 metro, form 2. Is this a 1% file?
  47. I want to download the 2006 ACS microdata from the Census Bureau website as a Stata file. Can you point me to the location where I can do this?
  48. I am looking for census data by gender on age, marital status, education, employment status, income, and race for congressional districts for each Congress from the 103rd to the 110th.
  49. I want to apply for the restricted version of PSID data (geocode). Can you clarify what sort of secure environment PSID requires?
  50. I have looked at census data on commuting patterns, but there is no information on the characteristics of the commuters. How can I get data on commuters? I am interested in the townships surrounding Philadelphia. I also need maps that show townships for Pennsylvania. I'd like the maps to include major roads.
  51. I want to use the NLS97 cohort for a study of religious affiliation of youth. I am having trouble finding religious preference.
  52. Do you know of a source of vital statistics data for the United Kingdom and Canada? My ideal data would be single years of age and cause of death for many years.
  53. I want to combine two years of the March CPS to compare with data from a survey that took place over a two-year period. The survey took place in 1994 and 1995.
  54. Does the American Community Survey (ACS) have data for zip codes?
  55. I need a longitudinal data file that has multiple measures of blood pressure. The sample should include women of all races.
  56. Are there data for Washtenaw County in the ACS yet? How many counties in Michigan are available?
  57. Is there a way to get a count of the number of single women over 40 who moved in the past year?
  58. I am trying to match data for 2006 and 2007 from the March CPS so that I can track respondents across the two years. I downloaded the NBER matching files, but they don't seem relevant to data this recent. I also read the information about matching in the CPS codebook, but it's not clear to me how to operationalize their suggestions. Would it be possible to get some assistance with this issue?
  59. Where can one get abortion data? Right now I need data for the state of New York, but I'd like to have it for all states.
  60. I have a neighborhood that I define by census tracts. How can I create characteristics for these neighborhoods in SAS?
  61. I need many tables from the summary census data for 1990 at the census tract and block group level. I find American FactFinder difficult to use because I can only get data for a single tract at a time. I need all census tracts and block groups in Los Angeles County. Is there a better solution?
  62. I have some students working for me who are unfamiliar with statistical packages. Is there a way that they can pull off data for all census tracts in the tri-county area for Detroit? I have specific tables that I want for 1990 and 1980.
  63. Does PSID have a value of car variable for the time period 1987-1997?
  64. Can I identify the lower 9th ward in New Orleans by zip code or census tracts?
  65. I need median household income for all zip codes in California. Where can I get access to this?
  66. I want to combine the 2005 and 2006 ACS microdata. What do I do to the weights?
  67. I have a student looking to measure the geographic dispersion of families (at least in the U.S., and other places if possible), specifically: 1) how many families with children at home have grandparents who are out of state and 2) how many divorced parents with children at home are not co-located in the same state (e.g., child lives with one parent and the other is out of state or child spends time with each parent in separate states). There seem to be lots of measures of an individual household's mobility (from the U.S. Census and CPS per se), and the U.S. Census even has a measure of grandparents living in the same household with children. Do you know of any nationally representative data that will fit the bill?
  68. What are the geographic areas available for the 2007 ACS?
  69. The 2007 ACS data were released today. How long will it take for us to get the microdata?
  70. I am using data from selected zip codes in California for 1980 to 2000. How can I tell if the boundaries for these zip codes have remained the same?
  71. I am using data from the American Community Survey (ACS) for Monroe and Lenawee counties in Michigan. The tables from American FactFinder have margins of error for all the cells. However, if I am combining the two counties is there a way to calculate new margins of error based on this larger population?
  72. I have a student who is interested in post-retirement employment patterns and determinants. She is very interested in looking at trends in the experience and correlates of post-retirement employment. We are thinking about using the CPS and taking advantage of the "month in sample" variables to construct some short longitudinal files so that we can estimate the prevalence and determinants of returning to work after retirement. So here are my questions: (1) If a respondent reports being retired at time t, are they asked questions about employment in subsequent interviews? (2) Do you know of any technical bulletins or similar information that lay out exactly how to link respondents across their multiple interviews?
  73. I have some health areas that are defined on the basis of census tracts. When I try to get race-specific results using summary file data from the census (2000) using American FactFinder, I find that some of my census tracts are missing. I know these tracts exist. What is going on?
  74. I have a student who is looking for annual data for each state on the party composition of their state legislatures, specifically the percent of Democrats in each state legislative chamber.
  75. I am using a restricted data file that has zip codes for the geocode ID. I need to add some race-specific characteristics to the zip codes. However, zip codes are not iterated by race the way other geographies are (e.g., states, counties, census tracts). Is there a way around this?
  76. I need data for all zip codes in the nation. What is the easiest way to obtain characteristics on zip codes?
  77. I use NHGIS to get summary data from the census - not just historical data, but even the 2000 Census. However, when I try to get data for zip codes, I can only get data for one zip code at a time. Is there a solution for this?
  78. Can one get data on Polish Americans from the census?
  79. I am trying to work with the Early Childhood Longitudinal Survey Kindergarten cohort data or some other data that will allow me to look at the impact of school year length (especially long school year calendars) on student achievement. Do you know of any other sources of education data?
  80. I have a student who is trying to find time series physical capital stock from years more recent than are provided by the Penn World Table. Do you know if these data are available anywhere post-early 1990's? A search of the literature points to Nehru and Dhareshwar (1993) dataset, which can be found at
  81. We were wondering if you might know of a center or individual on campus with the 2001 Canadian Census Public Use Sample. It does not appear that ICPSR has it, and Statistics Canada only appears to offer it for a fee. We might also be interested in looking at the 1996 data.
  82. A grad student is looking to compare birth data for specific days across several years (2000-2002) in the United States. Does anyone know where these data may be found?
  83. Where can I find data on exact dates of death?
  84. How do I concatenate two files in SAS?
  85. I am interested in examining American Indian populations. Do you have an idea if the ACS includes American Indians, on and off the reservations?
  86. I am looking for the number of US births in each month from January 1985 to December 1994. I have looked on NCHS's website and cannot find it. Can you help me?
  87. I am a doctorate student and am interested in analyzing Add Health and NLSY data. Where can I get permission to access the data? Could I publish a paper with the data analyzed?
  88. What proportion of American Community Survey (ACS) interviews end up in the ACS microdata samples?
  89. Do you have a quick way to access the Current Population Surveys (March supplements) prior to 1992? I am trying to get a small extract from each year, but the DataFerrett access only lets me go back to 1992.
  90. A student of mine is interested in a survey called the Longitudinal Survey of Immigrants to Canada. Can you help her with access?
  91. I am interested in using the "housework" question in the Canadian census. IPUMS-i has the data, but it looks like they left out that variable.
  92. I need a data file that is gathered monthly so that I can examine the change in optimism for the population. I need for the data collection to have taken place in 2008.
  93. I have run into an ArcMap mxd file that has a cell that I am interested in joining to that is defined as 'string'. This is happening in the counties template within my ArcGIS version 9. The cell has a 5 digit value representing a combined state and county FIPS code (e.g., 26183). How do I go about converting the cell to a numeric one?
  94. What is the difference between OCCSOC and OCCCEN in the 2000 census (and the ACS)?
  95. How can an intercensal estimate change? I have an estimate from a P-25 report for July 1, 1977 and it does not agree with what is on the Census Bureau estimation web site.
  96. What does "data defined person" mean? It is an item in the microdata for 2000.
  97. I am pulling race-specific county level data from the 1990 summary file data. I have to use the original summary files because neither American FactFinder nor NHGIS iterates characteristics across all the race groups I need. However, I am getting strange results. I get multiple records for my counties.
  98. Where can I find out more information on the quality of the data used in the American Community Survey?
  99. How can I make a map without having special software?
  100. I need more information on IPUMS. Where can I find it?
  101. What are the sizes of the geographic units for the 2006 ACS and how are they determined?
  102. How do you create county level data using microdata files?
  103. What are the sizes of the geographic units for the 2007 ACS and how are they determined?
  104. What types of institutions are considered "group quarters" in the 2006 ACS? How will the inclusion of GQ affect comparisons with previous ACS?
  105. How do I use American FactFinder?
  106. Is the Master Address File that the Census Bureau uses for data collection public use or not?
  107. I am interested in creating a contextual file for counties, places, and MCDs from 1970-2000. What is the best source for this? I want my items to be measured the same over time. I cannot really determine the feasibility of this with Geolytics' Neighborhood Change Database.
  108. When will new zip code data be made available?
  109. I want to create an annual file with census-type characteristics of counties in California. What is the best source for this? This needs to be current, but I want it updated every year.
  110. I need a table of poverty status and tenure for Washtenaw and Wayne Counties. However, I am defining the poor population as below 2x of the poverty line rather than the normal cutpoint.
  111. Can you help me find the University of Michigan Inflation Expectation Survey?
  112. Do you know of a source for data on Southern Cities before World War II? I want to look at racial residential segregation during the interwar period in Houston, Little Rock, Atlanta and Raleigh. I do not want to rely on ward data, since the size, shape, and number of wards within each city changed often during this period.
  113. I need historical data on life expectancy and child outcomes for Nepal and how this compares to the US historically. In other words, when did the US look like Nepal in 1970?
  114. I am publishing an article based on the public use NLS79 file. Do these data need to be acknowledged like the restricted data?
  115. Can you help me with historical data on rates of homicide and aggravated assault for the 1940s and 1950s for large American cities?
  116. Can one interpolate between the 2000 and 5-year ACS (2005-2009) to get characteristics of census tracts? In other words, can the 5-year data be thought of as a snapshot for 2010?
  117. Is there any source that reports percent of rural population in the USA by race (1965-2010)?
  118. What is up with the allocation item for earnings in the March CPS? It drops from a reasonable 22% to around 2% between 1987 and 1988. I am using data from IPUMS-CPS (qincwage).
  119. Where can I get exact date of birth data for the US?
  120. Where can I find a digital version of birth data for Michigan counties, for the 1950s and 1960s?
  121. I have funny results for the distribution of education using 1950 IPUMS data. What am I doing wrong?
  122. I recently downloaded the Excel files "Median Household Income and Mean Household Income [2006-2010]". I was confused about the differences between different sheets in the file. For example, the median incomes on the "national" sheet don't match up with the median incomes on the "median" sheet. Is there a mistake?
  123. I am examining the relationship between metropolitan-level foreclosure and racial residential segregation in U.S. cities between 1990 and 2010. I am including characteristics which can be found in decennial census surveys and ACS estimates. One variable absent from these sources is the year in which the largest city in the metro reached a population size of 50,000. I was curious to know of any source that has already compiled this.


