
Dougherty: Introduction to Econometrics 3e

About the NLSY

NLSY panel data set

(Used in Exercises 14.5–14.8 in the text)

The data set is a subset of a major US database, the National Longitudinal Survey of Youth (NLSY79). NLSY79 is a panel survey in which a nationally representative sample of young males and females aged 14 to 21 in 1979 has been re-interviewed since 1979. Until 1994 the interviews took place annually; since then they have been conducted at two-yearly intervals. The core sample originally consisted of 3,003 males and 3,108 females. In addition, there are special supplementary samples (some now discontinued) of ethnic minorities, those in poverty, and those serving in the armed forces. Extensive background information was obtained in the base-year survey in 1979, and since then information has been updated at each interview on education, training, employment, marital status, fertility, health, child care, and assets and income. Special sections have also been added from time to time on other topics, for example drug use. The surveys are extremely detailed and very well executed. As a consequence, NLSY79 is regarded as one of the most important databases available to social scientists working with US data.

The data relate to the years 1980–1994, 1996, 1998, and 2000. Note that many data are missing. Obviously, if a respondent was not interviewed in a given year, all data for that year are missing. In addition, many data are missing for reasons specific to particular variables (for example, the UNION question was asked in 1988–2000 only).

The data are restricted to males whose marital status is either single or married, who are not in school, for whom ASVAB scores are available, who worked at least 30 hours per week, and whose reported hourly rate of pay was at least $2.50 and not more than $250.
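The restrictions above can be sketched as a simple filter over respondent-year records. This is only an illustration, not the procedure actually used to build the data set: the field names IN_SCHOOL and REPORTED_PAY are hypothetical (the variable list contains no schooling-status variable, and EARNINGS is expressed in 1996 constant dollars rather than as the reported rate), and missing values are assumed to be coded as None. Sex is not checked because the data set already contains males only.

```python
def meets_sample_restrictions(row):
    """Return True if one respondent-year record passes the restrictions checked here."""
    # Marital status: exactly one of the two dummies must be set.
    if row["MARRIED"] + row["SINGLE"] != 1:
        return False
    # Not in school (IN_SCHOOL is a hypothetical 0/1 field).
    if row["IN_SCHOOL"] != 0:
        return False
    # ASVAB scores available (None marks a missing value in this sketch).
    if row["ASVABC"] is None:
        return False
    # At least 30 hours worked per week.
    if row["HOURS"] is None or row["HOURS"] < 30:
        return False
    # Reported hourly pay between $2.50 and $250 inclusive
    # (REPORTED_PAY is a hypothetical field).
    pay = row["REPORTED_PAY"]
    return pay is not None and 2.50 <= pay <= 250


# Made-up records, for illustration only.
records = [
    {"ID": 1, "MARRIED": 1, "SINGLE": 0, "IN_SCHOOL": 0,
     "ASVABC": 52.1, "HOURS": 40, "REPORTED_PAY": 12.50},
    {"ID": 2, "MARRIED": 0, "SINGLE": 1, "IN_SCHOOL": 0,
     "ASVABC": 47.3, "HOURS": 20, "REPORTED_PAY": 8.00},   # too few hours
    {"ID": 3, "MARRIED": 0, "SINGLE": 0, "IN_SCHOOL": 0,
     "ASVABC": 55.0, "HOURS": 45, "REPORTED_PAY": 15.00},  # neither single nor married
    {"ID": 4, "MARRIED": 0, "SINGLE": 1, "IN_SCHOOL": 0,
     "ASVABC": 49.8, "HOURS": 35, "REPORTED_PAY": 2.00},   # pay below $2.50
]

kept = [r["ID"] for r in records if meets_sample_restrictions(r)]
print(kept)  # [1]
```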

The variables listed below were recorded for each respondent for each of the years 1980–1994, 1996, 1998, and 2000. Hence there are potentially 18 observations for each respondent. However, owing to non-interviews or exclusions, the actual number is lower for many respondents and the panel is of the unbalanced type.
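The count of 18 potential observations, and what it means for the panel to be unbalanced, can be checked with a short sketch. It assumes each record carries the respondent ID plus a year field; the name YEAR is hypothetical, since no year variable appears in the list below, and the records are made up for illustration.

```python
from collections import Counter

# Annual interviews 1980-1994, then the two-yearly rounds included in the data.
SURVEY_YEARS = list(range(1980, 1995)) + [1996, 1998, 2000]


def observations_per_respondent(rows):
    """Count how many respondent-year records each ID contributes."""
    return Counter(row["ID"] for row in rows)


def is_balanced(rows):
    """True if every respondent is observed in every survey year.

    Assumes there are no duplicate (ID, YEAR) records.
    """
    counts = observations_per_respondent(rows)
    return all(n == len(SURVEY_YEARS) for n in counts.values())


# Respondent 1 is observed in every year, respondent 2 in only three.
rows = [{"ID": 1, "YEAR": y} for y in SURVEY_YEARS]
rows += [{"ID": 2, "YEAR": y} for y in (1980, 1981, 1996)]

counts = observations_per_respondent(rows)
print(len(SURVEY_YEARS))      # 18
print(counts[1], counts[2])   # 18 3
print(is_balanced(rows))      # False
```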

 

Personal variables

Variable  Type  Description
ID        C     respondent identification number
AGE       C     age
AGESQ     C     square of AGE
S         C     years of schooling (highest grade completed)

Ethnicity:
ETHBLACK  D     black
ETHHISP   D     Hispanic

HEIGHT85  C     height in inches in 1985
WEIGHT    C     weight in pounds

Score on a component of the ASVAB battery (scaled with mean 50, standard deviation 10):
ASVAB2    C     arithmetic reasoning
ASVAB3    C     word knowledge
ASVAB4    C     paragraph comprehension
ASVABC    C     composite of ASVAB2 (with double weight), ASVAB3 and ASVAB4

SM        C     mother’s years of schooling
SF        C     father’s years of schooling
SIBLINGS  C     number of siblings
CHILDREN  C     number of children in the household
YOUNGEST  C     age of youngest child
MARRIED   D     married in the interview year
SINGLE    D     single in the interview year
SINGBOTH  D     single in the interview year and four years later
SOONMARR  D     single in the interview year but married four years later
URBAN     D     living in an urban area

Region of residence (census classification):
REGNE     D     north-east
REGNC     D     north-central
REGW      D     west
REGS      D     south
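Several of the personal variables are related by construction, which allows some basic consistency checks on a record. The sketch below assumes that the dummies are coded 0/1, that region of residence is recorded for the observation, and (given the sample restriction to single or married men) that exactly one of MARRIED and SINGLE is set. The function name and the example records are illustrative only.

```python
REGION_DUMMIES = ("REGNE", "REGNC", "REGW", "REGS")


def check_personal_variables(row):
    """Return a list of consistency problems found in one record."""
    problems = []
    # The four census regions are exhaustive and mutually exclusive.
    if sum(row[name] for name in REGION_DUMMIES) != 1:
        problems.append("region dummies do not sum to 1")
    # The sample is restricted to men who are either single or married.
    if row["MARRIED"] + row["SINGLE"] != 1:
        problems.append("MARRIED and SINGLE do not sum to 1")
    # AGESQ is defined as the square of AGE.
    if row["AGESQ"] != row["AGE"] ** 2:
        problems.append("AGESQ is not the square of AGE")
    return problems


# Made-up records, for illustration only.
good = {"AGE": 30, "AGESQ": 900, "MARRIED": 1, "SINGLE": 0,
        "REGNE": 0, "REGNC": 1, "REGW": 0, "REGS": 0}
bad = dict(good, AGESQ=890, REGS=1)  # wrong square, two regions set

print(check_personal_variables(good))  # []
print(check_personal_variables(bad))
```

Because the four region dummies sum to one, a regression with an intercept should include at most three of them, with the omitted region serving as the reference category.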

 

Work-related variables

Variable  Type  Description
EARNINGS  C     current hourly earnings in 1996 constant dollars
HOURS     C     hours worked per week
TENURE    C     years worked with present employer
TENURESQ  C     square of TENURE
EXP       C     total years of work experience
EXPSQ     C     square of EXP

Sector of employment:
CLASSPRI  D     private sector employee
CLASSPUB  D     public sector
CLASSSE   D     self-employed

UNION     D     member of a union (question asked 1988–2000 only)
UNCOLB    D     wages set by collective bargaining

 

C indicates a continuous variable, D a dummy variable.
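The C/D coding can be used as a small codebook when validating records. The sketch below covers the work-related variables only and assumes that dummy variables are coded 0/1 and that missing values (for example UNION before 1988) are represented as None. Both conventions are assumptions rather than documented features of the data files.

```python
# Type codes for the work-related variables: C = continuous, D = dummy.
VARIABLE_TYPES = {
    "EARNINGS": "C", "HOURS": "C", "TENURE": "C", "TENURESQ": "C",
    "EXP": "C", "EXPSQ": "C",
    "CLASSPRI": "D", "CLASSPUB": "D", "CLASSSE": "D",
    "UNION": "D", "UNCOLB": "D",
}


def invalid_dummy_values(row):
    """Return the names of dummy variables whose value is not 0, 1 or missing."""
    return [
        name
        for name, kind in VARIABLE_TYPES.items()
        if kind == "D" and row.get(name) not in (0, 1, None)
    ]


# Made-up records, for illustration only.
early = {"CLASSPRI": 1, "CLASSPUB": 0, "CLASSSE": 0, "UNION": None, "UNCOLB": 0}
miscoded = {"CLASSPRI": 1, "CLASSPUB": 0, "CLASSSE": 0, "UNION": 2, "UNCOLB": 0}

print(invalid_dummy_values(early))     # []
print(invalid_dummy_values(miscoded))  # ['UNION']
```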