Input commands are in blue.
Statements generated by DATAB are in green.
Goal of the analysis: To create a table of person-years for an occupational cohort study that includes information on time-dependent cumulative exposures.
Data: The data are from a study of cancer incidence in a cohort of Chinese workers with occupational exposures to benzene. There are up to 7 annual exposure measurements and a relatively complex set of criteria for determining eligibility for follow-up.
Describe the input variables and their format. These data cannot be read as free format.
NAMES exptype city sex inc_stat inc_icd cause4
meas1 meas2 meas3 meas4 meas5 meas6 meas7
byr bmo bdy entyr entmo entdy
exp1yr exp1mo exp1dy dxyr dxmo dxdy
wu77yr wu77mo wu77dy exityr exitmo exitdy
id @
FORMAT '(f1.0,f2.0,f1.0,f1.0,f3.0,t6,f3.1,t16,7f7.2,9f2.0,t113,3f2.0, t131,6f2.0,f5.0)' @
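The FORMAT string above is a Fortran-style fixed-column layout. As a rough illustration of how it maps variables to columns, the following Python sketch expands it into column positions. The helper names `column_specs` and `read_field` are hypothetical and not part of DATAB, and returning `None` for a blank field is an assumption, since the source does not say how blanks are handled.

```python
import re

FORMAT = ("(f1.0,f2.0,f1.0,f1.0,f3.0,t6,f3.1,t16,7f7.2,9f2.0,"
          "t113,3f2.0, t131,6f2.0,f5.0)")

def column_specs(fmt):
    """Expand a format string into (start, width, decimals) tuples (0-based start)."""
    specs, pos = [], 0
    for item in fmt.strip("() ").split(","):
        item = item.strip().lower()
        tab = re.fullmatch(r"t(\d+)", item)
        if tab:                                  # tN: jump to column N (1-based)
            pos = int(tab.group(1)) - 1
            continue
        field = re.fullmatch(r"(\d*)f(\d+)\.(\d+)", item)
        if field is None:
            raise ValueError(f"unsupported edit descriptor: {item!r}")
        repeat, width, decimals = field.groups()
        for _ in range(int(repeat or 1)):        # e.g. 7f7.2 yields seven fields
            specs.append((pos, int(width), int(decimals)))
            pos += int(width)
    return specs

def read_field(line, start, width, decimals):
    """Read one Fw.d field; an explicit decimal point overrides the implied one."""
    text = line[start:start + width].strip()
    if not text:
        return None                              # assumption: blank means missing
    if "." in text:
        return float(text)
    return int(text) / 10 ** decimals
```

Read this way, the format yields 32 fields, matching the 32 names in the NAMES statement. The `t6` descriptor tabs back to column 6, so cause4 is read from the same three columns as inc_icd, but with one implied decimal place.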
Use constant transformations to define values related to the exposure lag.
CONSTANT #yrlag = 3; ! Years of lag
#lag = #yrlag * 365.25 @ ! Turn lag into number of days
Enter a series of transformations that define the outcome variables.
(Additional transformations of this type are not shown here.)
TRAN
! define case by checking against current cause code
IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 )) or
(( inc_icd >= 280 ) and ( inc_icd <= 289 ))
) THEN AHP = 1;
ELSE AHP = 0;
ENDIF;
IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 )) or
( inc_icd == 287 )
) THEN HPD = 1;
ELSE HPD = 0;
ENDIF;
IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 ))
) THEN HPM = 1;
ELSE HPM = 0;
ENDIF;
IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 204 )) or
( inc_icd == 210 ) or ( inc_icd == 214 )
) THEN LHM = 1;
ELSE LHM = 0;
ENDIF;
IF inc_stat == 1 and (
( inc_icd == 200 ) or
(( inc_icd >= 202 ) and ( inc_icd <= 204 )) or
( inc_icd == 210 ) or ( inc_icd == 214 )
) THEN LHX = 1;
ELSE LHX = 0;
ENDIF @
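For readers more familiar with a general-purpose language, the five outcome definitions above can be sketched in Python as a table of ICD code ranges. The names `OUTCOME_CODES` and `outcome_flags` are illustrative only; the ranges are copied from the transformations shown.

```python
# Inclusive ICD code ranges for each outcome, taken from the TRAN block above.
OUTCOME_CODES = {
    "AHP": [(200, 219), (280, 289)],
    "HPD": [(200, 219), (287, 287)],
    "HPM": [(200, 219)],
    "LHM": [(200, 204), (210, 210), (214, 214)],
    "LHX": [(200, 200), (202, 204), (210, 210), (214, 214)],
}

def outcome_flags(inc_stat, inc_icd):
    """Return a 0/1 indicator per outcome; only incident cases (inc_stat == 1) count."""
    flags = {}
    for name, ranges in OUTCOME_CODES.items():
        in_range = any(lo <= inc_icd <= hi for lo, hi in ranges)
        flags[name] = int(inc_stat == 1 and in_range)
    return flags
```

For example, `outcome_flags(1, 201)` sets every indicator except LHX, because code 201 is excluded from the LHX definition.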
Additional transformations to define various dates used in the construction of the table.
! Add 1900 to years of birth, followup, endpoint date
TRAN
byr = byr + 1900;
dxyr = dxyr + 1900;
entyr = entyr + 1900;
exityr = exityr + 1900;
exp1yr = exp1yr + 1900;
wu77yr = wu77yr + 1900;
! Calculate Julian days for entry and exposure dates
entj=julian(entyr,entmo,entdy);
expj =julian(exp1yr,exp1mo,exp1dy);
! Date of cut points for exposure periods
cut1=julian(1949,1,1);
cut2=julian(1960,1,1);
cut3=julian(1965,1,1);
cut4=julian(1970,1,1);
cut5=julian(1975,1,1);
cut6=julian(1980,1,1);
cut7=julian(1985,1,1);
cut8=julian(1988,1,1);
! Reassign entry date to exposure date (omits prexposed person years)
IF (expj>entj) and (exptype==1) THEN
entyr = exp1yr; entmo = exp1mo; entdy = exp1dy;
ENDIF;
! If incidence case, reassign exit to diagnosis;
IF (inc_stat==1) THEN
exityr = dxyr; exitmo = dxmo; exitdy = dxdy;
ENDIF;
! Calculate julian Days at exit and entry to work unit 77
exitj =julian(exityr,exitmo,exitdy);
wu77j =julian(wu77yr,wu77mo,wu77dy);
! Reassign exit date to work unit 77 date
! omits unknown level of exposed time
IF (wu77j<exitj) and (exptype==1) THEN
exityr = wu77yr; exitmo = wu77mo; exitdy = wu77dy;
inc_stat = 0;
ENDIF; @ ! End of TRAN statement
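The order of the three date reassignments matters, so a small Python sketch of the same logic may help. It assumes every date is present and already has a four-digit year, and it compares `datetime.date` values directly rather than day numbers from julian(). The function name `adjust_follow_up` is hypothetical.

```python
from datetime import date

def adjust_follow_up(exptype, inc_stat, entry, first_exposure,
                     diagnosis, wu77, exit_):
    """Apply the entry/exit reassignments in the same order as the TRAN block."""
    # Exposed workers start follow-up at first exposure if that is later than entry.
    if exptype == 1 and first_exposure > entry:
        entry = first_exposure
    # Incident cases leave follow-up at diagnosis.
    if inc_stat == 1:
        exit_ = diagnosis
    # Exposed workers are censored on entering work unit 77 and lose case status.
    if exptype == 1 and wu77 < exit_:
        exit_ = wu77
        inc_stat = 0
    return entry, exit_, inc_stat
```

Because the diagnosis date replaces the exit date before the work unit 77 check, a case diagnosed after moving to work unit 77 is censored at the move and no longer counts as an incident case.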
Define arrays to hold the measurement interval cutpoint and exposure value variables.
ARRAY \cutp cut1 - cut8 @
ARRAY \twe meas1-meas7 @
Designate city as a categorical variable and indicate that a table is to be created. Specify a resolution of 1 year for evaluating the time scales. (The default is to determine this from the category definitions.)
LEVELS city 1:5 @
TABULATE over city @
RESOLUTION 1 @
Define cumulative exposure categories. Use TTRAN to specify the time-dependent transformations that define the cumulative exposure variable.
TDEP cumexpg
0 0 ] "never"
0 10 "0-10"
10 20 "10-20"
20 30 "20-30"
30 40 "30-40"
40 50 "40-50"
50 60 "50-60"
60 70 "60-70"
70 80 "70-80"
80 90 "80-90"
90 100 "90-100"
100 120 "100-120"
120 140 "120-140"
140 160 "140-160"
160 180 "160-180"
180 200 "180-200"
200 250 "200-250"
250 500 "250-500"
500 1000 "500-1000"
1000 999999 ">1000"
TTRAN cumexpg=0;
#i = 1;
DOWHILE ((#i <= 7) and (caltm >= expj));
en_i = MAX(\cutp(#i),expj);
ex_i=MIN(\cutp(#i+1),(caltm-#lag));
IF (ex_i <= en_i) THEN cum_i = 0;
ELSE cum_i = ((ex_i-en_i)* \twe(#i))/365.25;
ENDIF;
cumexpg = cumexpg + cum_i;
#i = #i + 1;
ENDDO;
cum_exp = cumexpg; @
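The TTRAN loop accumulates exposure period by period, counting only time that lies at least #lag days before the current calendar time. A minimal Python sketch of the same arithmetic follows. It assumes caltm and expj are day numbers on a common scale (here `date.toordinal()` stands in for julian()), and the function name `cumulative_exposure` is hypothetical.

```python
from datetime import date

DAYS_PER_YEAR = 365.25

# Cut points cut1..cut8 from the TRAN block, converted to day numbers.
CUTS = [date(y, 1, 1).toordinal()
        for y in (1949, 1960, 1965, 1970, 1975, 1980, 1985, 1988)]

def cumulative_exposure(caltm, expj, meas, lag_years=3):
    """Lagged cumulative exposure at calendar time caltm.

    meas holds the seven period-specific exposure values (meas1..meas7).
    """
    lag = lag_years * DAYS_PER_YEAR
    total = 0.0
    i = 0
    while i < 7 and caltm >= expj:
        start = max(CUTS[i], expj)             # exposure cannot begin before expj
        stop = min(CUTS[i + 1], caltm - lag)   # nor count within the lag window
        if stop > start:
            total += (stop - start) * meas[i] / DAYS_PER_YEAR
        i += 1
    return total
```

For instance, a worker first exposed on 1 January 1970 at a constant level of 10 has accumulated 20 exposure-years at a calendar time five years (1826.25 days) later, since the final three years fall inside the lag.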
Continue with the definition of additional category variables. Exposure type and sex do not depend on time, while age and calendar time are time scales.
CATEGORY exptype 0 1 unexposd 1 1 exposed @
CATEGORY sex 1 2 male 2 2 female @
! Time from the date of birth
TIME age FROM byr MONTH bmo DAY bdy AS ageg
0 / 15 / 20 / 25 / 30 / 35 / 40 / 45 / 50 / 55 / 60 / 99 @
CALENDAR caltm TO 1987 12 31
1972 1 1 @
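To show what categorizing a time scale involves, the following Python sketch splits one person's follow-up across the age cut points given above. It is a simplified illustration and not DATAB's algorithm: it assumes a year is 365.25 days, ignores the calendar and exposure scales, and uses the hypothetical name `person_years_by_age`.

```python
from datetime import date

DAYS_PER_YEAR = 365.25
AGE_CUTS = [0, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 99]

def person_years_by_age(birth, entry, exit_):
    """Return {(lower_age, upper_age): person-years} for follow-up from entry to exit."""
    b, start, stop = birth.toordinal(), entry.toordinal(), exit_.toordinal()
    result = {}
    for lo, hi in zip(AGE_CUTS, AGE_CUTS[1:]):
        # Clip the follow-up interval to the days spent in this age band.
        band_start = max(start, b + lo * DAYS_PER_YEAR)
        band_stop = min(stop, b + hi * DAYS_PER_YEAR)
        if band_stop > band_start:
            result[(lo, hi)] = (band_stop - band_start) / DAYS_PER_YEAR
    return result
```

A person born in 1950 and followed from 1972 through 1987 contributes time to the 20-25, 25-30, 30-35, and 35-40 age groups, and the contributions sum to the total follow-up time.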
Specify the variables that contain the entry and exit dates for each
person.
ENTRY entyr entmo entdy @
EXIT exityr exitmo exitdy @
Define the summary variables to be computed for each cell. These include
event counts and risk set means.
FCOUNT as frisk @
EVENT AHP @ EVENT HPD @ EVENT HPM @ EVENT LHM @ EVENT LHX @
EVENT LYM @ EVENT NHG @ EVENT NHN @ EVENT NHX @ EVENT MMY @
EVENT LEM @ EVENT LEU @ EVENT LYL @ EVENT ALL @ EVENT CLL @
EVENT MLD @ EVENT MML @ EVENT AMD @ EVENT AML @ EVENT CML @
EVENT OUL @ EVENT BFO @ EVENT APA @ EVENT MDS @ EVENT AGC @
MEAN cum_exp ; MEAN age ;@
Specify the variables to be written to a reject file. Any subject who
does not contribute person-years to the table is written to this
file.
REJECT id entyr exityr exptype city sex;
TO nowork.rej @
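One plausible reason for a subject to contribute no person-years is that the adjusted follow-up interval does not overlap the calendar window. The Python sketch below illustrates that check under stated assumptions: the window is taken to be 1 January 1972 through 31 December 1987 from the CALENDAR statement, other possible reasons for rejection are ignored, and the names `days_at_risk` and `split_rejects` are hypothetical.

```python
from datetime import date

WINDOW_START = date(1972, 1, 1)    # assumed from the CALENDAR statement
WINDOW_END = date(1987, 12, 31)

def days_at_risk(entry, exit_):
    """Days of follow-up that fall inside the calendar window (never negative)."""
    start = max(entry, WINDOW_START)
    stop = min(exit_, WINDOW_END)
    return max((stop - start).days, 0)

def split_rejects(subjects):
    """Separate subjects with some time at risk from those with none."""
    kept, rejected = [], []
    for subject in subjects:
        at_risk = days_at_risk(subject["entry"], subject["exit"])
        (kept if at_risk > 0 else rejected).append(subject)
    return kept, rejected
```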
Indicate that the data are to be read and the table created.
INPUT benz2.dat @
A table description is printed. Only part of this description is shown here. Following the creation of the table you can compute descriptive statistics, define additional summary variables, and save the data.
Input from benz2.dat
DATAB Version 2.0 Jul 1996 July 18, 1996 0:01:53
Description of table:
Variable            Category            Lower          Upper
Number  Name        Number  Name        Bound          Bound
  1     CITY
                      1     1         [ 1.000          1.000      ]
                      2     2         [ 2.000          2.000      ]
                      3     3         [ 3.000          3.000      ]
                      4     4         [ 4.000          4.000      ]
                      5     5         [ 5.000          5.000      ]
  2     CUMEXPG
                      1     NEVER     [ 0.0000E+00     0.0000E+00 ]
                      2     0-10      [ 0.0000E+00     10.00      )
                      3     10-20     [ 10.00          20.00      )
                      4     20-30     [ 20.00          30.00      )
                      5     30-40     [ 30.00          40.00      )
                      6     40-50     [ 40.00          50.00      )
<< Part of table description omitted here >>
Summary Variables
1) %CELLNO 2) AT_RISK 3) PYR 4) FRISK 5) AHP
6) HPD 7) HPM 8) LHM 9) LHX 10) LYM
11) NHG 12) NHN 13) NHX 14) MMY 15) LEM
16) LEU 17) LYL 18) ALL 19) CLL 20) MLD
21) MML 22) AMD 23) AML 24) CML 25) OUL
26) BFO 27) APA 28) MDS 29) AGC 30) CUM_EXP
The potential number of cells is 4400
Workspace for 4400 cells with 110 summary variables
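The reported cell count appears consistent with the product of the number of categories in each table dimension. The short Python check below uses counts read from the commands in this example; treating calendar time as a single category (1972 through 1987) is an assumption.

```python
from math import prod

# Category counts per table dimension, read from the commands in this example.
category_counts = {
    "city": 5,       # LEVELS city 1:5
    "cumexpg": 20,   # "never" plus 19 cumulative exposure intervals
    "exptype": 2,    # unexposed, exposed
    "sex": 2,        # male, female
    "ageg": 11,      # 12 age cut points define 11 intervals
    "caltm": 1,      # assumption: one calendar period
}

potential_cells = prod(category_counts.values())
print(potential_cells)  # 4400
```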