DATAB: Creating an Event-Time Table with Time-dependent Variables


Goal of the analysis: To create a table of person-years for an occupational cohort study that includes information on time-dependent cumulative exposures.

Data: The data are from a study of cancer incidence in a cohort of Chinese workers with occupational exposures to benzene. There are up to 7 annual exposure measurements and a relatively complex set of criteria for determining eligibility for follow-up.

Describe the input variables and format. These data cannot be read as free format, so a FORMAT statement gives the column layout.

NAMES   exptype city sex inc_stat inc_icd cause4 
        meas1 meas2 meas3 meas4 meas5 meas6 meas7
        byr bmo bdy             entyr entmo entdy
        exp1yr exp1mo exp1dy       dxyr dxmo dxdy
        wu77yr wu77mo wu77dy  exityr exitmo exitdy
        id  @ 
FORMAT  '(f1.0,f2.0,f1.0,f1.0,f3.0,t6,f3.1,t16,7f7.2,9f2.0,t113,3f2.0,
        t131,6f2.0,f5.0)' @

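To make the column layout concrete, the following Python sketch expands the FORMAT string into column positions. It is an illustration only: it assumes DATAB follows Fortran-style edit descriptors (tN moves to column N; rfw.d reads r fields of width w with d implied decimal places), and the helper name expand_format is hypothetical.

```python
import re

NAMES = """exptype city sex inc_stat inc_icd cause4
meas1 meas2 meas3 meas4 meas5 meas6 meas7
byr bmo bdy entyr entmo entdy
exp1yr exp1mo exp1dy dxyr dxmo dxdy
wu77yr wu77mo wu77dy exityr exitmo exitdy
id""".split()

FMT = ("(f1.0,f2.0,f1.0,f1.0,f3.0,t6,f3.1,t16,7f7.2,9f2.0,t113,3f2.0,"
       "t131,6f2.0,f5.0)")

def expand_format(fmt):
    """Return (start_col, width, decimals) per field; columns are 1-based."""
    fields, col = [], 1
    for token in fmt.strip("()").split(","):
        token = token.strip().lower()
        tab = re.fullmatch(r"t(\d+)", token)
        if tab:                      # tN: jump to column N
            col = int(tab.group(1))
            continue
        num = re.fullmatch(r"(\d*)f(\d+)\.(\d+)", token)
        if not num:
            raise ValueError(f"unsupported edit descriptor: {token}")
        repeat = int(num.group(1) or 1)
        width, decimals = int(num.group(2)), int(num.group(3))
        for _ in range(repeat):      # rfw.d: r consecutive fields
            fields.append((col, width, decimals))
            col += width
    return fields

layout = dict(zip(NAMES, expand_format(FMT)))
for name in ("inc_icd", "cause4", "meas1", "id"):
    print(name, layout[name])
```

Under that reading, inc_icd and cause4 are both taken from columns 6-8 (the t6 descriptor rewinds to column 6), with cause4 read with one implied decimal place.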
Use constant transformations to define values related to the exposure lag. 
CONSTANT       #yrlag = 3;      ! Years of lag
      #lag = #yrlag * 365.25 @  ! Turn lag into number of days

Enter a series of transformations used to define the outcome variables. (Additional transformations of this type are not shown here.)
TRAN
! define case by checking against current cause code
IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 )) or
(( inc_icd >= 280 ) and ( inc_icd <= 289 ))
) THEN AHP = 1;
ELSE AHP = 0;
ENDIF;

IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 )) or
( inc_icd == 287 )
) THEN HPD = 1;
ELSE HPD = 0;
ENDIF;

IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 219 ))
) THEN HPM = 1;
ELSE HPM = 0;
ENDIF;

IF inc_stat == 1 and (
(( inc_icd >= 200 ) and ( inc_icd <= 204 )) or
( inc_icd == 210 ) or ( inc_icd == 214 )
) THEN LHM = 1;
ELSE LHM = 0;
ENDIF;

IF inc_stat == 1 and (
( inc_icd == 200 ) or
(( inc_icd >= 202 ) and ( inc_icd <= 204 )) or
( inc_icd == 210 ) or ( inc_icd == 214 )
) THEN LHX = 1;
ELSE LHX = 0;
ENDIF @
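
The case definitions above can be restated compactly for checking individual codes by hand. The Python sketch below mirrors the five IF blocks; the function name and the range table are illustrative and are not part of DATAB.

```python
# Inclusive inc_icd ranges for each outcome, copied from the IF blocks above.
OUTCOME_RANGES = {
    "AHP": [(200, 219), (280, 289)],
    "HPD": [(200, 219), (287, 287)],
    "HPM": [(200, 219)],
    "LHM": [(200, 204), (210, 210), (214, 214)],
    "LHX": [(200, 200), (202, 204), (210, 210), (214, 214)],
}

def outcome_flags(inc_stat, inc_icd):
    """Return a 0/1 indicator per outcome for one subject."""
    is_case = inc_stat == 1
    return {
        name: int(is_case and any(lo <= inc_icd <= hi for lo, hi in ranges))
        for name, ranges in OUTCOME_RANGES.items()
    }

print(outcome_flags(1, 287))  # incident case with code 287
print(outcome_flags(1, 201))  # incident case with code 201
print(outcome_flags(0, 204))  # not an incident case
```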
Enter additional transformations to define the various dates used in constructing the table.

! Add 1900 to years of birth, followup, endpoint date 
TRAN  byr = byr + 1900; dxyr = dxyr + 1900;
  entyr = entyr + 1900; exityr = exityr + 1900;
  exp1yr = exp1yr + 1900; wu77yr = wu77yr + 1900; 
! Calculate Julian days for entry and exposure dates 
  entj=julian(entyr,entmo,entdy); 
  expj =julian(exp1yr,exp1mo,exp1dy); 
! Date of cut points for exposure periods 
  cut1=julian(1949,1,1); cut2=julian(1960,1,1); 
  cut3=julian(1965,1,1); cut4=julian(1970,1,1);
  cut5=julian(1975,1,1); cut6=julian(1980,1,1);
  cut7=julian(1985,1,1); cut8=julian(1988,1,1); 
! Reassign entry date to exposure date (omits pre-exposure person-years)
IF (expj>entj) and (exptype==1)
THEN entyr = exp1yr; entmo = exp1mo;
   entdy = exp1dy; ENDIF;
! If incidence case, reassign exit to diagnosis; 
IF (inc_stat==1) THEN 
  exityr = dxyr; exitmo = dxmo; exitdy = dxdy; 
ENDIF; 
! Calculate Julian days at exit and at entry to work unit 77
 exitj =julian(exityr,exitmo,exitdy);
 wu77j =julian(wu77yr,wu77mo,wu77dy);
! Reassign exit date to work unit 77 date
! (omits exposed time at an unknown exposure level)
IF (wu77j<exitj) and (exptype==1) THEN 
  exityr = wu77yr; exitmo = wu77mo; exitdy = wu77dy;
  inc_stat = 0; ENDIF; @
! End of TRAN statement
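
The net effect of these date rules on one subject's follow-up window can be traced with a short Python sketch. It is a simplified restatement, not DATAB code: the function name and arguments are hypothetical, all dates are assumed to be valid, and comparing datetime.date objects stands in for comparing the values returned by julian().

```python
from datetime import date

def follow_up_window(exptype, inc_stat, entry, first_exp, dx, wu77, exit_):
    """Return (entry, exit, inc_stat) after the adjustments in the TRAN block."""
    exposed = exptype == 1
    # Start follow-up at first exposure when that is later than entry.
    if exposed and first_exp > entry:
        entry = first_exp
    # Incident cases leave follow-up at diagnosis.
    if inc_stat == 1:
        exit_ = dx
    # Censor at the move to work unit 77 and drop the case status.
    if exposed and wu77 < exit_:
        exit_ = wu77
        inc_stat = 0
    return entry, exit_, inc_stat

# Exposed case who moved to work unit 77 before diagnosis.
print(follow_up_window(1, 1, date(1970, 1, 1), date(1973, 6, 1),
                       date(1985, 3, 1), date(1980, 1, 1), date(1987, 12, 31)))
```

In this example the subject enters at first exposure, exits on the work unit 77 date, and is no longer counted as a case, which matches the order of the IF statements above.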

Define arrays to hold the measurement interval cutpoint and exposure value variables.

ARRAY \cutp cut1 - cut8 @
ARRAY \twe meas1-meas7 @

Designate city as a categorical variable and indicate that a table is to be created. Specify the resolution for the time-scale evaluation to be 1 year. (The default is to determine this from the category definitions.)

LEVELS city 1:5 @
TABULATE over city @
RESOLUTION 1 @

Define cumulative exposure categories. Use TTRAN to specify the time-dependent transformations that define the cumulative exposure variable.

TDEP cumexpg
  0    0 ] "never" 
  0   10    "0-10" 
 10   20   "10-20"
 20   30   "20-30" 
 30   40   "30-40" 
 40   50   "40-50" 
 50   60   "50-60" 
 60   70   "60-70" 
 70   80   "70-80" 
 80   90   "80-90" 
 90  100  "90-100" 
100  120 "100-120"
120  140 "120-140" 
140  160 "140-160" 
160  180 "160-180"
180  200 "180-200" 
200  250 "200-250" 
250  500 "250-500"
500 1000 "500-1000"
1000 999999 ">1000" 
TTRAN cumexpg=0; 
   #i = 1; 
 DOWHILE ((#i <= 7) and (caltm >= expj));
    en_i = MAX(\cutp(#i),expj);
    ex_i=MIN(\cutp(#i+1),(caltm-#lag));
    IF (ex_i <= en_i) THEN cum_i = 0;
       ELSE cum_i = ((ex_i-en_i)* \twe(#i))/365.25;
    ENDIF;
    cumexpg = cumexpg + cum_i; 
    #i = #i + 1;
  ENDDO;
cum_exp = cumexpg; @
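
The TTRAN loop can be followed step by step in the Python sketch below. It is a minimal restatement under stated assumptions: caltm is taken to be in the same day units as the values from julian(), ordinal day numbers from date.toordinal() stand in for those values (only differences between days matter here), and the function name is hypothetical.

```python
from datetime import date

# Period cutpoints cut1..cut8 as ordinal day numbers (stand-in for julian()).
CUTS = [date(y, 1, 1).toordinal()
        for y in (1949, 1960, 1965, 1970, 1975, 1980, 1985, 1988)]
LAG_DAYS = 3 * 365.25  # #yrlag converted to days, as in the CONSTANT statement

def cumulative_exposure(caltm, expj, twe, cuts=CUTS, lag=LAG_DAYS):
    """Lagged cumulative exposure at day caltm, in exposure-level x years."""
    total = 0.0
    if caltm < expj:          # not yet exposed: the DOWHILE body never runs
        return total
    for i in range(7):        # one pass per measurement period
        en_i = max(cuts[i], expj)             # start of exposure in this period
        ex_i = min(cuts[i + 1], caltm - lag)  # lagged end of exposure in this period
        if ex_i > en_i:
            total += (ex_i - en_i) * twe[i] / 365.25
    return total

first_exposure = date(1975, 1, 1).toordinal()
at_1983 = date(1983, 1, 1).toordinal()
print(cumulative_exposure(at_1983, first_exposure, [10.0] * 7))
```

With a constant level of 10 units from 1 January 1975, the value at 1 January 1983 is about 50, because the 3-year lag leaves 5 years of counted exposure.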

Continue with the definition of additional category variables. Exposure type and sex do not depend on time, while age and calendar time are time scales.

CATEGORY exptype
 0 1 unexposd 
 1 1 exposed @ 
CATEGORY sex 
1 2 male 
2 2 female @ 

! Time from the date of birth
TIME age FROM byr MONTH bmo DAY bdy AS ageg
0 / 15 / 20 / 25 / 30 / 35 / 40 / 45 / 50 / 55 / 60 / 99 @
CALENDAR caltm TO 1987 12 31
1972 1 1 @
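
To illustrate what a time scale with cutpoints does, the Python sketch below splits one subject's follow-up into person-years by age group. It is a simplified illustration, not DATAB's algorithm: it uses 365.25 days per year, ignores the calendar window and the other category variables, and the function name is hypothetical.

```python
from datetime import date

AGE_CUTS = [0, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 99]  # from the TIME statement
DAYS_PER_YEAR = 365.25

def person_years_by_age(birth, entry, exit_, cuts=AGE_CUTS):
    """Return {age-group label: person-years} for follow-up from entry to exit_."""
    age_in = (entry - birth).days / DAYS_PER_YEAR
    age_out = (exit_ - birth).days / DAYS_PER_YEAR
    pyr = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # Time spent at risk while aged between lo and hi.
        overlap = min(age_out, hi) - max(age_in, lo)
        if overlap > 0:
            pyr[f"{lo}-{hi}"] = overlap
    return pyr

example = person_years_by_age(date(1950, 1, 1), date(1972, 1, 1), date(1987, 12, 31))
for group, years in example.items():
    print(group, round(years, 3))
```

Each subject therefore contributes to several cells of the table, one for each age group passed through during follow-up.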

Specify the variables that contain the entry and exit dates for each person.
ENTRY entyr entmo entdy @
EXIT exityr exitmo exitdy @

Define the summary variables to be computed for each cell. These include event counts and risk set means.
FCOUNT as frisk @
EVENT AHP @ EVENT HPD @ EVENT HPM @ EVENT LHM @ EVENT LHX @
EVENT LYM @ EVENT NHG @ EVENT NHN @ EVENT NHX @ EVENT MMY @
EVENT LEM @ EVENT LEU @ EVENT LYL @ EVENT ALL @ EVENT CLL @
EVENT MLD @ EVENT MML @ EVENT AMD @ EVENT AML @ EVENT CML @
EVENT OUL @ EVENT BFO @ EVENT APA @ EVENT MDS @ EVENT AGC @
MEAN cum_exp ; MEAN age ;@

Specify the variables to be written to a reject file. Any subject who contributes no person-years to the table is written to this file.
REJECT id entyr exityr exptype city sex; TO nowork.rej @

Indicate that the data are to be read and the table created.
INPUT benz2.dat @

DATAB prints a description of the table; only part of it is shown here. Once the table has been created, you can compute descriptive statistics, define additional summary variables, and save the data.

Input from benz2.dat 
            DATAB Version 2.0 Jul 1996                             July 18, 1996 0:01:53 
Description of table: 
   Variable      Category       Lower          Upper
Number  Name   Number  Name     Bound          Bound
   1    CITY
                  1    1      [ 1.000          1.000 ]
                  2    2      [ 2.000          2.000 ]
                  3    3      [ 3.000          3.000 ]
                  4    4      [ 4.000          4.000 ]
                  5    5      [ 5.000          5.000 ]
   2    CUMEXPG
                  1    NEVER  [ 0.0000E+00     0.0000E+00 ]
                  2    0-10   [ 0.0000E+00     10.00 )
                  3    10-20  [ 10.00          20.00 )
                  4    20-30  [ 20.00          30.00 )
                  5    30-40  [ 30.00          40.00 )
                  6    40-50  [ 40.00          50.00 )
<< Part of table description omitted here >> 
    Summary Variables 
     1) %CELLNO  2) AT_RISK   3) PYR    4) FRISK    5) AHP 
     6) HPD      7) HPM       8) LHM    9) LHX     10) LYM 
    11) NHG     12) NHN      13) NHX   14) MMY     15) LEM 
    16) LEU     17) LYL      18) ALL   19) CLL     20) MLD 
    21) MML     22) AMD      23) AML   24) CML     25) OUL
    26) BFO     27) APA      28) MDS   29) AGC     30) CUM_EXP 
     The potential number of cells is 4400 
Workspace for 4400 cells with 110 summary variables 
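
The reported cell count is consistent with the product of the number of levels of each category variable, assuming the single calendar period counts as one level. This check is an inference from the definitions above, not something stated in the DATAB output:

```python
from math import prod

# Number of levels per category variable, counted from the definitions above.
levels = {
    "city": 5,      # LEVELS city 1:5
    "cumexpg": 20,  # 20 rows in the TDEP definition
    "exptype": 2,
    "sex": 2,
    "ageg": 11,     # 12 age cutpoints give 11 intervals
    "caltm": 1,     # assumption: one calendar period
}
potential_cells = prod(levels.values())
print(potential_cells)  # 4400
```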
