PECAN: Matched Case-Control Data with 1:1 Matching

Input commands are in blue.
Statements generated by PECAN are in green.

The data is from the Los Angeles Endometrial Cancer Study (Breslow and Day, vol I), a matched case-control study of the effect of exogenous estrogens on the risk of endometrial cancer among women in a retirement community. The example makes use of 63 case-control pairs consisting of the case and first control from each matched set in the data set.

Goal of this analysis:

To reproduce some of the analyses described in Section 7.3 of Breslow and Day (1980). The data is in free format in the file leimod.dat. It contains the following variables: setno - set number; cases - case/control indicator; gall - gall bladder disease; est - estrogen; cdose - dose of estrogen; dur - duration in months.
PECAN Version 2.0 July 1996
NAMES setno cases gall hyp ob est dose dur non dura age cdose @
INPUT leimod.dat @
Input frex\leimod.dat
315 records read 315 records used
0 records rejected
Workspace for 300 variables. 14 are currently defined.
Up to 286 new variables can be created
63 useable case/control sets with 315 records

The MISS command recodes the value 9 of ob and cdose to the EPICURE missing value code. Cases and cdose are declared as categorical variables. A new three-level categorical variable, ageg, is created from age.
MISS ob 9; cdose 9 @
LEVELS cases cdose @
TRAN ageg = (age >= 65) + (age >= 75) @
LEVELS ageg @
CASES has 2 levels from 0 to 1
CDOSE has 4 levels from 0 to 3
AGEG has 3 levels from 0 to 2
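
The TRAN expression works because each comparison evaluates to 0 or 1, so the sum counts how many cut points an age has reached. A minimal Python sketch of the same recoding, assuming the cut points 65 and 75 from the command above (the function name age_group is hypothetical and not part of PECAN):

```python
def age_group(age):
    """Return 0 for age < 65, 1 for 65 <= age < 75, 2 for age >= 75."""
    # Each comparison is True/False; int() maps it to 1/0, mirroring
    # the arithmetic in TRAN ageg = (age >= 65) + (age >= 75).
    return int(age >= 65) + int(age >= 75)


# Illustrative ages only; these values are not taken from leimod.dat.
ages = [60, 65, 74, 75, 90]
print([age_group(a) for a in ages])
```

The three possible results correspond to the three levels that PECAN reports for AGEG.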

Compute summary statistics.
MEAN est ; by cases @
MEAN dur; by ageg@
FREQ cases cdose@
Summary for EST
CASES Mean Count Std. Dev.
0 0.503968 252 0.50098
1 0.888889 63 0.31679
Summary for DUR
AGEG Mean Count Std. Dev.
0 2.16667 60 2.7257
1 1.68208 173 2.1987
2 1.13415 82 1.9923
TABULATE FREQUENCY:
Entries: count
% of total
% of row
% of col
        |          CDOSE           |
  CASES | 0     1     2     3     | Total
--------+--------------------------------+--------
      0 | 143   44    42    19    | 248
        | 46.58 14.33 13.68  6.19 | 80.78
        | 57.66 17.74 16.94  7.66 |
        | 92.26 73.33 73.68 54.29 |
      1 | 12    16    15    16    | 59
        |  3.91  5.21  4.89  5.21 | 19.22
        | 20.34 27.12 25.42 27.12 |
        |  7.74 26.67 26.32 45.71 |
--------+--------------------------------+--------
Total   | 155   60    57    35    | 307
        | 50.49 19.54 18.57 11.40 |
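
The three percentages under each count can be checked by hand. The sketch below is a rough Python illustration, not PECAN code: it takes the counts printed in the table and recomputes the percent of total, of row, and of column for any cell. The name cell_percentages is hypothetical.

```python
# Counts copied from the table above: rows are CASES (0 = control, 1 = case),
# columns are CDOSE levels 0..3.
counts = {
    0: [143, 44, 42, 19],
    1: [12, 16, 15, 16],
}

total = sum(sum(row) for row in counts.values())
col_totals = [sum(row[j] for row in counts.values()) for j in range(4)]


def cell_percentages(case, dose):
    """Return (% of total, % of row, % of column) for one cell."""
    n = counts[case][dose]
    row_total = sum(counts[case])
    return (100.0 * n / total,
            100.0 * n / row_total,
            100.0 * n / col_totals[dose])


print(total, col_totals)
print([round(p, 2) for p in cell_percentages(0, 0)])
```

The grand total of 307 is smaller than the 315 records read, which is consistent with records whose cdose was recoded to missing being left out of the table.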

Fit a conditional logistic regression model for the odds ratio. Begin with a model with no covariates. Define this as the initial null model:
FIT @
NULL @
Conditional-logistic regression (1:M matching)
Product additive excess model {T0*(1 + T1 + T2 + ...)}
CASES is used for cases
SETNO is used to define sets
Deviance = 202.789 Free parameters = 0
Number of risk sets = 63
NULL @
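
The null deviance can be reproduced by hand. With no covariates every member of a matched set has the same relative risk, so the conditional probability that the observed case is the case is 1/n for a set of n records, and the deviance (taken here as minus twice the log conditional likelihood) is twice the sum of log(n) over sets. The Python sketch below assumes that the 315 records are spread evenly over the 63 sets, five per set; that split is inferred from the record counts above rather than stated by PECAN.

```python
import math


def null_deviance(set_sizes):
    """-2 * log conditional likelihood when all relative risks equal 1."""
    # Each set contributes log(1/n), so -2 log L = 2 * sum(log n).
    return 2.0 * sum(math.log(n) for n in set_sizes)


# Assumption: 63 sets of 5 records each (315 / 63).
set_sizes = [5] * 63
print(round(null_deviance(set_sizes), 3))  # 202.789
```

The result agrees with the deviance that PECAN prints for the model with no covariates.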

Fit a multiplicative model with estrogen exposure status and gall bladder disease effects. The odds ratio model can be written as:

OR = exp(b1*gall + b2*est)

where b1 and b2 are the parameters of the log-linear term.

FIT gall est @
FIT est gall @ LRT
Iter Step Deviance
0 0 202.7892
1 0 160.0688
2 0 157.7848
3 0 157.7426
4 0 157.7426

Conditional-logistic regression (1:M matching)
Product additive excess model {T0*(1 + T1 + T2 + ...)}
CASES is used for cases
SETNO is used to define sets
Parameter Summary Table

#   Name                   Estimate      Std.Err.  Test Stat. P value

-- ----------------------- ------------ ---------- ----------- -------
    Log-linear term 0
1   GALL .................... 1.275     0.4109      3.102       0.002
2   EST ..................... 2.115     0.4398      4.809     < 0.001
Records used = 315
Deviance = 157.743 Free parameters = 2
Number of risk sets = 63
Non-informative risk sets = 4
LR statistic = 45.05 df = 2
P = 0.0000
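
The test statistics in this output follow from the printed numbers. The Python sketch below recomputes the likelihood ratio statistic as the drop in deviance from the null model and the Wald statistics as estimate divided by standard error. The helper chi2_sf is hypothetical and only handles 1 or 2 degrees of freedom, where the chi-square tail has a simple closed form.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


# Deviances copied from the iteration log above.
null_dev, model_dev = 202.7892, 157.7426
lr = null_dev - model_dev
p_value = chi2_sf(lr, 2)

# Wald statistics: estimate / standard error, from the parameter table.
params = {"GALL": (1.275, 0.4109), "EST": (2.115, 0.4398)}
wald = {name: est / se for name, (est, se) in params.items()}

print(round(lr, 2), p_value)
print({name: round(z, 3) for name, z in wald.items()})
```

Because the estimates and standard errors are printed to four significant digits, the recomputed Wald statistics can differ from the output in the last digit. The p-value is far below 0.0001, which is why it is displayed as 0.0000.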

Compute the Wald and likelihood-based confidence bounds:
CI@
BOUND 1 @
BOUND 2 @
95% Confidence Bounds
#    Name                      Estimate   Std. Error   Lower     Upper
-- ------------------------- ----------- ----------- ----------- ---------
Log-linear term 0
1    GALL .................... 1.275       0.4109      0.4694      2.080
EXP(estimate)                  3.577       1.508       1.599       8.004
2    EST ..................... 2.115       0.4398      1.253       2.977
EXP(estimate)                  8.288       1.552       3.500      19.62
BOUND 1 @
Likelihood bound for parameter 1 GALL
MLE 1.275 exp(MLE) 3.577
97.50% lower bound 0.47599
         exp(bound) 1.6096
97.50% upper bound 2.1027
         exp(bound) 8.1886
BOUND 2 @
Likelihood bound for parameter 2 EST
MLE 2.115 exp(MLE) 8.288
97.50% lower bound 1.3158
         exp(bound) 3.7279
97.50% upper bound 3.0617
         exp(bound) 21.365
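
The Wald bounds in the first table are symmetric on the log scale: estimate plus or minus a normal quantile times the standard error, then exponentiated to give bounds for the odds ratio. A rough Python sketch, assuming the usual 1.96 quantile for 95% bounds (the function name wald_bounds is hypothetical):

```python
import math


def wald_bounds(estimate, std_err, z=1.96):
    """Return (lower, upper, exp(lower), exp(upper)) for a Wald interval."""
    lower = estimate - z * std_err
    upper = estimate + z * std_err
    return lower, upper, math.exp(lower), math.exp(upper)


# Estimates and standard errors copied from the output above.
for name, est, se in [("GALL", 1.275, 0.4109), ("EST", 2.115, 0.4398)]:
    lo, hi, or_lo, or_hi = wald_bounds(est, se)
    print(name, round(lo, 4), round(hi, 3), round(or_lo, 3), round(or_hi, 2))
```

The likelihood-based bounds printed by BOUND are not symmetric about the estimate and cannot be recovered from the estimate and standard error alone, so this sketch does not attempt to reproduce them.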

Define the last model fit as the null model. Add a gallbladder disease history by estrogen usage interaction. Fit this model and compute the likelihood ratio test for the interaction term.
NULL @
FIT + gall*est @ LRT
NULL @
FIT + gall*est @ LRT
Iter Step Deviance
  0   0    157.7426
  1   0    153.7168
  2   0    153.4613
  3   0    153.4612
Conditional-logistic regression (1:M matching)
Product additive excess model {T0*(1 + T1 + T2 + ...)}
CASES is used for cases
SETNO is used to define sets
Parameter Summary Table
#     Name                   Estimate     Std.Err.  Test Stat. P value
-- ----------------------- ------------ ---------- ----------- -------
Log-linear term 0
1    GALL .................... 2.894     0.8831       3.278     0.001
2    EST ..................... 2.700     0.6118       4.414   < 0.001
3    GALL * EST .............. -2.053    0.9950      -2.063     0.039
Records used = 315
Deviance = 153.461 Free parameters = 3
Number of risk sets = 63
Non-informative risk sets = 4
LR statistic = 4.281 df = 1
P = 0.0385
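
To see what the negative interaction estimate means on the odds ratio scale, the fitted log-linear model can be evaluated for each combination of exposures. The Python sketch below plugs in the estimates from the table above; the function name odds_ratio is hypothetical.

```python
import math

# Estimates copied from the parameter summary table above.
beta = {"gall": 2.894, "est": 2.700, "gall*est": -2.053}


def odds_ratio(gall, est):
    """Odds ratio relative to gall = 0, est = 0 under the log-linear model."""
    linear_predictor = (beta["gall"] * gall
                        + beta["est"] * est
                        + beta["gall*est"] * gall * est)
    return math.exp(linear_predictor)


for gall in (0, 1):
    for est in (0, 1):
        print(gall, est, round(odds_ratio(gall, est), 2))
```

Under these estimates the odds ratio for women with both exposures is far smaller than the product of the two single-exposure odds ratios, which is what the negative GALL * EST term expresses.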

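Clear the current model and then specify an alternative form for the odds ratio. The new model is

OR = 1 + b1*est + b2*gall

where b1 and b2 are the parameters of linear term 1, so each exposure adds to the excess odds ratio rather than multiplying it.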

NOMODEL @
LINE 1 est gall @
FIT @
Model has been reset
LINE 1 est gall @
FIT @
Iter Step Deviance
  0   0    202.7892
  1   0    170.3245
  2   0    158.3328
  3   0    154.4549
  4   0    153.5711
  5   0    153.4809
  6   0    153.4792
  7   0    153.4792
Conditional-logistic regression (1:M matching)
Product additive excess model {T0*(1 + T1 + T2 + ...)}
CASES is used for cases
SETNO is used to define sets
Parameter Summary Table
#    Name                    Estimate    Std.Err.   Test Stat. P value
-- ----------------------- ------------ ---------- ----------- -------
Linear term 1
1    EST ..................... 13.95     9.139       1.526      0.127
2    GALL .................... 18.23    14.43        1.263      0.207
Records used = 315
Deviance = 153.479 Free parameters = 2
Number of risk sets = 63
Non-informative risk sets = 4
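
In the linear form the parameters are excess odds ratios, so the fitted odds ratio is one plus the sum of the excesses for the exposures present. A minimal Python sketch using the estimates printed above (the function name excess_odds_ratio is hypothetical):

```python
def excess_odds_ratio(est, gall, b_est=13.95, b_gall=18.23):
    """Odds ratio under the additive excess model 1 + b_est*est + b_gall*gall."""
    return 1.0 + b_est * est + b_gall * gall


for est in (0, 1):
    for gall in (0, 1):
        print(est, gall, round(excess_odds_ratio(est, gall), 2))
```

With two parameters this model reaches a deviance of 153.479, very close to the 153.461 obtained by the three-parameter log-linear model with interaction, which suggests that the joint effect of the two exposures is nearer to additive than to multiplicative on the odds ratio scale.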

Again consider a model with a gall bladder disease by estrogen usage status interaction. Compute the likelihood ratio statistic for this model relative to the last model fit. Also include a hypertension effect with its parameter fixed at 0. This leads to a single degree of freedom score test for this effect. The statistic, given in the Test Stat. column, is the signed square root of the usual chi-square statistic and can be interpreted as a normal deviate.
NULL @
LINE 1 + gall*est hyp=0 @
FIT @ LRT
Iter Step Deviance
  0   0    153.4792
  1   0    153.4612
  2   0    153.4612
Conditional-logistic regression (1:M matching)
Product additive excess model {T0*(1 + T1 + T2 + ...)}
CASES is used for cases
SETNO is used to define sets
Parameter Summary Table
#       Name                 Estimate     Std.Err. Test Stat.   P value
-- ----------------------- ------------ ---------- ----------- -------
Linear term 1
1   EST ..................... 13.88       9.104       1.525     0.127
2   GALL .................... 17.07      15.96        1.070     0.285
3   GALL * EST .............. 2.573      19.17        0.1342    > 0.5
4   HYP ..................... 0          Fixed        0.6390    > 0.5
Records used = 315
Deviance = 153.461 Free parameters = 3
Number of risk sets = 63
LR statistic = 0.1810E-01 df = 1
P = 0.8930
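
Since the score statistic for HYP is a signed square root, squaring it gives the usual chi-square statistic on one degree of freedom, and its two-sided normal tail gives the p-value. The Python sketch below applies this to the printed value, and applies the same one-degree-of-freedom tail to the likelihood ratio statistic for the interaction. The function name two_sided_p is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided tail probability of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Score statistic for HYP, copied from the Test Stat. column above.
score_z = 0.6390
score_chi_square = score_z ** 2
p_score = two_sided_p(score_z)

# Likelihood ratio statistic: drop in deviance from the previous model.
lr = 153.4792 - 153.4612
# A chi-square tail on 1 df equals the two-sided normal tail of its square root.
p_lr = two_sided_p(math.sqrt(lr))

print(round(score_chi_square, 3), round(p_score, 3))
print(round(lr, 4), round(p_lr, 3))
```

The deviances are printed to four decimal places, so the recomputed likelihood ratio statistic and p-value match the output only to about three digits.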
