C DIS-IV
Psychometric Properties

Psychometric properties of the DIS and related instruments have been studied extensively – including test-retest reliability studies, test-comparison studies, longitudinal studies and factor analytic studies (e.g., Hasin & Grant, 1987a, 1987b; Helzer et al., 1985; Hesselbrock, Stabenau, Hesselbrock, Mirkin, & Meyer, 1982; Robins, Helzer, Croughan, & Ratcliff, 1981; Rogler, Malgady, & Tryon, 1992; Semler et al., 1987; Vandiver & Sher, 1991; Wittchen et al., 1989).

The current version of the DIS was tested for reliability and validity in a study among substance abusers (Dascalu, Compton, Horton, & Cottler, 2001; Horton, Compton, & Cottler, 1998). The sample for this study was recruited from current and previous patients of substance abuse and psychiatric treatment sites to provide a broad range of diagnoses with varying severity. Trained nonclinician interviewers administered the DIS-IV in a blinded manner at test and retest, and reliability of lifetime disorders was measured by the kappa statistic (Bishop, Fienberg, & Holland, 1975; Cohen, 1960) among the 165 subjects. Results are shown in Tables 1 and 2 and demonstrate that substance abuse and dependence disorders had fair to excellent reliability (kappa .53 to .86); suicidal ideation and attempts had excellent reliability (kappa .76 and .80, respectively); and depression, mania, PTSD, panic disorder, phobic disorder, obsessive-compulsive disorder, antisocial personality, conduct disorder, and oppositional defiant disorder had fair to good reliability (kappa .40 to .67). Disorders with poor reliability were generalized anxiety disorder (kappa .33), attention deficit disorder (kappa .33), and specific phobia (kappa .25). For attention deficit and generalized anxiety, the symptoms had a higher reliability than the full disorder. This indicates that the symptom clusters have adequate reliability but the age of onset and impairment criteria are less reliable. These results are consistent with the literature on reliability of psychiatric disorders among drug abusers, and based on these results, we conclude that DIS-IV psychiatric disorders, except for specific phobia, have adequate reliability among substance users. Because most psychiatric disorders are less reliable among substance abusers than among nonsubstance abusers (Bryant, Rounsaville, Spitzer, & Williams, 1992), these tests of reliability may show the lower limit of reliability compared to non-substance-using populations.
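
The kappa statistic referred to above corrects the raw test-retest agreement for the agreement expected by chance. The following is a minimal sketch of that calculation for a single yes/no diagnosis, not the study's analysis code. The function name and the example counts are hypothetical, and the confidence interval uses Cohen's (1960) simple large-sample standard error, which is an assumption and may differ from the method used to produce the intervals in Tables 1 and 2.

```python
import math


def cohen_kappa_2x2(both_pos, test_only, retest_only, both_neg):
    """Return (kappa, (ci_low, ci_high)) for a 2x2 test-retest table.

    both_pos    : diagnosis present at test and at retest
    test_only   : present at test, absent at retest
    retest_only : absent at test, present at retest
    both_neg    : absent at both interviews
    """
    n = both_pos + test_only + retest_only + both_neg
    if n == 0:
        raise ValueError("the table is empty")

    # Observed proportion of agreement.
    p_o = (both_pos + both_neg) / n

    # Agreement expected by chance, from the marginal rates.
    p_test = (both_pos + test_only) / n
    p_retest = (both_pos + retest_only) / n
    p_e = p_test * p_retest + (1 - p_test) * (1 - p_retest)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (p_o - p_e) / (1 - p_e)

    # Approximate large-sample standard error (assumption: Cohen, 1960).
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    ci = (kappa - 1.96 * se, min(1.0, kappa + 1.96 * se))
    return kappa, ci


# Invented counts for 165 respondents, for illustration only.
kappa, (ci_low, ci_high) = cohen_kappa_2x2(40, 10, 8, 107)
print(f"kappa = {kappa:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The upper confidence limit is capped at 1.0 because kappa cannot exceed perfect agreement.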

TABLE 1 Test-Retest Agreement on DSM-IV Substance Abuse and Dependence Diagnoses from the Reliability of the DIS-IV among Drug Users Study
Diagnosis               Kappa (95% CI)
Alcohol
  Dependence            .67 (.54-.79)
  Abuse*                .74 (.60-.87)
Amphetamine
  Dependence            .67 (.48-.85)
  Abuse*                .77 (.62-.93)
Cannabis
  Dependence            .60 (.45-.85)
  Abuse*                .60 (.46-.93)
Cocaine
  Dependence            .53 (.35-.70)
  Abuse*                .56 (.39-.73)
Hallucinogen
  Dependence            .59 (.33-.84)
  Abuse*                .61 (.39-.84)
Opiate
  Dependence            .69 (.50-.89)
  Abuse*                .53 (.31-.75)
Phencyclidine (PCP)
  Dependence            .69 (.42-.96)
  Abuse*                .86 (.68-1.0)
Sedative
  Dependence            .59 (.36-.82)
  Abuse*                .50 (.26-.74)

* Abuse calculated without regard to whether dependence was present. From Horton et al., 1998.

In a study of co-occurring psychiatric illnesses among substance abusers being admitted to treatment (Compton, 2001; Compton & Horton, 2001), the computerized version of the DIS took approximately 75 minutes to administer (Compton, personal communication, 2001).

Validity of the DIS was tested in a subsample from the same study by comparing diagnoses obtained with the DIS to those obtained using the WHO Schedules for Clinical Assessment in Neuropsychiatry (SCAN; Wing et al., 1990). Of the 100 subjects in this diagnostic concordance sample, 46 were from the St. Louis public drug treatment Central Intake unit; the remainder were patients previously treated in inpatient drug and psychiatric programs. Overall comparison of the DIS and SCAN indicated fair to good agreement for substance use disorders (kappa .45 to .71). For co-occurring major depression and social phobia, agreement was also fair to good (kappa .41 to .55). For schizophrenia, agreement was marginal (kappa .39). For panic disorder, agoraphobia, and specific phobia, agreement was poor but statistically significantly greater than chance (p < .05). Agreement between the SCAN and DIS diagnoses is nearly as good as the agreement between the SCAN and clinical diagnoses (p < .05) determined by the SCAN interviewers themselves. This indicates acceptable agreement between clinical and nonclinical interviewing techniques. These results are consistent with other comparisons of clinician and nonclinician diagnostic assessments (e.g., Hasin & Grant, 1987a, 1987b; Helzer et al., 1985).

TABLE 2 Reliability of Selected DSM-IV Psychiatric Diagnoses and Symptoms among Substance Users from the Reliability of the DIS-IV among Drug Users Study
Diagnosis                                   Kappa (95% CI)
Major Depressive Episode                    .67 (.55-.80)
  Suicidal ideation                         .76 (.66-.86)
  Suicide attempts                          .80 (.70-.90)
Manic Episode                               .49 (.29-.68)
  Elevated mood                             .40 (.22-.59)
  3+ positive manic symptoms                .45 (.26-.65)
Schizophrenia                               .48 (.35-.61)
  Hallucinations                            .44 (.26-.62)
  Delusions                                 .61 (.46-.75)
Generalized Anxiety                         .35 (.14-.56)
  Difficulty controlling worry              .43 (.24-.61)
  Excessive worry                           .41 (.22-.60)
Panic Disorder                              .52 (.27-.77)
  Panic attacks                             .54 (.40-.68)
Post-Traumatic Stress Disorder              .46 (.29-.62)
  Exposure to trauma                        .61 (.33-.89)
Any Phobia                                  .42 (.24-.59)
  Agoraphobia                               .41 (.14-.68)
  Social phobia                             .56 (.35-.77)
  Specific phobia                           .25 (.02-.47)
Antisocial Personality Disorder             .49 (.27-.71)
  Adult antisocial symptoms                 .44 (.28-.61)
Conduct Disorder*                           .51 (.33-.68)
Oppositional Defiant Disorder**             .60 (.47-.73)
Attention Deficit Hyperactivity Disorder    .33 (.11-.55)
  Attention deficit symptoms                .63 (.47-.79)
  Attention deficit impairment              .56 (.38-.75)
  Attention deficit before age 7            .32 (.07-.56)
  Hyperactivity symptoms                    .45 (.27-.63)
  Hyperactivity impairment                  .42 (.20-.63)
  Hyperactivity before age 7                .25 (.03-.46)

* Calculated without exclusion for antisocial personality.
** Calculated without exclusion for conduct disorder.
From Horton et al., 1998.

The strengths and weaknesses of the diagnostic manual are reflected in the instrument. Thus, the validity of the diagnoses derived from the DIS is generally limited by the validity of the DSM constructs themselves. If future research shows that additional symptoms are relevant for particular conditions, the DIS may not be able to accurately reflect these symptoms. On the other hand, the DIS routinely assesses the full range of DSM criteria for each endorsed diagnosis (i.e., no early skipouts). Therefore, new constellations of symptom profiles can be generated with DIS data. Such work may allow the DIS to be relatively robust with regard to changes in diagnostic systems over time.
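
Because every criterion is recorded rather than skipped, stored responses can in principle be rescored under a different combination rule. The sketch below illustrates the idea only. The criterion labels (A1 to A5), the thresholds, and the function names are all hypothetical; they are not taken from the DIS data format or from any DSM criteria set.

```python
from typing import Dict, Iterable

# Hypothetical symptom-level record: criterion label -> endorsed or not.
Responses = Dict[str, bool]


def count_endorsed(responses: Responses, criteria: Iterable[str]) -> int:
    """Count how many of the named criteria the respondent endorsed."""
    return sum(1 for c in criteria if responses.get(c, False))


def meets_rule(responses: Responses, criteria: Iterable[str], threshold: int) -> bool:
    """Apply a simple 'at least `threshold` of these criteria' rule."""
    return count_endorsed(responses, criteria) >= threshold


# Invented respondent who endorsed three of five recorded criteria.
respondent = {"A1": True, "A2": False, "A3": True, "A4": False, "A5": True}

# Illustrative original rule: at least 3 of criteria A1-A4.
original = meets_rule(respondent, ["A1", "A2", "A3", "A4"], 3)

# Illustrative revised rule: at least 3 of criteria A1-A5.
revised = meets_rule(respondent, ["A1", "A2", "A3", "A4", "A5"], 3)

print(original, revised)
```

The same stored record yields a different result under each rule, which is only possible because the criterion that the first rule ignores (A5 here) was still asked and recorded.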

The DIS has not been designed to take the place of clinical diagnosis, which requires a degree of clinical judgment not possible with nonclinician interviewers. Therefore, results from the DIS should be considered approximations of clinical diagnoses, and medical decisions based on DIS results require clinical confirmation. On the other hand, for clinical settings where full evaluations are not feasible, the DIS can be used to screen persons for additional psychiatric conditions not routinely evaluated. Positive cases should be referred for evaluation and possible intervention.

The DIS has been used in many different cultural settings. Versions of the DIS have been translated into over a dozen languages and have been used in large-scale epidemiological projects across the globe. Examples of translation and use of the DIS in disparate settings are studies in Taiwan, Korea, and Puerto Rico (Canino et al., 1987; Hwu, Yeh, & Chang, 1989; Lee et al., 1990a, 1990b). The instrument has also been adapted for use in American Indian populations and has been applied in several specific cross-cultural studies (e.g., Compton et al., 1991; Helzer & Canino, 1992; Hwu & Compton, 1994).

First and foremost, because the DIS is closely linked to the DSM system of diagnosis, applying the DIS in disparate cultures depends on the applicability of the DSM in those cultures. In most international settings, the DSM has gained widespread acceptance as the standard diagnostic system. Specific examples of psychopathology may vary from setting to setting, but the overall diagnostic groupings are well established and consistent (Helzer & Canino, 1992; Mezzich, Fabrega, Mezzich, & Coffman, 1985).

Translation and adaptation of the DIS into different languages require extensive work to ensure the conceptual equivalence of the symptom questions. Such conceptual equivalence may be even more important than literal equivalence. Even before any formal psychometric testing is undertaken, both bilingual and monolingual experts and respondents must review the translated instrument to confirm its applicability.

As in all research involving exploration of health experiences, some respondents may experience emotional discomfort when answering certain questions in the DIS. Training of interviewers includes consideration of such difficult interviewing situations along with ways to address these problems. If a particular question makes a respondent uncomfortable, the question should be skipped. Despite this warning, refusal to answer particular questions and interview breakoff because of discomfort are quite rare (< 1%).

A specific concern in the depression section of the DIS is how to handle respondents who express current suicidal ideation. We suggest that each study develop its own data and safety monitoring plan for handling such situations based on available local resources. In general, for cases in which there is a clear potential for immediate danger, the interviewer is instructed to respond with an active intervention (i.e., have mental health authorities assess the respondent).

A planned enhancement of the C DIS-IV is a web-based interactive version of the interview. The advantage of such an administration method is that data from remote sites can be stored in one central location and updates to the interview can be made for all users at once.

REFERENCES

American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.

Bishop, Y.M., Fienberg, S., & Holland, P. (1975). Discrete multivariate analysis. Cambridge: MIT Press.

Bryant, K.J., Rounsaville, B., Spitzer, R.L., & Williams, J.B. (1992). Reliability of dual diagnosis-substance dependence and psychiatric disorders. Journal of Nervous and Mental Disease, 180, 251-257.

Canino, G.J., Bird, H.R., Shrout, P.E., Rubio-Stipec, M., Bravo, M., Martinez, R., et al. (1987). The prevalence of specific psychiatric disorders in Puerto Rico. Archives of General Psychiatry, 44, 727-735.

Canino, G.J., & Bravo, M. (1994). The adaptation and testing of diagnostic and outcome measures for cross-cultural research. International Review of Psychiatry, 6, 281-286.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.

Compton, W.M. (2001, December). Improving treatment services for substance abusers with co-occurring depression. Paper presented at the annual meeting of the American Academy of Addiction Psychiatry, Amelia Island, FL.

Compton, W.M., Helzer, J.E., Hwu, H.G., Yeh, E.K., McEvoy, L., Tipp, J.E., et al. (1991). New methods in cross-cultural psychiatry: Comparing rates of psychiatric illness in Taiwan to rates in the United States. American Journal of Psychiatry, 148, 1697-1704.

Compton, W.M., & Horton, J.C. (2001, March). Case management to improve treatment engagement and outcomes for substance abusers with comorbid depression. Paper presented at the annual meeting of the American Psychopathological Association, New York, NY.

Dascalu, M., Compton, W.M., Horton, J.C., & Cottler, L.B. (2001). Validity of DIS-IV in diagnosing depression and other psychiatric disorders among substance users. Drug and Alcohol Dependence, 63, 37.

Hasin, D.S., Carpenter, K.M., McCloud, S., Smith, M., & Grant, B.F. (1997). The alcohol use disorder and associated disabilities interview schedule (AUDADIS): Reliability of alcohol and drug modules in a clinical sample. Drug and Alcohol Dependence, 44, 133-141.

Hasin, D.S., & Grant, B.F. (1987a). Diagnosing depressive disorders in patients with alcohol and drug problems: A comparison of the SADS-L and DIS. Journal of Psychiatric Research, 21, 301-311.

Hasin, D.S., & Grant, B.F. (1987b). Psychiatric diagnosis of patients with substance abuse problems: A comparison of two procedures, the DIS and SADS-L. Alcoholism, drug abuse/dependence, anxiety disorders and antisocial personality disorder. Journal of Psychiatric Research, 21, 7-22.

Helzer, J.E., & Canino, G. (Eds.). (1992). Alcoholism in North America, Europe and Asia. New York: Oxford University Press.

Helzer, J.E., Robins, L.N., McEvoy, L.T., Spitznagel, E.L., Stoltzman, R.K., Farmer, A., et al. (1985). A comparison of clinical and diagnostic interview schedule diagnoses. Physician reexamination of lay-interview cases in the general population. Archives of General Psychiatry, 42, 657-666.

Hesselbrock, V., Stabenau, J., Hesselbrock, M., Mirkin, & Meyer, R. (1982). A comparison of two interview schedules: The Schedule for Affective Disorders and Schizophrenia-Lifetime and the National Institute for Mental Health Diagnostic Interview Schedule. Archives of General Psychiatry, 39, 674-677.

Horton, J., Compton, W.M., & Cottler, L.B. (1998). Assessing psychiatric disorders among drug users: Reliability of the revised DIS-IV. In L. Harris (Ed.), NIDA Research Monograph-Problems of Drug Dependence. Washington, DC: NIH Publication No. 99-4395.

Hwu, H-G., & Compton, W.M. (1994). Comparison of major epidemiological surveys using the Diagnostic Interview Schedule. International Review of Psychiatry, 6, 309-327.

Hwu, H-G., Yeh, E.K., & Chang, L.Y. (1989). Prevalence of psychiatric disorders in Taiwan defined by the Chinese diagnostic interview schedule. Acta Psychiatrica Scandinavica, 79, 136-174.

Lee, C.K., Kwak, Y.S., Yamamoto, J., Rhee, H., Kim, Y.S., Han, J.H., et al. (1990a). Psychiatric epidemiology in Korea. Part I: Gender and age differences in Seoul. Journal of Nervous and Mental Disease, 178, 242-246.

Lee, C.K., Kwak, Y.S., Yamamoto, J., Rhee, H., Kim, Y.S., Han, J.H., et al. (1990b). Psychiatric epidemiology in Korea. Part II: Urban and rural differences. Journal of Nervous and Mental Disease, 178, 247-252.

Mezzich, J.E., Fabrega, H., Mezzich, A.D., & Coffman, G.A. (1985). International experience with the DSM-III. Journal of Nervous and Mental Disease, 173, 738-741.

Robins, L.N., Helzer, J.E., Croughan, J., & Ratcliff, K.S. (1981). National Institute of Mental Health Diagnostic Interview Schedule: Its history, characteristics, and validity. Archives of General Psychiatry, 38, 381-389.

Robins, L.N., & Regier, D.A. (Eds.). (1991). Psychiatric disorders in America: The epidemiologic catchment area study. New York: Free Press.

Rogler, L.H., Malgady, R.G., & Tryon, W.W. (1992). Evaluation of mental health: Issues of memory in the Diagnostic Interview Schedule. Journal of Nervous and Mental Disease, 180, 215-222 (discussion, pp. 223-226).

Semler, G., Wittchen, H.U., Joschke, K., Zaudig, M., von Geiso, T., Kaiser, S., et al. (1987). Test-retest reliability of a standardized psychiatric interview (DIS-CIDI). European Archives of Psychiatry and Neurological Sciences, 236, 214-222.

Vandiver, T., & Sher, K.J. (1991). Temporal stability of the Diagnostic Interview Schedule. Psychological Assessment, 3, 277-281.

Wing, J.K., Babor, T., Brugha, T., et al. (1990). SCAN-Schedules for Clinical Assessment in Neuropsychiatry. Archives of General Psychiatry, 47, 589-593.

Wittchen, H.U., Burke, J.D., Semler, G., Pfister, H., Von Cranach, M., & Zaudig, M. (1989). Recall and dating of psychiatric symptoms: Test-retest reliability of time-related symptom questions in a standardized psychiatric interview. Archives of General Psychiatry, 46, 437-443.