Validation of an Arab name algorithm in the determination of Arab ancestry for use in health research

Abdulrahman M El-Sayed; Diane S Lauderdale; Sandro Galea

doi:10.1080/13557858.2010.505979

Ethn Health. 2010 Dec;15(6):639-47. doi: 10.1080/13557858.2010.505979.

Authors

Abdulrahman M El-Sayed¹, Diane S Lauderdale, Sandro Galea

Affiliation

¹ Medical School, University of Michigan, 1500 Medical Center Dr., Ann Arbor, MI 48109, USA. elabdul@umich.edu

Abstract

Objective: Data about Arab-Americans, a growing ethnic minority, are not routinely collected in vital statistics, registry, or administrative data in the USA. The difficulty in identifying Arab-Americans using publicly available data sources is a barrier to health research about this group. Here, we validate an empirically based probabilistic Arab name algorithm (ANA) for identifying Arab-Americans in health research.

Design: We used data from all Michigan birth certificates between 2000 and 2005. Fathers' surnames and mothers' maiden names were coded as Arab or non-Arab according to the ANA. We calculated sensitivity, specificity, and positive (PPV) and negative predictive values (NPV) of Arab ethnicity inferred using the ANA as compared to self-reported Arab ancestry.
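The four validity measures named in the Design can be illustrated with a short sketch. It assumes each record is cross-classified by the ANA result (Arab or non-Arab name) against self-reported Arab ancestry as the reference standard. The class and function names are hypothetical, and the counts are invented for illustration only; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 cross-classification of name-based coding vs. self-report."""
    tp: int  # name coded Arab, self-reported Arab ancestry
    fp: int  # name coded Arab, no self-reported Arab ancestry
    fn: int  # name coded non-Arab, self-reported Arab ancestry
    tn: int  # name coded non-Arab, no self-reported Arab ancestry


def validity_measures(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    return {
        # Share of self-reported Arab records that the name coding detects.
        "sensitivity": c.tp / (c.tp + c.fn),
        # Share of non-Arab records that the name coding leaves as non-Arab.
        "specificity": c.tn / (c.tn + c.fp),
        # Share of Arab-coded names that are confirmed by self-report.
        "ppv": c.tp / (c.tp + c.fp),
        # Share of non-Arab-coded names that are confirmed by self-report.
        "npv": c.tn / (c.tn + c.fn),
    }


# Illustrative counts only (hypothetical, not taken from the birth records).
example = ConfusionCounts(tp=50, fp=40, fn=50, tn=860)
measures = validity_measures(example)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are conditioned on self-reported ancestry, whereas PPV and NPV are conditioned on the name-based coding, so the two pairs answer different questions about the same table.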

Results: Statewide, the ANA had a specificity of 98.9%, a sensitivity of 50.3%, a PPV of 57.0%, and an NPV of 98.6%. Both the false-positive and false-negative rates were higher among men than among women. As the concentration of Arab-Americans in a study locality increased, the ANA false-positive rate increased and false-negative rate decreased.

Conclusion: The ANA is highly specific but only moderately sensitive as a means of detecting Arab ancestry. Future research should compare health characteristics among Arab-American populations defined by Arab ancestry and those defined by the ANA.

Publication types

Comparative Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Adolescent
Adult
Algorithms*
Arabs / ethnology*
Birth Certificates
Data Collection / methods*
Data Collection / statistics & numerical data
Female
Humans
Male
Michigan
Middle Aged
Names*
Predictive Value of Tests
Self Report
Sensitivity and Specificity
Young Adult
