Comprehensive healthcare provider data

About Healthcare Provider Data Accuracy

Healthcare provider data accuracy is the priority at CarePrecise

Much has been said about the need for high quality healthcare provider data. Inaccurate provider data can lead to limited access, inefficiencies in business operations, government sanctions, and reduced revenues for both providers and vendors. Almost one-third of physicians, for instance, change affiliations and practice locations every year.

Since approximately 2007, the three most accurate resources have been the NPPES, the SK&A data, and the AMA Masterfile.

A 2015 study conducted by Mathematica Policy Research and published in the Journal of General Internal Medicine found that the NPPES data was more accurate than the SK&A data, and much more accurate than the AMA Masterfile. CarePrecise distributes the NPPES (commonly pronounced "EN-pez") data, along with additional data merged from the federal PECOS (Provider Enrollment, Chain, and Ownership System, pronounced "PAY-cose") system and other public and private sources.

In the six years since that 2015 study, we have seen additional improvements in timely updates to provider records in the NPPES, and further declines in quality elsewhere. One reason the NPPES has become such a useful resource is that, by law, providers are required to update their NPI Registry (NPPES) data within thirty days of any change. It is the only publicly available U.S. provider data resource whose accuracy is protected by federal law. State medical boards also maintain data on providers in connection with their licensing requirements; however, state resources are structured differently from one another, and most are less accurate than the federal system.

The NPI Final Rule states "Covered health care providers must communicate to the NPS any changes in their required data elements within 30 days of the change."

Until recently, the industry had observed that many providers were not following the law, sometimes leaving their information unchanged as they moved from one practice location to another. That has changed. By 2015 we had begun to see measurable improvement, and now, in 2021, we see the trend continuing, with improved accuracy in the NPPES data. Why is this happening?

Put simply, the law and the profits. Though not necessarily in that order.

Besides the fact that U.S. federal law requires providers to keep their data up to date, there are financial incentives to do so. There is substantial evidence that health insurance plans may "throttle" claims payments by using discrepancies between data submitted on a claim and the provider's NPI information to delay payments. This can drive up the number of days a claim languishes in receivables, slowing reimbursement. Providers are becoming more savvy about such techniques, and are keeping their NPI data up to date. Note that the "required data elements" named in the NPI Final Rule include name, credentials, practice address, practice phone number, specialties (taxonomy codes), and license: key data for public health applications, insurance reimbursements, and marketing.


Another factor is the migration of providers to larger practice organizations, and away from small practices and solo practices. Larger organizations maintain personnel dedicated to keeping individuals' federal records up to date. Increasingly, professional credentials management relieves the practitioner of the burden of updating NPI and other critical repositories. As credentialing managers know, and rank-and-file practitioners are learning, the penalty for having false information on your NPI record is severe:

"18 U.S.C. 1001 authorizes criminal penalties against an individual who in any matter within the jurisdiction of any department or agency of the United States knowingly or willfully falsifies, conceals, or covers up by any trick, scheme or device a material fact, or makes any false, fictitious or fraudulent statements or representations, or makes any false writing or document knowing the same to contain any false, fictitious or fraudulent statement or entry. Individual offenders are subject to fines of up to $250,000 and imprisonment for up to five years. Offenders that are organizations are subject to fines of up to $500,000. 18 U.S.C. 3571(d) also authorizes fines of up to twice the gross gain derived by the offender if it is greater than the amount specifically authorized by the sentencing."

"Penalties for Falsifying Information" section of the U.S. Department of Health and Human Services' NPI registration information publication.

To date, we do not know of any action taken by CMS against providers based on stale data, but the law is the law, and as accuracy becomes more and more critical given the growing reliance on records technology, teeth may be shown at any time.


Importantly, the SK&A data is not complete; that is, it does not include all U.S. HIPAA-covered healthcare providers. Neither does the AMA Masterfile, which permits physicians to opt out and includes essentially only physicians and students in medical training to become physicians. CarePrecise provider data contains all of the NPPES provider records. Both the SK&A data and the Masterfile are much more expensive than CarePrecise data packages. The Masterfile is as much as two orders of magnitude more expensive than the CarePrecise Access Complete database, and licensing it for use in applications and redistribution is extremely costly, whereas CarePrecise redistribution licensing plans are exceptionally affordable, even for startups.

The previously cited accuracy study placed phone calls to a sample of physicians that were found in the three data sets: NPPES, SK&A, and AMA Masterfile. It found that:

"Overall, SK&A (85 %) and the NPPES (86 %) had the highest rates of correct address information among our confirmed cases, whereas the AMA Masterfile included correct contact information for less than half of the physicians we were able to confirm (42 %). In the NPPES, the proportion of physicians by specialty with correct information was highest in general internal medicine (94 %) and lowest for radiology (72 %). For SK&A, percentages were highest for cardiology (92 %) and family medicine (91 %), and lowest for radiology (80 %) [and] internal medicine (79 %). In the AMA Masterfile, confirmation of contact information was lowest in radiology (32 %) and highest in family medicine (54 %)."


The study contrasted CarePrecise's primary source data with the AMA Masterfile, finding that, compared to the NPPES, the AMA Masterfile contains some additional information on providers, while including fewer physicians due to the opt-out option offered by the AMA. While CarePrecise starts with the NPPES data, additional data is merged in from sources such as the federal Provider Enrollment, Chain, and Ownership System (PECOS) and from millions of Medicare claims. This includes much of what the study cited as missing from the NPPES, such as medical schools attended and graduation date, and number of years in practice, as well as sanction information from the List of Excluded Individuals and Entities (LEIE) database managed by the HHS Office of Inspector General.

The CarePrecise Master Bundle now contains more information than the AMA Masterfile and SK&A, including practice affiliations and hospital affiliations, PAC ID, Medicare enrollment ID, graduation year, estimated years in practice, and group size. Additional information is available in CarePrecise's Authoritative Hospital Database (AHD) and Authoritative Physician Database (APD), including health system affiliation, procedure volumes, staffing levels, and patients served. The CarePrecise Platinum and CarePrecise Gold packages include econometric data and Zip5 geocoding (with per-address geocoding available separately). Overall, CarePrecise data packages are far less costly and more comprehensive.

CarePrecise also adds email addresses, which are available as custom extractions, permitting buyers to specify criteria such as geographic location and practice description (by taxonomy code).

Most CarePrecise data packages include the exclusive CoLoCode™ – a hyper-conformed version of the practice location data used to group providers working together at any given practice location, and to link Type 1 (individuals) to their Type 2 (business) records.
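The general idea of grouping providers by a conformed practice location can be illustrated with a small sketch. This is not the CoLoCode algorithm, which is not described here; it is a deliberately crude stand-in that assumes each record carries an NPI, an entity type (1 or 2), a practice address, and a five-digit ZIP. The field names, the abbreviation table, the function names, and the sample records (including the NPI values) are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; real address conforming is far more thorough.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "SUITE": "STE", "ROAD": "RD"}


def location_key(address: str, zip5: str) -> str:
    """Build a crude grouping key from a practice address and 5-digit ZIP."""
    # Uppercase, turn punctuation into spaces, then abbreviate common words.
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", address.upper())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens) + "|" + zip5


def group_by_location(records):
    """Group records that share the same conformed location key."""
    groups = defaultdict(list)
    for rec in records:
        groups[location_key(rec["address"], rec["zip5"])].append(rec)
    return dict(groups)


def link_individuals_to_organizations(records):
    """Map each Type 1 NPI to the Type 2 NPIs found at the same location key."""
    links = {}
    for members in group_by_location(records).values():
        orgs = [r["npi"] for r in members if r["entity_type"] == 2]
        for r in members:
            if r["entity_type"] == 1:
                links[r["npi"]] = orgs
    return links


# Made-up sample records for illustration only.
SAMPLE = [
    {"npi": "1000000001", "entity_type": 1, "address": "12 Main Street, Suite 4", "zip5": "97201"},
    {"npi": "1000000002", "entity_type": 1, "address": "12 MAIN ST STE 4", "zip5": "97201"},
    {"npi": "1000000003", "entity_type": 2, "address": "12 Main St. Ste. 4", "zip5": "97201"},
    {"npi": "1000000004", "entity_type": 1, "address": "900 Oak Avenue", "zip5": "97202"},
]

if __name__ == "__main__":
    for npi, orgs in sorted(link_individuals_to_organizations(SAMPLE).items()):
        print(npi, "->", orgs)
```

In this sketch the three differently written versions of the same address collapse to one key, so both individuals at that address are linked to the organization record there, while the individual at the other address has no organization to link to.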

In 2021, six years since the study, work environments and financial incentives have made a positive impact on healthcare provider data quality in the NPPES. It is time that a follow-up study be conducted, ideally one that samples other key provider types including practice groups, and dental, vision care, and mental health providers, as well as pharmacies and medical laboratories.


Determining accuracy in a dataset requires more than making phone calls to check a random sample. Determining how many practice addresses are correct, for example, requires some "buffering" of the source data to protect against organic inconsistencies. If the goal is to arrive at an accuracy percentage for all U.S. clinics, one might want to eliminate mobile clinics, as their reported practice addresses will likely be the address of a parent clinic, not the mobile clinic's spot in a parking lot. If the goal is to determine the accuracy of physician practice locations, it would be prudent to first remove locum tenens physicians (or nurses, or whichever provider taxonomies are under study), whose organic practice is to move around, never having a permanent location. Such physicians' federal records, perhaps 7% to 8% of U.S. physician Type 1 records, will, statistically, bear a weak relationship with their true practice location at any given time. The trouble is that this is very hard to do, as there is no indicator in the datasets for locums. Therefore, a relatively large sample needs to be extracted, and the outcome adjusted by the percentages just mentioned. If a large sample of physicians comes back with 85% accuracy, and you back out the locums on the assumption that their records were all counted as incorrect, you might understand the actual accuracy to be closer to 92%.
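The back-out arithmetic can be sketched as follows. This is a minimal illustration, not a method taken from any cited study: it assumes the sample contains locums at the stated 7% to 8% share, and that every locum record in the sample was scored as incorrect unless a different locum accuracy is supplied. The function and parameter names are hypothetical. Under those assumptions the adjustment is a division by the non-locum share rather than a simple addition of percentages, and it lands in the low 90s.

```python
def adjusted_accuracy(observed_accuracy: float,
                      locum_share: float,
                      locum_accuracy: float = 0.0) -> float:
    """Estimate accuracy among non-locum records.

    Solves  observed = (1 - s) * non_locum + s * locum  for non_locum,
    where s is the locum share of the sample. The default locum_accuracy
    of 0.0 is an assumption: every locum record counted as incorrect.
    """
    if not 0.0 <= locum_share < 1.0:
        raise ValueError("locum_share must be in [0, 1)")
    estimate = (observed_accuracy - locum_share * locum_accuracy) / (1.0 - locum_share)
    # Clamp to a valid proportion in case the inputs are mutually inconsistent.
    return max(0.0, min(1.0, estimate))


if __name__ == "__main__":
    for share in (0.07, 0.08):
        result = adjusted_accuracy(0.85, share)
        print(f"locum share {share:.0%}: adjusted accuracy {result:.1%}")
```

If some locum records happen to be correct at the time of the call, passing a nonzero locum_accuracy lowers the adjusted figure, which is one reason the result should be read as an upper-end estimate.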

Many other factors need to be taken into consideration in expressing accuracy as a number. Dead records are difficult to remedy: a practitioner may have died, the family may not know to update the federal record, and CMS may not have caught it yet. A practitioner may also work at several locations on rotation while the federal record lists only one, and so on.


Additional Resources

About the NPI

Obtaining an NPI number

Updating your NPI Number Record