All visuals in this report are interactive: for each visual, the user can make selections and explore subsets of the data.
Each interactive visual has a menu to the right of the image. In this menu the user can toggle zoom, box selectors and other functions. One particularly useful feature is the camera icon, which captures a PNG snapshot of whatever data interaction the user is working through.
A couple of useful notes:
Health claims data contains untapped, often unquantified knowledge that, if leveraged, has the potential to change the health care landscape for providers, health care administrators, insurance companies and, most importantly, the member (also referred to as client or patient).
Health claims govern the payout process associated with a patient’s reason for seeking clinical care in the first place. Because a claim is associated with treatment received, it is possible to extract context data (diagnosis codes, metadata, etc.) from claims databases in order to summarize the aggregate-level view of health claim submissions for a single regional group of interest (e.g., county or state).
Because health claims data also contains metadata such as age, sex, and address, it is possible to develop both patient and aggregate (e.g. regional/spatial) level understanding on diagnosis categories in Delaware. In HCCD, the addresses linked to each member file create the opportunity to visualize the distribution of health claims, as they relate to a member’s place of residence and not the provider’s place of service.
To do so, we implement a top-down (reductionist) exploratory approach to understand the variation of diagnostic categories across the state of Delaware.
HCCD is equipped with a number of unique claim identifiers, which allow diverse hypotheses to be generated from the data.
Quality of care and access to care during the COVID-19 pandemic were notably two of the most complex systems to navigate for every state. This resulted in a reduction of routine, screening and non-urgent (or so patients identified) visits across centers providing health care services. The risks associated with physical encounters between clinicians and patients often outweighed the benefits of a single visit. As such, the need for alternative methods for delivering medical care is one of the most important topics at hand. Telehealth (or telemedicine) describes any virtual encounter between a patient and provider where clinical care or counsel is provided.
During the COVID-19 pandemic, sheltering in place brought to light the dire need and demand for telemedicine services across Delaware.
Not only would reallocating in-person visits to virtual ones create a new mechanism for patients to seek clinical care, it would also reduce the in-person visit burden that health care centers and entities experienced. This hospital burden was documented nationally throughout the pandemic period.
In HCCD, place of service is a unique identifier, and telemedicine is one of the “places” of service included in the database. We can therefore evaluate each unique claim entry for its telehealth status. With this data we are able to conduct a sub-analysis of telehealth-related claims and their differential submissions throughout the COVID-19 pandemic period. We will do so in order to identify:
The Delaware Health Care Claims Database (HCCD), powered by the Delaware Health Information Network (DHIN), collects healthcare claims, enrollment and provider data from Medicare, Medicaid and the seven largest commercial health insurers in the State of Delaware.
The Delaware General Assembly passed legislation in FY16 authorizing the Delaware Health Information Network to develop a healthcare claims database.
10311 The Delaware Health Care Claims Database — Findings; purpose; creation (delaware.gov)
The purpose of the HCCD is to facilitate data-driven, evidence-based improvements in access, quality, and cost of healthcare and to promote and improve the public health through increased transparency of accurate claims data and information.
The data in HCCD represents all health claims submitted or reported by any entity (or group of entities) seeking payment from the insurance company for a service provided to a patient. As such, it is important to acknowledge that this report will present findings that can relate to:
Whether or not a Delawarean patient receives services in state, the health event is still documented back to DHIN, even from other states. As such, the data captured by HCCD is representative of the individual-level health claim submissions that relate insurers, patients and clinicians for over 10 years. Our goal is to aggregate the complete study space captured by the Year, Service provider state or province and Principal Diagnosis filters for the data.
We selected any health claim with a date of service between Jan-01-2018 and Dec-31-2020. There are several reasons why this decision was made, but the primary reasons are:
We restricted the service provider’s state or province to Delaware.
The main reason is that, to capture the landscape of provider/client health claim submissions for the State of Delaware (and how it relates to neighborhood-level factors), we assume the health service should be delivered within the defined boundaries of the state of Delaware.
We make this assumption to capture services provided to Delawareans residing in Delaware by service providers located in Delaware.
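The two restrictions above (service date within 2018 to 2020, provider located in Delaware) amount to a row filter. The following is a minimal sketch only: the field names `service_date` and `provider_state` and the sample rows are hypothetical and are not taken from the HCCD schema.

```python
from datetime import date

# Hypothetical claim rows; field names are illustrative, not HCCD's schema.
claims = [
    {"claim_id": "A1", "service_date": date(2017, 12, 31), "provider_state": "DE"},
    {"claim_id": "A2", "service_date": date(2018, 1, 1), "provider_state": "DE"},
    {"claim_id": "A3", "service_date": date(2019, 6, 15), "provider_state": "PA"},
    {"claim_id": "A4", "service_date": date(2020, 12, 31), "provider_state": "DE"},
]

STUDY_START = date(2018, 1, 1)
STUDY_END = date(2020, 12, 31)


def in_study_space(claim):
    """Keep claims dated within the study window and delivered in Delaware."""
    in_window = STUDY_START <= claim["service_date"] <= STUDY_END
    return in_window and claim["provider_state"] == "DE"


selected = [c for c in claims if in_study_space(c)]
selected_ids = [c["claim_id"] for c in selected]
```

Both window endpoints are inclusive here, matching the Jan-01-2018 and Dec-31-2020 bounds stated above.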
By original design, the HCCD’s function was to facilitate outcomes based research. However, because HCCD hosts health claims data, we must develop and apply careful methods in order to leverage claims allowing us to make appropriate inferences about the different populations under examination.
Health claims are not originally designed with the intent to facilitate outcomes-based research or analytics. With this in mind, given the high dimensionality and complexity of claims data, there is no single standard approach for selecting and aggregating identifying variables from across a list of eligible ICD-10 codes that could explain system-level diagnostics and/or prognostics.
More specifically, with respect to diagnosis codes (represented by ICD-10 codes), every single claim will have at least one (1) associated principal diagnosis and can additionally be labeled with up to twelve (12) optional fields titled “other diagnoses”. A single reason for a visit may relate to more than a single principal diagnosis code. However, by its definition as principal, the principal diagnosis is intended to serve as the governing ICD-10 code representing a claim most globally. The reasons for this can be structural, financial, administrative, insurance-specific or operational.
Furthermore, it is important to note that traditional reporting and research have focused on principal diagnosis codes as the main identifiers for claims, leading to their use in funding, analytics and administrative efforts. At the national scale, Medicare reports and manages its health claims data on the basis of principal diagnosis.
Prior to analyzing the HCCD data, two additional features were incorporated in order to enhance inferences made from HCCD. These are the:
Broadly speaking, data at census tract granularity exist for a number of different data sources and types. In fact, community, fieldwork, nonprofit and government teams often target and serve specific populations defined by their census tracts. Zip codes do not represent geographical boundaries that are unique to each county within a state. Additionally, Delaware is unique in that fewer than 1 million people reside across three (3) counties.
“Neighborhoods” are loosely defined, formed and maintained by the persons that inhabit them. They are also the environments that most directly influence individual-level health. The aforementioned captures the idea of person and place, but for a state like Delaware, where there are only 3 county subdivisions, a county or zip code level analysis may fail to capture the unique characteristics of its neighborhoods. As such, we implement census tract labeling in order to create an aggregating label to
HCCD data is high-dimensional. As such, when attempting to create a top-down view of medical health claim distribution to begin understanding the neighborhood level mechanisms that drive health patterns in Delaware, we must first make decisions on how to best aggregate and report on the data. Our framework is but one of many approaches, but the spirit of the schema remains the same: to generate a cross-sectional view at a more aggregate level than ICD-10 across Delaware.
More simply, this implies that high-dimensional data captures information complex in structure, content and representation. One defined example is that the number of existing ICD-10 codes eligible as principal diagnosis labels exceeds 70,000 codes.
Because the claims data were not designed for outcomes-based research, here we demonstrate an approach for leveraging the HCCD at large scale, reducing the grouping labels from over 70K to 26 Major Diagnostic Categories (MDCs). We do so to begin understanding any neighborhood-level relationships to health claim frequency data. To date, for Delaware, the health claims database has not been analyzed at such a large scale and has not been aggregated and visualized cross-sectionally. Rather than selecting from an a priori list of ICD-10 codes historically associated with specific diseases of interest, we developed an MDC labeling scheme. Each medical health claim was labeled with its associated MDC(s) with the intent to create the first comprehensive visualization of the MDC-level distribution of health claim submissions for Delaware.
This will allow the exploration of differences in health claim submissions across counties, while also capturing individual-level census tract information across 26 MDCs (N=25 defined by physiological systems and N=1 defined by transplant/tracheotomy status).
A note regarding MDCs: MDC 1 to MDC 23 are grouped according to principal diagnoses. Patients are assigned to MDC 24 (Multiple Significant Trauma) with at least two significant trauma diagnosis codes (either as principal or secondary) from different body site categories. Patients assigned to MDC 25 (HIV Infections) must have a principal diagnosis of an HIV Infection or a principal diagnosis of a significant HIV related condition and a secondary diagnosis of an HIV Infection.
There were two aggregating decisions that could influence how the reader interprets and understands the findings presented in this report.
First, for the purposes of this report, claims were analyzed at the distinct reporting entity claim control number level, rather than at the EID (individual member identifier) level. A single claim belongs to a unique clinical event. A clinical event is defined as the complete period of care for each member, resulting from a singular initial care encounter.
For any given member, a clinical event can be composed of N unique reporting entity claim control numbers. If so, there are also N unique provider instances associated to N unique clinical payments (N is minimally equal to one (1)). In tandem, each one of these individual instances of clinical payment, although related at a singular EID clinical event, represent individual providers/services rendered to a member during the period of their visit. This is because a single EID can:
A claim dated to event i will have one (1) or more providers of clinical service or care associated with it. When there is more than one provider seeking payment for a service provided during patient X’s jth clinical event during year k, there will be more than one reporting entity claim control number associated with said claim. The number of reporting entities that individually file for a single clinical event is highly dependent on billing and administrative processes, payout and procedure tracking schemes.
The decision was made to separate each claim into its provider level instances, given that our goal is to aggregate the individual level services (representing individual providers) that are captured by a single event for a member.
To put it more simply, if a patient is billed by two different providers for the same clinical event (more than one (1) reporting entity claim control number), each reporting entity claim control number is weighted equally for that patient’s whole encounter of care. This is how we tracked the individual service-level reasons why health claims were filed, allowing us to understand the diagnoses and treatments provided at any given time.
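The unit-of-analysis choice described above can be sketched as follows. This is an illustration under stated assumptions: `member_id` stands in for the EID, `event` for a clinical event index, and all rows are hypothetical.

```python
from collections import Counter

# Hypothetical rows: one per reporting entity claim control number.
# "member_id" stands in for the EID; all names and values are illustrative.
submissions = [
    {"member_id": "M1", "event": 1, "claim_control_number": "C100"},
    {"member_id": "M1", "event": 1, "claim_control_number": "C101"},
    {"member_id": "M2", "event": 1, "claim_control_number": "C200"},
]

# Unit of analysis: each distinct claim control number counts once.
distinct_ccns = {s["claim_control_number"] for s in submissions}
provider_level_count = len(distinct_ccns)

# For contrast: distinct (member, event) pairs give the clinical event count.
clinical_event_count = len({(s["member_id"], s["event"]) for s in submissions})

# Provider-level submissions per member.
per_member = Counter(s["member_id"] for s in submissions)
```

In this toy data, member M1 has one clinical event billed by two providers, so it contributes two provider-level submissions but only one clinical event.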
Second, each provider submission is led by a single principal diagnosis code, a single ICD-10 code. However, one (1) ICD-10 code can represent more than one (1) MDC. For the purposes of aggregating our data to the MDC level, if an ICD-10 code was associated with more than one MDC, the decision was made to create one row per ICD-10/MDC pair for each provider submission. This summarizes the total count of health claims while weighting more heavily those diseases represented by more than one major diagnostic system. This is the basis of an approach aimed at understanding the major diagnostic categories that capture provider services across clinical events.
The main data leveraged for this project originates from the HCCD. Two additional sources of data were included in this work: Census and CMS data.
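The one-row-per-ICD-10/MDC-pair expansion can be sketched as below. The crosswalk entries are placeholders rather than values from the CMS index, and the field names are hypothetical.

```python
from collections import Counter

# Hypothetical crosswalk from ICD-10 code to MDC number(s).
# Codes and assignments are placeholders, not taken from the CMS index.
icd_to_mdcs = {
    "CODE_A": [8],
    "CODE_B": [5, 21],  # one code mapped to two MDCs
}

# Hypothetical provider-level submissions, each with one principal diagnosis.
submissions = [
    {"claim_control_number": "C100", "principal_dx": "CODE_A"},
    {"claim_control_number": "C101", "principal_dx": "CODE_B"},
    {"claim_control_number": "C102", "principal_dx": "CODE_B"},
]

# One output row per ICD-10/MDC pair for each submission.
# Submissions whose code is absent from the crosswalk are dropped here.
labeled = [
    {**s, "mdc": mdc}
    for s in submissions
    for mdc in icd_to_mdcs.get(s["principal_dx"], [])
]

mdc_counts = Counter(row["mdc"] for row in labeled)
```

Three submissions become five labeled rows, because each submission with `CODE_B` is counted once under each of its two MDCs. This is the weighting effect described above.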
We sourced American Community Survey, 2019 5-year estimates on population size (and other factors) for conducting our 2019 sub-analysis. This also is the source of data that allows us to link census tract labels to population and county labels.
Additionally, it is from the Census that we source the shapefiles for our spatial builds.
Here is the source for the conversion table we implemented in order to label each provider-level submission’s principal diagnosis. We related ICD-10 codes to their respective MDC(s).
From CMS: The Diagnosis Code/MDC/MS-DRG Index lists each diagnosis code, as well as the MDC, and the MS-DRGs to which the diagnosis is used to define the logic of the DRG either as a principal or secondary diagnosis.
The labels for each one of the N=26 total MDCs were extracted from NJ State’s public health resource. The number of individual ICD-10 codes that represent each MDC varies greatly across MDC categories. We’ve already presented the individual labels for each Major Diagnostic Category; next we showcase the distribution of ICD-10 codes captured by MDCs.
Notably, MDCs represent a broad distribution of counts. MDC 8, Musculoskeletal System and Connective Tissue, is represented by almost three times as many ICD-10 codes as the MDC with the second-highest count, MDC 23, Factors Influencing Health Status.
Implementing the MDC label will allow claims to be aggregated at the different clinical system levels. This will allow the exploration of provider services in order to:
After labeling each provider claim with its MDC and census tract, our goal is to explore the change in provider-level claim submissions across counties in Delaware. Because each individual filing instance is associated with its member’s corresponding census tract, we can relate the insured population and health claim submission counts to explore the relationship for the 2019 year. 2019 is the most recent complete year of data available before COVID-19.
Census tracts, although independently delineated by the Census, align coterminously with county boundaries. Because of this, we can infer county-level MDC patterns in order to understand their differential distribution over time. It is important to note that across the US, some census tracts also follow governmental unit boundaries (e.g., state boundaries) and other invisible features. We aggregate data at the county level to explore a “top view” of MDCs across Delaware, but explore variation in change at the census tract level.
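Because tracts nest within counties, a tract identifier can be rolled up to its county directly. The sketch below relies on the standard Census GEOID layout for tracts (2-digit state, 3-digit county, 6-digit tract) and the FIPS codes for Delaware's three counties. The specific tract GEOIDs and counts are illustrative placeholders, and whether HCCD stores tracts in this form is an assumption.

```python
from collections import Counter

# County FIPS prefixes for Delaware (state FIPS "10").
DE_COUNTIES = {"10001": "Kent", "10003": "New Castle", "10005": "Sussex"}


def county_of(tract_geoid):
    """Return the Delaware county for an 11-digit census tract GEOID.

    A tract GEOID is state (2 digits) + county (3 digits) + tract (6 digits),
    so the first five characters identify the county.
    """
    if len(tract_geoid) != 11 or not tract_geoid.isdigit():
        raise ValueError(f"not an 11-digit tract GEOID: {tract_geoid!r}")
    return DE_COUNTIES[tract_geoid[:5]]


# Hypothetical tract-level claim counts (GEOIDs are placeholders).
tract_counts = {
    "10001040100": 12,
    "10003000200": 30,
    "10003000300": 8,
    "10005050100": 5,
}

# Roll tract counts up to the county level.
county_counts = Counter()
for geoid, n in tract_counts.items():
    county_counts[county_of(geoid)] += n
```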
Aggregating all provider-level counts and calculating the yearly magnitude changes, we observe differential growth across years and MDCs.
When aggregating the individual-level data from census tract to county, we observe that, compared to 2018, every county saw a higher count of provider-level claims in 2019 across almost all MDCs. The percent changes range from approximately -11% to 25%.
NOTE: The magnitude and associated percent change for each entry become available when hovering over this image.
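The magnitude and percent change between two years can be computed as in the sketch below. It assumes the earlier year is the baseline, and the counts are hypothetical values chosen for easy arithmetic, not figures from HCCD.

```python
def percent_change(baseline, current):
    """Percent change from baseline to current, relative to baseline."""
    if baseline == 0:
        return None  # undefined when the baseline year has no claims
    return 100.0 * (current - baseline) / baseline


# Hypothetical provider-level claim counts keyed by (county, MDC, year).
counts = {
    ("Kent", 8, 2018): 200, ("Kent", 8, 2019): 220,
    ("Sussex", 8, 2018): 400, ("Sussex", 8, 2019): 380,
}

# Magnitude (absolute difference) and percent change, 2018 to 2019.
changes = {}
for (county, mdc, year), base in counts.items():
    if year != 2018:
        continue
    current = counts[(county, mdc, 2019)]
    changes[(county, mdc)] = (current - base, percent_change(base, current))
```

Returning `None` for a zero baseline keeps tracts or categories with no claims in the earlier year from producing a division error or a misleading value.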
The COVID-19 pandemic has no official starting period. What we do know is that, at various levels of health care, the pandemic influenced the level, quality and specialty of care patients could access or seek. An unofficial start date to the pandemic period is March 23rd, 2020.
So, when comparing 2018 to 2019, the change in claim submissions at the county level was positive for almost all MDCs. This was not the case when comparing growth from 2019 to 2020, relative to 2020.
Almost uniformly across MDCs, the count of provider-level claims decreased. The change across census tracts between 2019 and 2020 ranges from -36 to 44 percent.
Health claim submissions related to MDCs 25 and 20, Human Immunodeficiency Virus (HIV) Infection and Alcohol/Drug Use or Induced Mental Disorders respectively, are the only aggregate MDCs for which every county observed an increase in associated submissions. Whether these claims result from new patients seeking clinical care (not previously sought) or from existing patients seeking more instances of care is not clear from these results.
However, when considering both MDCs:
From the previous visual, we can identify several patterns of occurrence across MDCs. Doing so, we are able to evaluate the variance within census tracts in each county. Loosely, here we evaluate select MDCs and, for each, the distribution of percent differences across its census tracts. A value close to 0 (zero) represents a small percent change for a tract.
This allows us to conduct select exploratory analysis of census tracts driving a given county’s differential percentage change. Consequently, this analytic approach creates a space for data driven targeted support. As mentioned:
We are able to identify census tracts that represent cases on the extremes, but also observe that there are differences in direction and variance for each one of the MDCs.
Because of its unique characteristics, the 2020 data for HCCD will represent a great level of variation associated with pandemic period operations and logistics. As such, it is not surprising that MDCs experience different patterns of occurrence by census tract.
Each MDC carries its own pattern of change throughout the years, especially at the county level.
Next, our goal is to separately explore a single MDC in the data. This will allow the exploration of the historic and spatial patterns of MDC occurrence.
For this example, we will explore the most frequent MDC claim type, MDC 8, Musculoskeletal System and Connective Tissue.
Focusing on just a single pattern, we are able to visualize the total count of a single MDC, MDC 8 for the year 2019. In order to showcase the advantages of aggregate and spatial level data merged to census data we focus on the year 2019 to incorporate additional open source information to the HCCD data.
In general, claim submissions related to Musculoskeletal risk distribute relative to the population size, across Delaware. Census tracts in Kent are more frequently a deeper color, representing a higher per capita rate of MDC 8 provider claims.
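A per capita rate of this kind can be sketched as follows. The scaling to claims per 1,000 residents, the tract GEOIDs and all counts are assumptions for illustration. The report does not state which scaling its map uses.

```python
# Hypothetical 2019 MDC 8 claim counts and population by tract GEOID.
mdc8_claims = {"10001040100": 150, "10003000200": 90, "10005050100": 0}
population = {"10001040100": 3000, "10003000200": 4500, "10005050100": 2000}


def rate_per_1000(claims, pop):
    """Claims per 1,000 residents; None when population is missing or zero."""
    if not pop:
        return None
    return 1000.0 * claims / pop


tract_rates = {
    geoid: rate_per_1000(n, population.get(geoid))
    for geoid, n in mdc8_claims.items()
}
```

Normalizing by population is what lets a small tract with many claims appear as a deeper color than a large tract with the same raw count.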
After exploring how these data aggregate and distribute spatially, it is possible to relate MDC count data to Census data to explore underlying mechanisms that influence health.
We expected MDC 8, Musculoskeletal System and Connective Tissue to be positively associated with age, where the census tracts representing the older chronological age groups, observe higher MDC 8 counts than the younger ones.
Our expectation for the positive linear relationship between age and average provider filing does not hold true. However, interestingly, we note an inverted parabola-like relationship for Sussex county alone.
In this particular example, the total count of Musculoskeletal claims is aggregated, normalized to the population size of the insured, and correlated with median age across census tracts. We note that, for Sussex county specifically, tracts with older populations show a lower average health claim rate than tracts with younger populations.
Lastly, the goal is to leverage HCCD data, census tract and MDC labels in order to understand if, where and for what purposes telehealth services were provided in DE. It is important to reiterate that the address labeling each claim corresponds to the member’s address, not the provider’s address. As such, we assume that telehealth visits occur at the patient’s address.
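One way to check a linear age relationship per county is a Pearson correlation between tract median age and the normalized claim rate. The sketch below uses placeholder counties and made-up values. It is not the report's analysis, only an illustration of why a linear measure can miss an inverted parabola-like pattern.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (None if undefined)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation in one of the variables
    return sxy / sqrt(sxx * syy)


# Hypothetical tract-level values: (median age, MDC 8 claims per 1,000 insured).
tracts = {
    "County A": [(30, 10.0), (40, 20.0), (50, 30.0)],  # rising with age
    "County B": [(30, 10.0), (40, 20.0), (50, 10.0)],  # inverted-U shape
}

r_by_county = {
    county: pearson([age for age, _ in rows], [rate for _, rate in rows])
    for county, rows in tracts.items()
}
```

In this toy data, the inverted-U county has a correlation near zero even though rate clearly depends on age, which is why a scatter plot or a curved fit is worth inspecting alongside the coefficient.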
During the COVID-19 pandemic period, providers were burdened at various levels, including operational and capacity logistics. Learning which services were provided via telemedicine during the pandemic can help us further understand:
Prior to discussing individual MDC instances of telehealth during 2020, it is worth noting that there is a large increase in telehealth provider services during 2020:
Compared to the previous years, the percent of telehealth provider claims proportional to all provider health claims is highest in New Castle County in 2020.
In previous years, the total count of claims for New Castle is the highest among the counties, but the percent of telehealth claims among all claims is the lowest.
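The telehealth share of all provider claims per county and year can be sketched as below. The `"telehealth"` place-of-service label and the rows are hypothetical. How HCCD actually encodes the telemedicine place of service is not specified here.

```python
from collections import defaultdict

# Hypothetical claims as (county, year, place_of_service) tuples.
claims = (
    [("New Castle", 2019, "office")] * 3
    + [("New Castle", 2019, "telehealth")] * 1
    + [("New Castle", 2020, "office")] * 2
    + [("New Castle", 2020, "telehealth")] * 2
    + [("Kent", 2020, "office")] * 2
)

totals = defaultdict(int)
telehealth = defaultdict(int)
for county, year, place in claims:
    totals[(county, year)] += 1
    if place == "telehealth":
        telehealth[(county, year)] += 1

# Percent of all provider claims flagged as telehealth, per county and year.
telehealth_share = {
    key: 100.0 * telehealth[key] / total for key, total in totals.items()
}
```

Reporting the share alongside the raw count matters because a county can have the highest total claims and still the lowest telehealth percentage.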
Next we showcase the difference in telehealth provider claim submissions between 2019 and 2020, relative to 2019.
A number of MDCs show a wide percent change. An interesting finding here is that the count of MDC 20, Alcohol/Drug Use or Induced Mental Disorders, increased differentially in New Castle County.
As part of HCCD, each member’s gender is reported. With this we can explore the gender bias in Alcohol related health claim submissions over the years.
First we explore the magnitude difference across reported genders:
For 2020, compared to 2019, the count of claims increased several fold for each county. In Sussex in particular, females represent a larger proportion of the Alcohol/Drug Use claims sample population in 2020.
In general, the proportion of females to males that take up telehealth services varies greatly across MDCs and counties. The F/M ratios across census tracts distribute as follows:
Excluding MDCs 12 and 13 (male and female reproductive system MDCs, respectively), we can observe that the ratio of F/M members varies greatly by year, MDC and county.
However, for Kent and Sussex counties, the representation of women differs among telehealth services related to Blood and Blood Forming Organs and Immunological Disorders.
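The F/M ratio per county and MDC can be sketched as follows. The rows and the `"F"`/`"M"` coding are illustrative assumptions, and MDCs 12 and 13 are excluded as described above.

```python
from collections import Counter

SEX_SPECIFIC_MDCS = {12, 13}  # male and female reproductive system MDCs

# Hypothetical telehealth claims as (county, mdc, reported_gender) tuples.
rows = (
    [("Sussex", 20, "F")] * 6 + [("Sussex", 20, "M")] * 3
    + [("Kent", 20, "F")] * 2 + [("Kent", 20, "M")] * 4
    + [("Kent", 13, "F")] * 5
)

counts = Counter(r for r in rows if r[1] not in SEX_SPECIFIC_MDCS)


def fm_ratio(county, mdc):
    """Female-to-male claim ratio; None when there are no male claims."""
    females = counts[(county, mdc, "F")]
    males = counts[(county, mdc, "M")]
    return females / males if males else None


ratios = {
    (county, mdc): fm_ratio(county, mdc)
    for county, mdc in {(c, m) for c, m, _ in counts}
}
```

A ratio above 1 means more claims from female members than male members for that county and MDC. The `None` case avoids dividing by zero where no male claims exist.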
In this summary, we capture the labeling, aggregation and reporting at the census tract and Major Diagnostic Category (MDC) levels in Delaware. We present a count-based (not estimate-based) reporting scheme which showcases the MDCs that govern and influence each census tract. We show a method that allows us to understand the group-level factors for which services are being paid out and, by extension, sought out.
Leveraging this knowledge, we present various insights drawn from the data, but the application of this method can extend far beyond the telemedicine case example presented here. With the ability to link HCCD to population-level data, health claims information can be applied to learn more about the socio-demographic factors driving the health services being provided (and needed) here in Delaware.