
Research Limitations with “New” Death Master File

By Earl F Glynn | Franklin Center

In late 2011 the Social Security Administration made changes to the Death Master File to delete 4.2 million records and to remove geographic information.

At the same time the SSA announced it would report fewer deaths in the future.  About 36% of the deaths previously reported in the DMF will no longer appear.

These changes to the DMF impose analysis limitations and may make the data nearly worthless within a few years. 

Reporters who use the DMF to look for dead people still registered to vote will find their searches less and less effective in future years.

Analysis will be most affected in states such as Florida, where large numbers of people move and later die.


Selected statements about the changes:

4.2 million state records removed; fewer records in future

Social Security Administration Fact Sheet:

“We began disclosing certain state records on the Public DMF in 2002. After review of the Public DMF, we have determined that we can no longer disclose protected State records. Section 205(r) of the Social Security Act prohibits SSA from disclosing State death records we receive through our contracts with the States, except in limited circumstances. Therefore, we cannot legally share those State records on the Public DMF.”

NTIS Important Notice: Change in Public Death Master File Records:

“Effective November 1, 2011, the DMF data that we receive from [Social Security Administration] will no longer contain protected state death records. …”

“The historical Public DMF contains 89 million records. SSA will remove approximately 4.2 million records from this file and add about 1 million fewer records annually.”

Testimony of SSA Inspector General Patrick P. O’Carroll, Jr. to Congress, Feb. 2, 2012:

“The file contains about 85 million records, and it adds about 1.3 million records each year. …”

“SSA receives about 2.5 million death reports each year from many sources, including family members and funeral homes.”

Simple math from the SSA IG’s statement suggests 48% of death reports may be missing in the DMF now.
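The arithmetic behind that 48% figure, and behind the drop from 89 million to about 85 million records, can be checked in a few lines. This is only a sketch using the numbers quoted above; the variable names are my own and are not part of any SSA or NTIS specification.

```python
# Figures quoted from the NTIS notice and the SSA IG testimony above.
historical_records = 89_000_000   # NTIS: historical Public DMF
removed_records = 4_200_000       # NTIS: protected state records removed
reports_per_year = 2_500_000      # IG: death reports SSA receives each year
dmf_adds_per_year = 1_300_000     # IG: records the DMF adds each year

# Records left in the file after the purge.
remaining_records = historical_records - removed_records

# Share of annual death reports that do not reach the public DMF.
missing_share = 1 - dmf_adds_per_year / reports_per_year

print(f"Records left after purge: {remaining_records:,}")
print(f"Share of death reports missing from the DMF: {missing_share:.0%}")
```

The first result, 84.8 million, is consistent with the "about 85 million" in the testimony.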

Geographic Data Removed

Death Master File Extract Output Record Specification, Nov. 2011:

“Revised November 1, 2011 to remove the State/Country Code of Residence, Zip code – Last Residence, and Zip code – Lump Sum Payment fields as a result of no longer publishing protected state records.”


If one attempts to find matches in the 85-million record DMF for the whole country, many names will produce many false matches.

One approach to reduce the number of false matches is to extract a state subset using geographic clues that may be in the file for a dead person.

Matching against a state subset is easier and faster, and the geographic information in the subset is useful for verification of matches.

How to find a state DMF subset in 2010

In 2010 there were four fields that provided geographic information about a dead person:

  • SSN3:  the first 3 digits of the SSN, which reflects the state where the original SSN application was made.  For many this is where they were born, but not always.
  • State code:  a two-digit code maintained by SSA.
  • LumpZIP:  the five-digit post office ZIP code where a lump sum distribution payment was sent by SSA.
  • LastZIP:  the last known ZIP code maintained by SSA.

Let’s use the State of Missouri for an example of how to extract a state DMF subset:

1.  SSN3:  486 – 500 for Missouri

2.  State Code:  26 for Missouri

3.  LumpZIP, LastZIP:  63001 (Allenton) to 65899 (Springfield)

The range(s) of ZIP codes for a state can be found in several places.  One online resource from neighborhoodlink.com shows a list of ZIP codes for a state.

A free online file of ZIP codes can be filtered and sorted in Excel to find the ranges of ZIP codes for a particular state.

The USPS ZIP lookup can be used to verify the limits of the ranges.

I did not attempt to validate particular ZIP codes since only about half of all 5-digit numbers are valid ZIP codes.  I’ll accept a ZIP for a state subset as long as it is within the correct range, whether or not the ZIP is technically valid.
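The range-only test described above might look like the following. This is a minimal sketch, assuming ZIPs are held as 5-character strings; the function name and constant are hypothetical and only illustrate the idea that a ZIP is accepted when it falls inside the Missouri range, whether or not the post office actually uses it.

```python
# Missouri ZIP range from the example above (inclusive).
MISSOURI_ZIP_RANGE = ("63001", "65899")

def in_zip_range(zip_code, zip_range=MISSOURI_ZIP_RANGE):
    """Return True if a 5-digit ZIP string falls inside the inclusive range.

    This is a range check only: a ZIP that does not exist but lies
    inside the range is still accepted.
    """
    if not zip_code or len(zip_code) != 5:
        return False
    if not (zip_code.isascii() and zip_code.isdigit()):
        return False
    low, high = zip_range
    # Fixed-width digit strings sort the same way as the numbers they spell.
    return low <= zip_code <= high

print(in_zip_range("65201"))   # inside the Missouri range
print(in_zip_range("66101"))   # above the Missouri range
print(in_zip_range(""))        # blank field
```

Comparing fixed-width strings rather than integers keeps leading zeros intact for states whose ZIPs begin with 0.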

We’ll want all DMF records that match any of the four conditions.  A relatively simple SQL statement in MySQL tells us there are 2,736,860 records in a Missouri 2010 DMF:

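-- Count 2010 records that match any of the four Missouri conditions.
SELECT COUNT(*) FROM dmf2010
WHERE
  ((SSN > "486000000") AND (SSN < "501000000")) OR  -- SSN3 486 to 500
  ((LUMPZIP > "63000") AND (LUMPZIP < "65900")) OR  -- lump sum ZIP 63001 to 65899
  ((LASTZIP > "63000") AND (LASTZIP < "65900")) OR  -- last residence ZIP 63001 to 65899
  (STATE = "26");                                   -- SSA state code for Missouri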

Some additional SQL queries tell us how many records each condition contributed.  Many death records hit on multiple conditions:

2,304,026 with proper SSN3 range for Missouri
792,313 with proper STATE code 26 for Missouri
1,729,023 for LASTZIP with Missouri ZIP
249,396 with LUMPZIP in the Missouri ZIP range
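The four counts above add up to 5,074,758, far more than the 2,736,860 records in the subset, because a record that hits several conditions is counted once per condition. The sketch below shows the same effect on a tiny in-memory SQLite table. The four rows are made-up toy data, not real DMF records, and the assumption that missing fields are stored as empty strings is mine.

```python
import sqlite3

# Hypothetical toy rows (SSN, STATE, LASTZIP, LUMPZIP); not real DMF data.
rows = [
    ("490112222", "26", "65201", ""),       # SSN3, state code and LASTZIP hit
    ("123456789", "26", "63101", "63101"),  # out-of-range SSN3, other three hit
    ("495001111", "", "", ""),              # SSN3 only
    ("222334444", "", "33101", ""),         # no Missouri condition at all
]

# The same four conditions used in the Missouri query.
conditions = {
    "SSN3": "SSN > '486000000' AND SSN < '501000000'",
    "STATE": "STATE = '26'",
    "LASTZIP": "LASTZIP > '63000' AND LASTZIP < '65900'",
    "LUMPZIP": "LUMPZIP > '63000' AND LUMPZIP < '65900'",
}

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE dmf2010 (SSN TEXT, STATE TEXT, LASTZIP TEXT, LUMPZIP TEXT)"
)
con.executemany("INSERT INTO dmf2010 VALUES (?, ?, ?, ?)", rows)

def count_where(clause):
    """Count rows of the toy table that satisfy a WHERE clause."""
    query = f"SELECT COUNT(*) FROM dmf2010 WHERE {clause}"
    return con.execute(query).fetchone()[0]

# One count per condition, then one count for "any condition".
per_condition = {name: count_where(c) for name, c in conditions.items()}
union_count = count_where(" OR ".join(f"({c})" for c in conditions.values()))

print(per_condition)
print("sum of per-condition counts:", sum(per_condition.values()))
print("records matching any condition:", union_count)
```

With these toy rows the per-condition counts sum to 7 while only 3 records match any condition, which is the same pattern as in the Missouri numbers.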

The high number of SSN3 matches in the total means most people in the Missouri subset applied for a Social Security Number in Missouri, which is not surprising.

With the removal of the geographic fields in late 2011, what happens to a Missouri 2012 subset?


How to find a state DMF subset in 2012

With the removal of the STATE, LASTZIP and LUMPZIP fields, the SQL statement to extract the Missouri DMF from an updated database is simple:

-- Only the SSN3 condition (486 to 500) is left to identify Missouri records.
SELECT COUNT(*) FROM dmf
WHERE (SSN > "486000000") AND (SSN < "501000000");

With DMF updates through Aug. 1, 2012, the 2012 Missouri DMF has only 2,335,383 dead people.

The 2010 and 2012 DMF subsets can be compared with a Venn diagram (created from the MIT online Venn Diagram Generator).


Missouri DMF Comparison 2010 and 2012

Venn diagram showing overlap of Missouri Death Master File subsets from 2010 and 2012.

The SSA deletes erroneous entries in monthly DMF updates, but most of the loss between 2010 and 2012 (the blue portion) reflects the Missouri portion of the 4.2 million records that were dropped from the DMF nationally in Nov. 2011.

The diagram shows an addition of 59,557 new death records by Aug. 2012 beyond what was in the file in early 2010.  These new death records have no geographic information except for the SSN3.

The diagram shows an overlap of 2,275,826 death records from the 2010 and 2012 files.  With this match, geographic information from the 2010 file can be used to fill in the new 2012 file.
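The three regions of the Missouri Venn diagram follow directly from the two subset sizes and the overlap. The sketch below only reworks the counts already given in this article; the variable names are mine.

```python
subset_2010 = 2_736_860   # Missouri subset from the 2010 file (four conditions)
subset_2012 = 2_335_383   # Missouri subset through Aug. 1, 2012 (SSN3 only)
overlap = 2_275_826       # records found in both subsets

only_2010 = subset_2010 - overlap   # blue region: in 2010 but not in 2012
only_2012 = subset_2012 - overlap   # new records added since early 2010

print(f"2010 only: {only_2010:,}")   # 461,034
print(f"2012 only: {only_2012:,}")   # 59,557
print(f"Share of the 2010 subset no longer found: {only_2010 / subset_2010:.1%}")
```

The 2012-only figure reproduces the 59,557 new records read off the diagram, which is a useful cross-check on the three counts.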

Two possibilities explain most of the portion in blue:

  • People who died in Missouri but did not apply for their SSN in Missouri, or
  • Dead people with Missouri connections who were among the 4.2 million purged records.

Obtaining a national DMF current through Oct. 2011, the month before the purge, might help with connecting new and old death information.

As geographic information for those not born in Missouri is connected with current death records, the size of the blue area may shrink considerably.
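One way to connect old geographic information with current records is to carry the 2010 fields forward wherever an SSN appears in both files. The sketch below is only an illustration of that idea: the records are made-up toy data, the function name is hypothetical, and matching on SSN alone is an assumption that a real analysis would want to confirm with name and date fields.

```python
# Hypothetical toy records keyed by SSN; field names follow the 2010 layout.
dmf_2010 = {
    "490112222": {"STATE": "26", "LASTZIP": "65201", "LUMPZIP": ""},
    "123456789": {"STATE": "26", "LASTZIP": "63101", "LUMPZIP": "63101"},
}

# Current file: SSNs only, since the geographic fields were removed.
dmf_2012 = ["490112222", "123456789", "495001111"]

GEO_FIELDS = ("STATE", "LASTZIP", "LUMPZIP")

def backfill_geography(ssns, old_records):
    """Attach 2010 geographic fields to current records where the SSN matches.

    Records with no 2010 counterpart get empty strings for every field.
    """
    merged = {}
    for ssn in ssns:
        old = old_records.get(ssn, {})
        merged[ssn] = {field: old.get(field, "") for field in GEO_FIELDS}
    return merged

merged = backfill_geography(dmf_2012, dmf_2010)
for ssn, geo in merged.items():
    print(ssn, geo)
```

In this toy case the second record has an out-of-state SSN3 but a Missouri state code and ZIPs, so it is the kind of record that would move out of the blue area once the old fields are attached.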


Texas DMF Comparison 2010 and 2012

Venn diagram showing overlap of Texas Death Master File subsets from 2010 and 2012.

The 2010 extraction of a Texas DMF was complicated by multiple ranges of ZIPs.  The ZIP code range of 75001 (Addison) to 79999 (El Paso) was found to be lacking some valid Texas ZIPs.

After studying several sources, I learned about these additional TX ZIPs:  73301, 73344 (Austin) and 88510-88595 (also El Paso).
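A state with several ZIP ranges can be handled by testing a ZIP against a list of ranges rather than a single pair of limits. This is a minimal sketch, assuming ZIPs are 5-character strings; the list below contains only the Texas ranges named above, and the function name is hypothetical.

```python
# Texas ZIP ranges named above, as inclusive (low, high) pairs.
TEXAS_ZIP_RANGES = [
    ("75001", "79999"),   # main Texas range
    ("73301", "73301"),   # Austin
    ("73344", "73344"),   # Austin
    ("88510", "88595"),   # El Paso
]

def in_any_zip_range(zip_code, ranges=TEXAS_ZIP_RANGES):
    """Return True if a 5-digit ZIP string falls inside any inclusive range."""
    if not zip_code or len(zip_code) != 5 or not zip_code.isdigit():
        return False
    return any(low <= zip_code <= high for low, high in ranges)

print(in_any_zip_range("77002"))   # inside the main range
print(in_any_zip_range("73301"))   # one of the single Austin ZIPs
print(in_any_zip_range("73302"))   # next to an Austin ZIP, but not listed
```

Single ZIPs such as 73301 are written as one-element ranges so that every entry can go through the same comparison.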

Overall, the Venn diagram for Texas is similar to Missouri.

The size of the blue area is slightly larger in Texas than in Missouri, and the size of the middle overlap is slightly smaller.  This is likely because a larger proportion of the people who die in Texas moved there from other states than is the case in Missouri.


Florida DMF Comparison 2010 and 2012

Venn diagram showing overlap of Florida Death Master File subsets from 2010 and 2012.

The Venn diagram shows how the loss of geographic information in creating the 2012 Florida DMF will greatly impact any dead voter analysis in Florida.

With such a huge migration of people to Florida from other states, who then die in Florida, the 2012 Florida DMF has become quite small.

With only 2012 Florida DMF information, the number of dead to match against voter records is less than half of what it was in 2010.


SSN Randomization

On June 25, 2011 the Social Security Administration implemented a “randomization” process to help protect the integrity of the SSN.

The Social Security Administration removed the geographical significance of the first three digits of the SSN with these new “randomized” SSNs.

Creating a “state subset” of the DMF for those receiving SSNs after June 2011 will not be possible.  Matching names against the full DMF for those receiving these randomized SSNs will result in a huge false positive problem.

Matching voter registration lists against the DMF will become ineffective in a few decades, once no geographic information remains to constrain the matching and reduce false positives.


efg

Contact Info: Twitter: @WatchdogLabs, Facebook: http://www.facebook.com/WatchdogLabs

  • The Public Death Master File and the Social Security Death Index do not list the place of death. These databases do, however, list the zip code of the address of record, that is, the zip code of the most recent address that the Social Security Administration had on file while the person was still living. For example, my Aunt Helen died in Ellis Hospital in Schenectady, Schenectady County, New York. The zip code for her address of record in the databases, however, is 12019, the zip code for Ballston Lake, Saratoga County, New York.

    Wali

    October 10, 2012

  • As described in the article above, the ZIPs were removed about a year ago and are no longer in the file.

    efg

    October 10, 2012

  • I needed to send them a notarized letter as to why I wanted her death certificate. I also had to indicate my relationship to the deceased, as I was not close enough to get one. Only her immediate family would be allowed to get it. I also noticed it was already on the death index; however, her daughter (who never had a close relationship with her) had given the wrong birth date on her certificate. She lied about dates and names for years. I see many alternate names on the internet for her, some legal, some not. She was a person who would steal your ID and had over 20 that the family knew about. Banks tell us that people are going into the banks and closing accounts based on these certificates. I don’t believe that, but was told by one bank that this has happened. Use hers Pamela (O’NEIL)-IVIE-HEREFORD-KAIN she might have used yours.

    Saurabh

    October 10, 2012

  • Unfortunately, a lot of people can only tell if a loved one is deceased by the SSDI or state death records. Once I saw a police missing persons report and found the person in the SSDI-she had died in a different state. One man couldn’t be found by his family-luckily, California has online records available up to 1997 and he died in CA before that. He was from Canada but didn’t have a SS number.

    Steve

    January 5, 2014
