By Earl F Glynn | Franklin Center
In his recent article, “For good government groups, what did all that access earn?”, my associate Mark Lisheron looked at the transparency community’s view of open and accountable government under Barack Obama’s presidency.
Since the departure of Norman Eisen as President Obama’s special counsel on ethics and government reform in 2010, the White House visitor logs show a linear downward trend in access by good government groups as measured by White House visits.
This article provides technical details of how we searched for good government visitors in the White House visitor logs.
The intent of this technical article is to:
- explain an approach to quantify good government organization access to the White House based on official visitor logs.
- describe a defined and repeatable process for searching White House visitor logs for groups of visitors.
- enable journalists and bloggers to analyze White House visitor data more effectively.
This article looks at these topics:
- Updated White House Visitor Data
- Summary of Official White House Visits
- Defining “Good Government” Organizations
- “Fuzzy Match” of Names
- Extracting White House Visits and Meeting Information
- Statistics about Good Government White House Meetings
- “Top 10” Lists for White House Visits by Good Government Organizations
Updated White House Visitor Data
On Friday the White House announced “More than 3.1 Million Records Released” in the most recent monthly update of visitor data.
I encountered no problems this month in applying a defined, repeatable process to clean up the data. The White House addressed a recent problem with 600K duplicate records last month, but I still removed about 8,500 duplicates from the March release.
R scripts to process data (see online directory of files):
- WhiteHouse-Visitors-Process.R: Clean-up and standardize data. See online cleaned and enhanced files used in match process below.
- WhiteHouse-Visitors-Groups.R: Segregate data into “tourists” and “official visitors.”
Summary of March 29, 2013 release with visits through Dec. 31, 2012 (see Background information on White House visitor data):
- 3.13 million visitor records
- 2.10 million tourists (see Two-thirds of White House visitors records are tourists)
- 1.03 million official visitors
Summary of Official White House Visits
The script WhiteHouse-Visitors-Groups.R creates two kinds of summary files.
One of the files is named WhiteHouse-POTUSandStaff-Counts-Name-Year.csv and summarizes the number of POTUS and Staff White House visits for each of the nearly 522,000 unique official visitor names.
The intent of this file is to provide a quick way to see if someone has made an official visit to the White House, and if so, what year(s), using a single Excel file.
Let’s consider four visit records for people named “Gary Bass.” See Table 1.
Table 1. Summary of White House visits by persons named “Gary Bass.”
This summary shows a “Gary D. Bass” visited POTUS 4 times and White House Staff 48 times over the last four years. The breakdown by year for the 52 visits is 11 in 2009, 19 in 2010, 16 in 2011 and 6 in 2012.
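The kind of name-by-year tally described above can be sketched briefly. This is a Python illustration only (the scripts in this article are R), and the records, names and field layout below are made up for the example rather than taken from the White House files.

```python
from collections import Counter

# Made-up visit records for illustration: (visitor name, visitee category, year).
visits = [
    ("Pat Q Example", "Staff", 2009),
    ("Pat Q Example", "Staff", 2010),
    ("Pat Q Example", "POTUS", 2010),
    ("Pat Example", "Staff", 2010),
]


def summarize(records):
    """Count visits per (name, category) and per (name, year)."""
    by_category = Counter((name, category) for name, category, _ in records)
    by_year = Counter((name, year) for name, _, year in records)
    return by_category, by_year


by_category, by_year = summarize(visits)
for (name, year), n in sorted(by_year.items()):
    print(name, year, n)
```

Keeping each spelling of a name as its own row, as here, leaves the question of whether two spellings are the same person to a later step.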
Is “Gary Bass” without a middle initial a different person, or one of the other three?
There is not sufficient information in the released White House records to answer this definitively, so some assumptions must be made.
In this case (discussed more below), I will assume the “Gary Bass” records, those without an initial, are most likely associated with Gary D. Bass.
Assuming the name without a middle initial belongs to one of the other three people, one might guess from the visit statistics that 18 of the 19 visits “belong” to “Gary D Bass”: (52/55) × 19 ≈ 18. This guess might have about 5% error.
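The arithmetic behind that guess can be written out. In this Python sketch the numbers 52, 55 and 19 come from the text; reading 55 as the combined visits of the three names with a middle initial is my assumption, as are the variable names.

```python
visits_gary_d_bass = 52    # visits recorded under "Gary D. Bass" (from the text)
visits_with_initial = 55   # denominator used in the text; assumed to be all three names with an initial
visits_no_initial = 19     # visits recorded under "Gary Bass" with no initial (from the text)

# Allocate the ambiguous visits in proportion to each person's known share.
share = visits_gary_d_bass / visits_with_initial
estimate = round(share * visits_no_initial)

# The share that would belong to the other two people under the same assumption.
residual = 1 - share

print(f"share={share:.3f} estimate={estimate} residual={residual:.1%}")
```

The residual share, 3/55, is a little over 5%, which is where the rough error figure comes from.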
Two “chronology” files created by the WhiteHouse-Visitors-Groups.R script will be needed in the discussion below:
The chronology files are sorted first by White House visitee and then by appointment time and location. The full White House visit chronology for any visitee can be extracted quickly from these files.
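A chronology of this kind is a sort followed by a filter. The Python sketch below assumes hypothetical column names (`visitee`, `start`, `location`, `visitor`) and made-up rows; the real files may be laid out differently.

```python
from operator import itemgetter

# Made-up rows; ISO-style timestamps sort correctly as plain strings.
rows = [
    {"visitee": "Staffer B", "start": "2012-03-01 14:00", "location": "Room 2", "visitor": "Pat Example"},
    {"visitee": "Staffer A", "start": "2012-05-10 09:00", "location": "Room 1", "visitor": "Lee Sample"},
    {"visitee": "Staffer A", "start": "2012-01-15 11:30", "location": "Room 3", "visitor": "Pat Example"},
]


def chronology(records):
    """Sort by visitee first, then appointment start time, then location."""
    return sorted(records, key=itemgetter("visitee", "start", "location"))


def visits_to(records, visitee):
    """Extract the full chronology for a single visitee."""
    return [r for r in chronology(records) if r["visitee"] == visitee]


for row in visits_to(rows, "Staffer A"):
    print(row["start"], row["location"], row["visitor"])
```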
Defining “Good Government” Organizations
We identified board and staff members of more than 20 organizations that focus on government transparency and accountability based on online web pages and/or recent IRS 990 forms. See Table 2.
The list was not all-inclusive but was intended to provide broad representation of good government groups active at the national level.
Table 2. Number of names by “good government” organization matched against White House Visitor records.
|Good Government Organization||Search Names|
|Center for Effective Government||
|Center for Public Integrity||
|Citizens for Responsibility and Ethics in Washington||
|Electronic Frontier Foundation||
|Government Accountability Project||
|National Freedom of Information Coalition||
|National Security Archive||
|Open Government Partnership (Tides Foundation)||
|Project on Government Oversight||
|Public Citizen Foundation||
|Public Citizen Inc.||
|Reporters Committee for Freedom of the Press||
|US Public Interest Research Group||
See file Good-Gov-Groups.xlsx to review all 322 names matched against White House visitor data. Please send me suggestions about other groups/names to include in future analysis.
Because of duplicates among the groups above, the total number of names to consider was only 310.
“Fuzzy Match” of Names
Likely spelling errors have been observed in White House visitor names, either in first or last names. In a few cases, first and last names have been observed to be reversed. Middle names are sometimes present and sometimes absent.
An exact match of names would reject many likely matches, so I used a statistical approach for matching.
The Levenshtein distance is the minimum number of single-character edits (insertions, deletions, substitutions) needed to change one string into another.
The R function levenshteinSim in the RecordLinkage package provides a way to measure the similarity of strings.
From the online help in R:
levenshteinSim is a similarity function based on the Levenshtein distance, calculated by 1 – d(str1,str2) / max(A,B), where d is the Levenshtein distance function and A and B are the lengths of the strings.
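The formula quoted above can be sketched in a few lines. This is a plain Python illustration of the same definition, not the RecordLinkage implementation, and the function names `levenshtein` and `similarity` are made up for the sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1 - d(a, b) / max(len(a), len(b)), following the formula quoted above."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(round(similarity("Gary D Bass", "Gary Bass"), 3))
```

For “Gary D Bass” against “Gary Bass” the distance is 2 (drop the initial and one space) and the longer string has 11 characters, so the similarity is 1 − 2/11 ≈ 0.818.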
Each of the good government representative names was considered in turn by the 1-WhiteHouse-Lookup-Open-Gov.R script, and scored with levenshteinSim against the 522,000 White House official visitors.
Tentative matches were accepted for later manual review if the match score was at or above the 99.999% quantile of scores, or down to a levenshteinSim score of 0.750.
With about 522,000 candidate names, the top 0.001% is roughly five names, so this process typically extracted approximately the five best matches while still allowing inspection of a larger number of similar names.
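One reading of that selection rule is: keep a candidate if its score reaches the top-quantile cutoff, or if it is at least 0.750. The Python sketch below follows that reading, which is an assumption on my part and not a copy of the R lookup script. The names `tentative_matches`, `top_fraction` and `floor` are hypothetical, as are the two visitor names that do not appear in this article.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def tentative_matches(target, visitor_names, top_fraction=0.00001, floor=0.750):
    """Return (score, name) pairs worth a manual look, best first."""
    scored = sorted(((similarity(target.lower(), n.lower()), n) for n in visitor_names),
                    reverse=True)
    if not scored:
        return []
    # Score of the last name inside the top fraction (at least one name).
    k = max(1, math.ceil(len(scored) * top_fraction))
    cutoff = scored[k - 1][0]
    return [(s, n) for s, n in scored if s >= cutoff or s >= floor]


# "Gary Baas" and "Mary Smith" are invented visitor names for illustration.
names = ["Mary Smith", "Gary Baas", "Gary Bass", "Gary D Bass"]
for score, name in tentative_matches("Gary D Bass", names):
    print(f"{score:.3f}  {name}")
```

Lowercasing both strings before scoring is a design choice of this sketch; whether the original script normalized case is not stated here.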
The lookup script produced two summary files:
- Good-Government-WhiteHouse-Match.csv: A one-line summary of matches for each of the 310 names.
- Good-Government-WhiteHouse-2nd-Looks.csv (Excel version): The “best” match and “good” alternative matches to consider.
Let’s consider the levenshteinSim results for “Gary D Bass”. The “2nd-Looks” file shows the results in Table 3.
Table 3. Best Levenshtein Similarity Scores Among White House Visitors for Name “Gary D Bass.”
The R script assigns the “Keep” column to 1 for a perfect match and 0 for a score of 0.750. Other scores were left blank for manual inspection.
Based on the assumptions about the name “Gary Bass” from above, I selected “Gary Bass” to also likely be Gary D. Bass. I manually marked this name as “1” in the Keep column (shown in red in the Excel file).
The Keep=1 names were then selected from White House visit data for further analysis.
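The Keep logic can be sketched as a small three-way flag. This Python version is one reading of the description above (1 for a perfect score, 0 at the 0.750 floor, blank otherwise); the function name `keep_flag` and the use of `None` for a blank cell are assumptions.

```python
from typing import Optional


def keep_flag(score: float, floor: float = 0.750) -> Optional[int]:
    """Pre-fill the Keep column; None stands for a blank cell left for manual review."""
    if score == 1.0:
        return 1      # perfect match: keep automatically
    if score <= floor:
        return 0      # at the acceptance floor: reject automatically
    return None       # in between: a person decides


# Only rows whose flag ends up as 1, set by the script or by hand, are carried forward.
flags = {score: keep_flag(score) for score in (1.0, 0.818, 0.750)}
print(flags)
```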
Extracting White House Visits and Meeting Information
Let’s first consider the visit summary and details files:
- Good-Gov-POTUS-Visits-Summary-With-Organizations.csv (Excel version)
- Good-Gov-POTUS-Visits-Details.csv (Excel version)
- Good-Gov-Staff-Visits-Summary-With-Organizations.csv (Excel version)
- Good-Gov-Staff-Visits-Details.csv (Excel version)
The files are sorted by visitee, appointment start date and location to group meetings/events together. Variable banding in Excel makes review of meetings easier.
The summary files show one line per visit with a “COUNT” of the number of good government visitors at the meeting to compare with the “Total_People” at the meeting. Normally one should study the summary file first to identify meetings with several good government people at the same meeting.
The details files show all people attending the meetings with a NOTE column to mark the good government representatives at the meeting.
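The relationship between the two kinds of files can be sketched as a group-and-count. In this Python illustration the rows are made up, the column names other than COUNT and Total_People are hypothetical, and Total_People is computed as the number of attendee rows per meeting, which is an assumption about how that field relates to the details.

```python
from collections import defaultdict

# Made-up detail rows: one row per attendee, with a note marking good government people.
details = [
    {"visitee": "POTUS", "start": "2012-01-05 10:00", "location": "Room 1",
     "visitor": "Pat Example", "note": "good gov"},
    {"visitee": "POTUS", "start": "2012-01-05 10:00", "location": "Room 1",
     "visitor": "Lee Sample", "note": ""},
    {"visitee": "Staffer A", "start": "2012-02-01 09:00", "location": "Room 2",
     "visitor": "Pat Example", "note": "good gov"},
]


def summarize_meetings(rows):
    """Collapse attendee rows into one summary line per meeting."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["visitee"], row["start"], row["location"])].append(row)
    summary = []
    for key in sorted(groups):
        attendees = groups[key]
        summary.append({
            "visitee": key[0], "start": key[1], "location": key[2],
            "COUNT": sum(1 for r in attendees if r["note"]),  # marked attendees
            "Total_People": len(attendees),                   # everyone at the meeting
        })
    return summary


for line in summarize_meetings(details):
    print(line)
```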
Example 1. POTUS meetings with good government representatives.
Scan the POTUS summary file looking for meetings with a relatively low “Total_People” attending and a relatively high “COUNT” of good government people attending.
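That scan can also be done with a filter instead of by eye. The Python sketch below uses made-up summary lines, and the thresholds `max_people` and `min_count` are hypothetical values chosen for illustration, not cutoffs used in this analysis.

```python
# Made-up summary lines: COUNT good government attendees out of Total_People.
summary = [
    {"meeting": "A", "COUNT": 4, "Total_People": 6},
    {"meeting": "B", "COUNT": 1, "Total_People": 250},
    {"meeting": "C", "COUNT": 2, "Total_People": 12},
]


def candidates(lines, max_people=20, min_count=2):
    """Small meetings with several good government attendees, highest share first."""
    picked = [m for m in lines
              if m["Total_People"] <= max_people and m["COUNT"] >= min_count]
    return sorted(picked, key=lambda m: m["COUNT"] / m["Total_People"], reverse=True)


for m in candidates(summary):
    print(m["meeting"], m["COUNT"], m["Total_People"])
```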
One good government POTUS meeting can be seen on March 16, 2011 in the summary file, with four good government representatives. The description was “transparency award.”
The same four and another person attended a “transparency award ceremony in the oval” meeting on March 28, 2011, as shown in Table 4.
Table 4. Summary of Good Government Visitors to POTUS on March 28, 2011.
The details file shows the same five people were at both meetings, but one was not marked in the first meeting because of a name spelling variation.
Example 2. Last, small good-government POTUS meeting of 2012.
Most POTUS meetings have hundreds of attendees, so the small ones might be the most interesting.
The last, small good-government POTUS meeting of 2012 had one good government representative in a meeting of 10 people.
The summary file shows the meeting in Table 5a.
Table 5a. Summary File Information for Dec. 4, 2012 POTUS meeting.
The details file in Table 5b tells who else was at the meeting.
Table 5b. Details File Information for Dec. 4, 2012 POTUS meeting.
This POTUS meeting appears to have been with a group of left-leaning journalists: Garance Franke-Ruta (The Atlantic Online), Arianna Huffington (Huffington Post), Rachel Maddow (MSNBC), Joshua Marshall (TalkingPointsMemo.com), Zerlina Maxwell (New York Daily News/Ebony Magazine), Markos Moulitsas (Daily KOS), Lawrence O’Donnell (MSNBC), Greg Sargent (Washington Post “The Plum Line”), Edward Schultz (MSNBC), Al Sharpton (MSNBC).
On Dec. 4, 2012 the Weekly Standard asked if an “MSNBC love fest” was going on at the White House.
Statistics about Good Government White House Meetings
Counts of White House meetings in Table 6 show that the number peaked in 2010 and has declined since then.
Table 6. Summary of White House Visits by Good Government Groups by Calendar Year, 2009-2012.
Anyone attending a small White House meeting should have a better opportunity to be influential than someone in a meeting with hundreds or thousands of attendees.
Table 7 shows that POTUS meetings often have a large number of attendees.
Table 7. Statistics about Meeting Size by White House Visitee Category.
The median size for POTUS meetings is huge (220 people) compared to the median size of a meeting with a White House staffer (4 people).
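A median by visitee category takes only a few lines. The meeting sizes in this Python sketch are invented to show the calculation and are not the values behind Table 7.

```python
from statistics import mean, median

# Invented meeting sizes per visitee category, for illustration only.
meeting_sizes = {
    "POTUS": [10, 200, 350],
    "Staff": [2, 3, 6, 40],
}

for category, sizes in meeting_sizes.items():
    # The median is less sensitive than the mean to a few very large events.
    print(category, "median:", median(sizes), "mean:", round(mean(sizes), 1))
```

The median is the more useful summary here because a handful of very large events can pull the mean far above the size of a typical meeting.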
If we only consider the smaller meetings with White House staffers, Chart 1 shows a continuing decline in visits by good government groups.
Chart 1. White House Good Government Visitors to Staff by Month, 2009-2012.
Note: The small number of meetings in early 2009 shown in Chart 1 is because the White House only released information for a small number of specific requests for information during much of 2009. Judicial Watch is still trying to force the White House to release the records from early 2009 as part of a federal lawsuit.
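A monthly series like the one in Chart 1 can be produced by bucketing visit dates by year and month. The dates in this Python sketch are made up, and `visits_by_month` is a hypothetical helper name.

```python
from collections import Counter
from datetime import date

# Made-up visit dates for illustration.
visit_dates = [date(2010, 1, 5), date(2010, 1, 20), date(2010, 2, 3), date(2011, 2, 3)]


def visits_by_month(dates):
    """Count visits per calendar month, keyed as YYYY-MM so the keys sort in time order."""
    return Counter(d.strftime("%Y-%m") for d in dates)


for month, n in sorted(visits_by_month(visit_dates).items()):
    print(month, n)
```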
“Top 10” Lists for White House Visits by Good Government Organizations
Table 8. Top 10 Visitors to POTUS/VPOTUS/FLOTUS, 2009-2012.
Table 9. Top 10 Visitors to White House Staff, 2009-2012.
Table 10. Top 10 White House Staff Visitees, 2009-2012.
The unknown visitee shown as a question mark above is likely due to limitations of reports about visits at the Vice President’s Residence. It’s unclear if these visits were to one person, or possibly many.
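Top-N lists like these are a frequency count followed by a cutoff. The Python sketch below uses made-up names, with "?" standing in for an unknown visitee; it only illustrates the ranking step.

```python
from collections import Counter

# Made-up visitee names, one entry per visit; "?" stands for an unknown visitee.
visitees = ["Staffer A", "Staffer B", "Staffer A", "?", "Staffer A", "Staffer B"]


def top_n(names, n=10):
    """Return the n most frequent names with their visit counts, most frequent first."""
    return Counter(names).most_common(n)


for name, count in top_n(visitees):
    print(count, name)
```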
- Fewer Records, More Visitors in Release by White House, WatchdogLabs.org, Feb. 25, 2013.
- 600K duplicate records in White House visitor logs?, WatchdogLabs.org, Feb. 25, 2013.
- Cleaning and standardizing White House visitor data, WatchdogLabs.org, Dec. 14, 2012.
- White House Worker and Visitor Entry System Data Fields, WatchdogLabs.org, Dec. 12, 2012.
- Background information on White House visitor data, WatchdogLabs.org, Dec. 10, 2012.
Contact Info: Twitter: @WatchdogLabs, Facebook: http://www.facebook.com/WatchdogLabs