Making Government Data Less of a Disaster

Drew Robb

Updated · Jun 19, 2015

Anyone who works in an accounts department or owns a small business knows about the administrative practices of the IRS. Countless letters are generated, penalties are seemingly assessed arbitrarily, and tax returns and other documents get lost. The agency is apparently unable to capture and consolidate correct data on each of its “customers.” Of course, having 300 million people to deal with might be considered a mitigating circumstance.

Now let's move it beyond one federal agency. Imagine trying to get all of those agencies on the same page regarding how data is stored, which fields are to be entered into which specific databases, and how these systems talk to each other to enable integrated reporting and tracking. It sounds nearly impossible. Yet progress is being made by the Data Transparency Coalition.

“The information challenges of the federal government are far worse than those of anyone else,” said Hudson Hollister, CEO of the Data Transparency Coalition. “And the scale is far larger than any enterprise, as it now constitutes 21 percent of our economy.”

Where does one start when attempting to bring order to this chaos? The coalition decided to focus on spending. The basic idea is to be able to understand government as one enterprise by bringing about data transparency, starting with spending.

At the moment, spending data is reported in eight or more ways, which makes it impossible to track. Different systems measure, report and count spending in fragmented ways, so no reliable intelligence can be extracted from them. Efforts to glean insight from this data require dedicated teams of people working with printouts, spreadsheets and calculators.

Hollister gave the example of the Solyndra loan guarantee scandal. Had data transparency and standardized digitization been in place, the Securities and Exchange Commission (SEC) and the Department of Energy could have cross-referenced their data, which would have made it clear that Solyndra should not have been awarded the loan guarantees that eventually cost taxpayers $535 million.

But getting the SEC files in order is a monumental task. Almost all data comes in as text, Hollister said. Public companies file their various reports by sending in text documents, then the data has to be checked and entered manually. Other agencies face similar challenges.

Legislating Better Data

The name of the game, therefore, is to establish data standards to make everything consistent across government, beginning with spending. The effort began with the passage of the American Recovery and Reinvestment Act (often called the stimulus package) in 2009. Data standards put in place then made it possible to track spending on stimulus items, find shady deals and track down suspect contractors.

This success led to the Digital Accountability and Transparency Act (DATA Act) being signed into law by President Obama. This requires the Department of the Treasury and the White House Office of Management and Budget to transform U.S. federal spending from disconnected documents into open, standardized data, and to publish that data online. As part of this, government-wide data standards have been established for federal spending.
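To make the idea of standardized data fields concrete, here is a minimal Python sketch of mapping differently shaped agency records onto one common schema. The agency names, source field names and target schema are all hypothetical, invented for illustration; they are not the actual data elements defined under the DATA Act.

```python
from decimal import Decimal

# Hypothetical per-agency mappings: source field name -> standard field name.
FIELD_MAPS = {
    "agency_a": {"vendor": "recipient", "amt": "amount_usd", "fy": "fiscal_year"},
    "agency_b": {
        "Payee Name": "recipient",
        "Obligation ($)": "amount_usd",
        "Year": "fiscal_year",
    },
}


def standardize(agency, record):
    """Rename an agency's fields to the common schema and normalize value types."""
    out = {"agency": agency}
    for source_field, standard_field in FIELD_MAPS[agency].items():
        out[standard_field] = record[source_field]
    # Collapse whitespace and case so the same recipient matches across agencies.
    out["recipient"] = " ".join(str(out["recipient"]).split()).upper()
    # Strip currency formatting and use Decimal to avoid float rounding on money.
    amount_text = str(out["amount_usd"]).replace("$", "").replace(",", "")
    out["amount_usd"] = Decimal(amount_text)
    out["fiscal_year"] = int(out["fiscal_year"])
    return out


def total_by_recipient(records):
    """Sum standardized amounts per recipient across all agencies."""
    totals = {}
    for rec in records:
        name = rec["recipient"]
        totals[name] = totals.get(name, Decimal("0")) + rec["amount_usd"]
    return totals


# Two invented records that describe the same recipient in different shapes.
raw = [
    ("agency_a", {"vendor": "Acme  Corp", "amt": "1,250.50", "fy": "2015"}),
    ("agency_b", {"Payee Name": "acme corp", "Obligation ($)": "$749.50", "Year": 2015}),
]
standardized = [standardize(agency, record) for agency, record in raw]
totals = total_by_recipient(standardized)
```

Once every record shares the same field names and types, cross-agency questions such as total spending per recipient become a simple aggregation rather than a manual reconciliation exercise.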

“That program is being extended with all agencies to use new standardized data fields and formats by 2017,” said Hollister.

While he's realistic about the likelihood of meeting that deadline, he said progress is being made.

Local Government Leads the Way

While the federal government struggles to scratch the surface of data standards, state and local agencies are making greater strides. The City of Charlotte, N.C., for example, has been successful in setting data standards for its agencies using iWay software from Information Builders.

“Pervasive information distribution allows all areas of the organization to gain insight from your data,” said Gerald Cohen, CEO of Information Builders. “For example, the City of Charlotte has established one centralized address system across city agencies.”

Without a centralized repository for managing address information, duplicate or erroneous address records and misspelled names create difficulties for various city departments. The city's Address Management Program uses master data management (MDM) to cleanse address and location data.

Twyla McDermott, IT program manager for the City of Charlotte, said that a master set of address records was vital in order to provide accurate location and mailing addresses for all agencies. The city also wanted to standardize its process for address creation and updating.

The system is designed to automate data profiling, as well as cleansing, matching, merging and data enrichment. Duplicates, discrepancies and exceptions are spotted and handled. Further, the city has added location data so the “golden address” has associated physical coordinates (for geo-coding). This ties in with local U.S. Postal Service data as well as the city's tax data files.
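The cleansing, matching, merging and enrichment steps can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how iWay or the city's Address Management Program works: the abbreviation table, record fields, sample addresses and coordinates are all made up.

```python
import re

# Hypothetical abbreviation table; a real system would use postal standards.
ABBREVIATIONS = {
    "street": "ST", "st": "ST",
    "avenue": "AVE", "ave": "AVE",
    "road": "RD", "rd": "RD",
    "east": "E", "e": "E",
    "west": "W", "w": "W",
}


def cleanse(address):
    """Strip punctuation, normalize case and standardize common abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", address).lower().split()
    return " ".join(ABBREVIATIONS.get(token, token.upper()) for token in tokens)


def build_golden_records(records):
    """Match records on their cleansed address and merge them into one entry each."""
    golden = {}
    for rec in records:
        key = cleanse(rec["address"])
        entry = golden.setdefault(
            key, {"address": key, "lat": None, "lon": None, "sources": []}
        )
        entry["sources"].append(rec["source"])
        # Enrichment: keep the first coordinates any source supplies.
        if entry["lat"] is None and rec.get("lat") is not None:
            entry["lat"], entry["lon"] = rec["lat"], rec["lon"]
    return golden


def validate(address, golden):
    """Return the golden record for an address as typed, or None if unknown."""
    return golden.get(cleanse(address))


# Invented sample data from three hypothetical departmental systems.
records = [
    {"address": "123 East Main Street", "source": "permits"},
    {"address": "123 E. Main St", "source": "utilities", "lat": 35.0, "lon": -80.0},
    {"address": "45 Oak Rd", "source": "tax"},
]
golden = build_golden_records(records)
match = validate("123 e main street", golden)
```

Here the first two records collapse into a single golden address that carries coordinates and lists both source systems, while an address typed on a form can be checked against the master set before it is accepted.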

McDermott said this solved a major problem: citizens putting wrong addresses on official forms and applications, or employees entering those addresses incorrectly. Now the city has a validation process to check whether an address is correct and to resolve any conflicts.

The city and surrounding Mecklenburg County worked together to map 70 business processes across 15 city agencies and four county agencies concerning land development, building inspections, water turn-on, capital improvements and other functions involving addresses.

Data Consolidation and Dashboards

A similar project was carried out by the Michigan State Police Department (MSPD). It needed to consolidate criminal justice data from various agencies, which sometimes arrived as Crystal Reports output, sometimes as Microsoft Excel spreadsheets, and sometimes in a host of other formats.

The MSPD uses the Oracle Business Suite in conjunction with Information Builders' Law Enforcement Analytics (LEA) package to create a dashboard to track crime and enforcement statistics statewide or by district, to represent crime and crashes geospatially, and to manage metrics for trooper activity and the solving of crime.

The dashboard helped commanders increase the average amount of time officers spend on active patrol by several percent. Other metrics being monitored include traffic crash fatalities, crime closure rates and homicide rates.
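As a rough illustration of the kind of metric such a dashboard reports, the Python sketch below computes crime closure rates by district and statewide from consolidated case records. The record layout and sample cases are hypothetical; nothing here reflects the actual LEA package or MSPD data.

```python
from collections import defaultdict


def closure_rates(cases):
    """Return the fraction of closed cases for each district."""
    counts = defaultdict(lambda: [0, 0])  # district -> [closed, total]
    for case in cases:
        tally = counts[case["district"]]
        tally[1] += 1
        if case["closed"]:
            tally[0] += 1
    return {district: closed / total for district, (closed, total) in counts.items()}


# Invented sample cases, already consolidated into one common layout.
cases = [
    {"district": 1, "closed": True},
    {"district": 1, "closed": False},
    {"district": 2, "closed": True},
    {"district": 2, "closed": True},
]
rates = closure_rates(cases)
statewide = sum(case["closed"] for case in cases) / len(cases)
```

The calculation itself is trivial; the hard part the article describes is getting every agency's data into one consistent layout so that the numbers can be trusted.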

“We're now confident in the numbers we use to keep the public safe, and to report to the feds and our legislature,” said Katie Bower, assistant division director of the Michigan State Police Criminal Justice Information Center.

Drew Robb is a freelance writer specializing in technology and engineering. Currently living in Florida, he is originally from Scotland, where he received a degree in geology and geography from the University of Strathclyde. He is the author of Server Disk Management in a Windows Environment (CRC Press).
