What Agile Business Intelligence Really Means: Page 2
Improving Business Intelligence Agility
Remember, the point of this exercise is to ask how we would change each stage of the information-handling process if we were to focus on improving its ability to change rapidly in response to unforeseen changes. In the case of data entry, the unforeseen changes are new types of data. Often, the signal of a new type of data is an increase in data entry errors, since the inputter is attempting to shoehorn new information into an old data-entry format. So the first rule of agile BI becomes:
There's more information in bad data than good data.
More concretely, agility theory would suggest that instead of chasing the elusive goal of zero data-entry defects, we examine bad data for patterns via analytics, and use the insights to drive constant upgrades to the global metadata repository (e.g., via a master data management tool) so that it reflects the new data struggling to be entered. We should also feed those insights into new BI-product development, so that user interfaces match what users actually need to enter and thereby semi-automatically reduce data-entry errors.
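As a rough illustration of what "examining bad data for patterns" could look like, here is a minimal Python sketch. The rejected records, the field names, and the threshold of three are all invented for the example; the only idea taken from the text is that a value which keeps getting rejected may be a new type of data rather than a typo.

```python
from collections import Counter

# Hypothetical rejected data-entry records: (field, raw value the user typed).
rejected = [
    ("payment_method", "mobile wallet"),
    ("payment_method", "Mobile Wallet"),
    ("payment_method", "mobile wallet "),
    ("payment_method", "chq"),
    ("country", "Untied States"),
]

def recurring_rejects(records, min_count=3):
    """Return (field, normalized value) pairs rejected at least min_count times."""
    counts = Counter((field, value.strip().lower()) for field, value in records)
    return [key for key, n in counts.items() if n >= min_count]

# Values that recur are candidates for a metadata upgrade, not just cleanup.
candidates = recurring_rejects(rejected)
print(candidates)
```

In this made-up data, "mobile wallet" recurs often enough to be flagged as a candidate new value for the payment method field, while the one-off misspellings are left as ordinary errors.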
Data consolidation is about noting other copies of a datum (or other related records). It has long been noted that BI-type data is usually kept in "data archipelagoes," line-of-business or local data stores that do not adequately connect with a central data warehouse to allow consistency checks. What is much more rarely noted is that this situation is becoming worse, not better — suggesting that we are not only chasing a moving target, but one that we will never catch.
Here's where agility theory comes into play. Traditional IT calls for "discipline": making each newly discovered data source toe the line and feed into an ETL front-end to a more central data warehouse or cluster of data marts. Agility theory, on the contrary, emphasizes auto-discovery of each new data source, and automated upgrades of metadata repositories to accommodate the new information.
ETL tools typically do not offer this type of auto-discovery; MDM and data virtualization tools do. But remember: choose the tool not for its quality achievements, but for the features that allow you to adapt quickly to new challenges.
The distinction between data consolidation and data aggregation is a bit narrow but important: data consolidation is about making sure that the new datum is consistent with previous data (if the system says you're a teacher of student A and the new datum says you're not, it's inconsistent); data aggregation is about giving everyone across the organization who needs the information access to it. Practically speaking, that means querying across data stores — because, as noted above, we are getting further and further from an architecture in which all data is in a central data warehouse's store. And in order to query across data stores, you need a frequently upgraded "global metadata repository" to tell you how to combine data across data stores.
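To make the distinction concrete, here is a small Python sketch under stated assumptions. The two stores, their field names, and the `metadata` mapping are hypothetical stand-ins for real data stores and a real global metadata repository; no actual tool's API is implied.

```python
# Two hypothetical data stores that name the same things differently.
stores = {
    "hr": [{"emp_id": 7, "teaches": "student_a"}],
    "school": [{"staff_no": 8, "pupil": "student_b"}],
}

# Hypothetical "global metadata repository": per-store field -> shared field name.
metadata = {
    "hr": {"emp_id": "teacher_id", "teaches": "student"},
    "school": {"staff_no": "teacher_id", "pupil": "student"},
}

def query_all(stores, metadata):
    """Aggregation: return rows from every store, renamed to the shared fields."""
    rows = []
    for name, store_rows in stores.items():
        mapping = metadata[name]
        for row in store_rows:
            rows.append({mapping[field]: value for field, value in row.items()})
    return rows

def contradicts(rows, teacher_id, student, teaches):
    """Consolidation: True if a new datum denies a relationship already on record."""
    on_record = any(
        r["teacher_id"] == teacher_id and r["student"] == student for r in rows
    )
    return on_record and not teaches

rows = query_all(stores, metadata)
print(rows)
print(contradicts(rows, 7, "student_a", teaches=False))
```

The point of the design is that adding a new data store means adding one entry to the metadata mapping, not reloading everything into a central warehouse.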
Agility theory flips the traditional approach to BI data aggregation on its head. In traditional BI, the aim is to funnel all possible data to the data warehouse, and then maximize performance of the data warehouse on well-defined, often-repeated queries, with occasional resort to data visualization in order to handle outlying cases. In real agile BI, the focus is on handling ever more new data types, accessed wherever they are, and proactively searching them out. Data virtualization tools, today, are the best way to achieve this: they often offer auto-discovery of new data sources and data types, and they optimize performance of queries across data stores, leaving each underlying database to optimize itself. Over the long term, this delivers "good enough" performance over a much broader set of data. So the second rule of agile BI becomes:
By focusing on seeking out new data rather than improving performance on existing data, we end up improving overall performance.
Note that we have suddenly moved from "data" to "information". This is because we have added enough context to the raw data (its connection to other data in the enterprise's global aggregated data store) that it can yield insights — it is potentially useful. Now we have to start the process of making sure it is really used effectively.
Arbitrarily, the first step is to figure out where the information should be sent — who can use it. In traditional BI, first the data is placed in the data warehouse, then it is sent to well-defined individuals — reports to CFOs and LOB executives, analyses hand-delivered by data miners to their business-side customers. The weight of tradition, expressed in customized report software as well as analytics customized for the planning/budgeting/forecasting cycle, hampers changing the targets of information delivery as more agile organizations change their processes.
In agile BI, information goes first to those whose agility has the greatest positive effect on the organization: those involved in new-strategy development and new-product development. In other words, information and alerts about today's products and customer/Web trends show up at the innovator's doorstep today, not at the end of the quarter when reports are generated.
The other key difference in agile BI is that routing (or availability) is not as pre-defined. The point of BI for the masses plus ad-hoc querying is that particular end users may find connections between data — including outside data like social media — that internally-focused, request-driven data miners don't. And that can often mean a slight reduction in security. Balancing security and the organization's "openness" to and from the outside is always a delicate task, but agility theory suggests that more openness can lead to less downside risk. This is because more openness can mean better knowledge of threats and more rapid adaptation to these threats. That leads to the next rule:
Increased focus on exchange of data with the outside rather than defensive security can decrease downside risks (and increase upside risks!).
The next step is to get the information to the targeted person. The big emphasis over the last few years has been on decreasing the time between information aggregation and information delivery. However, the organization does not necessarily fare best with the minimum time between data input and information delivery. Studies have shown that without historical context, users can "overshoot" in adjusting to the latest data. Moreover, speeding up information delivery in BI is a bit like engaging in an arms race: if the competitor matches your speedup, or you leapfrog each other, no benefit from the speedup may be apparent. And finally, benefits and costs are "asymmetrical": if you take one day to carry out a process while your competitor takes five days, speeding up to half a day often yields little additional benefit; but if you take six days against that same five-day competitor, speeding up to three days usually yields a much larger benefit.
One approach suggested by agility theory sounds a little strange: focus on speed of change of information delivery. In other words, instead of focusing on delivering the same information faster, focus on the ability to alter the deliverable information in one's data stores as rapidly as possible as requirements or processes change. Now follow the logic of that: each new datum alters the information that should be delivered to the user. So to get more agile information delivery, you eliminate as much of the information-handling process as possible for as much of the data as possible. To put it another way, you redesign the process so that as much as possible, data is converted to information and routed at the point of input, in real time.
That's the real reason that an event-processing architecture can be agile. It intercepts data being input into the organization and does pre-processing, including direct alerting. The most sophisticated tools add enough context to expand the amount of data that can be so treated. Meanwhile, deeper analyses wait their turn. And upgrading the event processing engine can be a quicker way to handle new data types flowing into the organization's data stores.
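The shape of such an architecture can be sketched in a few lines of Python. This is a toy, not a description of any event-processing product: the event types, thresholds, and the `RULES` table are assumptions made up for illustration.

```python
from collections import deque

alerts = []               # sent straight to a person, at the point of input
analysis_queue = deque()  # everything also waits here for deeper analysis

# Hypothetical alert rules, keyed by event type.
RULES = {
    "order": lambda e: e["amount"] > 10_000,
    "sensor": lambda e: e["temp_c"] > 90,
}

def handle(event):
    """Pre-process one incoming event: alert immediately if a rule fires."""
    rule = RULES.get(event["type"])
    if rule is not None and rule(event):
        alerts.append(event)
    analysis_queue.append(event)

for event in [
    {"type": "order", "amount": 25_000},
    {"type": "order", "amount": 40},
    {"type": "tweet", "text": "new product idea"},  # no rule yet: queued only
]:
    handle(event)

# Handling a new data type is one new rule, not a new warehouse schema.
RULES["tweet"] = lambda e: "product" in e["text"]
handle({"type": "tweet", "text": "love this product"})

print(len(alerts), len(analysis_queue))
```

Note how the unknown "tweet" event is not rejected: it simply waits its turn in the queue until someone upgrades the rules.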
Discussions of the value of BI tend to assume that its largest benefit is in improving the decision-making of its users. Interestingly, agility theory suggests that this is not necessarily so. Greater benefits can be achieved if the information delivered improves a business process (e.g., makes book-to-bill or customer relationships more effective), or if the information improves new product development (NPD).
That, in turn, means that real agile BI should focus on analysis tools that make an operational process or NPD better, rather than on better dashboards for the CEO. It should also emphasize BI that detects and suggests adaptations to unexpected change, rather than analytics that improves the forecast of what will happen if everything goes as expected. This means more emphasis on ad-hoc and exploratory information analysis tools, and more "sensitivity analysis" that sees the effect of changes in assumptions or outcomes. One interesting idea is that of "dials" — e.g., sliders that allow us to vary revenues and costs, as ad expenditures or sales change from their forecast levels.
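A "dial" of this kind amounts to recomputing an outcome as an assumption moves away from its forecast. The Python sketch below uses invented figures and a deliberately naive profit formula (revenue minus cost); it is meant only to show the mechanics of turning a dial, not a realistic planning model.

```python
def profit(base_revenue, base_cost, revenue_pct=0, cost_pct=0):
    """Profit after moving revenue and cost by whole-percent 'dial' settings."""
    revenue = base_revenue * (100 + revenue_pct) / 100
    cost = base_cost * (100 + cost_pct) / 100
    return revenue - cost

# Hypothetical forecast: revenue of 1,000 and cost of 800.
FORECAST_REVENUE, FORECAST_COST = 1_000, 800

# Turn the revenue dial from -10% to +10% and watch the outcome move.
scenarios = {
    pct: profit(FORECAST_REVENUE, FORECAST_COST, revenue_pct=pct)
    for pct in (-10, 0, 10)
}
for pct, value in scenarios.items():
    print(f"revenue {pct:+d}% -> profit {value:.0f}")
```

Even in this toy case, a 10 percent shortfall in revenue halves profit, which is the sort of exposure to unexpected change that a forecast-only report does not reveal.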