Data Discovery: Worthless without governance?

Peter Baxter, MD EMEA at Yellowfin, discusses the relative merits of Data Discovery. This article was originally published by CBR.

Over recent years there has been rapid growth in the adoption of self-service Data Discovery within the enterprise: namely, the use of desktop-based Data Discovery tools to dive into data and provide rapid, on-demand visualisations and analysis. Such an approach has been likened to the use of Excel, both in terms of its potential ubiquity and because business analysts have relative autonomy to analyse data simply, without the need to rely on the IT department.

The growth of self-service Data Discovery is hard to ignore, but can this approach meet the Business Intelligence (BI) needs of most organisations? Just like the spreadsheets before them, desktop-based Data Discovery tools come with significant limitations. These limitations mean we risk regressing to a world of insecure and ungoverned data silos, multiple versions of the truth and untrustworthy data analysis. And where does this leave us? Ultimately, with contradictory views of organisational performance and bad decision-making.

Understanding the limitations of Data Discovery

To appreciate the risk of relying on desktop-based Data Discovery to meet demand for self-service access to reporting and analytics in the enterprise, we need to understand the evolution of traditional BI solutions. Such technologies started out on the desktop, just like many of today’s Data Discovery tools. Analysts would investigate data in their own silo and then present their findings back to the business. It wasn’t until enterprises applied significant pressure that traditional BI vendors developed server-based solutions, which incorporated significant governance capabilities. Now, however, it seems that many organisations have been quick to forget the logic that underpinned server-based BI platforms in the first place.

Today, with the popularity of desktop Data Discovery tools, analysts are once again working with data in their own world, with IT having no visibility or control over what’s happening – just like the days of Excel. We’ve all experienced one form or another of ‘spreadsheet hell’ over the years. One version of the spreadsheet says ‘X’, another says ‘Y’, and there is rarely an easy way to tell which is right. In part this problem stems from a lack of governance and data ‘lineage’: without a record of how the data has changed over time, and what calculations or amendments have been made, it is impossible for everyone to have a uniform and trustworthy view of the data. Data Discovery tools suffer from a similar shortage of metadata, and the onus is on individuals, working outside a single platform that IT can govern, to manipulate and pull together data from different sources. History has already proved that humans make errors, that errors creep into analysis and that decisions are made based on that analysis. The case for governance should be clear.

In attempting to solve the need for greater BI accessibility outside the confines of the IT department, today’s desktop Data Discovery tools have created a significantly bigger problem – a lack of IT governance. According to Doug Laney, research vice president at Gartner: "As a result of the limited governance of self-service BI implementations, we see few examples of those that are materially successful — other than in satisfying end-user urges for data access." In fact, Gartner has also proclaimed that "through 2016, less than 10% of self-service BI initiatives will be governed sufficiently to prevent inconsistencies that adversely affect the business."

Similarly, working from a desktop means that security is a significant challenge. IT has no control over data access, security and freshness. Ensuring that the right people are performing the best analysis possible, on the most up-to-date information available, therefore becomes impossible. Consequently, the resultant misinformation is distributed to business users, who then unknowingly make poor decisions.

Providing the freshest insights to business users is, after all, what BI teams aim to achieve. One example is the creation of a dashboard that monitors data over time so business performance can be easily tracked. In a desktop-based Data Discovery scenario, it’s difficult to enable the real-time access required to support that outcome. Often you need to refresh a dashboard manually each time you want an updated view of performance, as the tool doesn’t connect to the database itself. This situation is a bit like driving a car but needing to refresh the speedometer or petrol gauge – it doesn’t support the rapid decision-making you require as a driver.

Enterprises need a governed environment

Whilst desktop-based Data Discovery tools offer a flexible approach that can superficially meet the self-service demand of business users and analysts, as well as help to rapidly visualise a static dataset, their use should be restricted to suitable scenarios – just like Excel. For enterprise deployments, it’s essential that analysts are working within a secure and governed environment. And this, of course, all comes back to how the BI environment is architected. For those of us who can remember desktop-based legacy BI, or have suffered from spreadsheet hell, the need to avoid making the same mistakes again couldn’t be clearer.
