Data Governance, a key element for Digital Transformation



For the last few years, the corporate environment, used to speaking and hearing plenty of financial and management terms, has witnessed a flood of technical jargon such as Analytics, Artificial Intelligence, Data Lake, Deep Learning and chatbots, among others.

Digital Transformation has definitely changed how companies approach and value subjects that, until recently, were considered high-risk investments and associated more with science fiction than with the corporate world.

Why did this happen, you may ask?

Because a long line of success cases has already proven that Artificial Intelligence and Data Science are not just the latest buzzwords: they deliver valuable results.

Even “late adopters” need to adapt to this new reality or be left behind.

Nevertheless, after all the excitement generated by this new AI world, many questions remain open about how to enter this hyped field successfully and sustainably, making sure we are not boarding a sinking canoe in the middle of a Data Lake.


At this point, we start to encounter even more jargon, such as MVP (Minimum Viable Product) and POC (Proof of Concept), which were originally designed to enable continuous product development and to test-drive new ideas before mass adoption.

Too often, though, they end up being used for half-finished, short-term projects with no real foundation, which will never become sustainable products.

However, when used appropriately, these methods let us measure the true quality of the path we are following toward our goals and fail fast or, at least, learn fast, in line with the well-known Agile methodology.

Then many other terms are born, such as “squads”, “hubs”, “icebox” and “labs”, among others.

In this article we would like to point out a term that is mostly forgotten and avoided and that, even though it is considered the ugly duckling of this new AI world, needs to be kept in mind and highlighted: Data Governance.

Data Governance isn’t a new subject. It has been around for quite a while, but in the new Big Data era, neglecting it can lead to a Big Problem.

Here are some quick tips on how to deal with Data Governance and some key points to keep in mind (and maybe on a Post-it note at your desk).

Shall we talk about security?

When thinking about data collection, it is important to always think about cyber security, because every data pipeline connecting your data storage to your data sources (IoT devices, external databases, …) may require extensive firewall permissions. These permissions may initially speed up your project, making it more “Agile”, but they also make it more vulnerable to cyber attacks, both internal and external.


Working together with your cyber security team and building DMZs (Demilitarized Zones) is a must for any successful data-related project.

Indeed, the same concern applies not only to collection, but also to subsequent consumption through dashboards and predictive and prescriptive analytics panels.

Identity management, even through manual processes, and control over user and service account access are critical to avoid inadvertently exposing information to audiences that, even when authenticated, should not be authorized to see it.
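To make the difference between "authenticated" and "authorized" concrete, here is a minimal sketch in Python. It assumes the account has already proven its identity somewhere else; the `Account` class, the role names and the dataset names are all hypothetical and only illustrate a deny-by-default check, not any particular identity product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Account:
    """A user or service account that has already been authenticated."""
    name: str
    roles: frozenset = field(default_factory=frozenset)


# Hypothetical mapping: which roles may read which dataset.
DATASET_READERS = {
    "plant_sensor_readings": {"maintenance_analyst", "data_scientist"},
    "employee_records": {"hr_analyst"},
}


def can_read(account: Account, dataset: str) -> bool:
    """Authorization check: deny by default, allow only on an explicit role match."""
    allowed_roles = DATASET_READERS.get(dataset, set())
    return bool(account.roles & allowed_roles)


# An authenticated service account used by a dashboard (hypothetical).
dashboard_service = Account("dashboard-svc", frozenset({"data_scientist"}))

print(can_read(dashboard_service, "plant_sensor_readings"))
print(can_read(dashboard_service, "employee_records"))
```

The dashboard account is a valid, authenticated identity in both calls, yet only the first dataset is open to it. Unknown datasets are refused as well, because nothing is allowed unless a role explicitly grants it.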

In addition, discussions about GDPR (General Data Protection Regulation), which specifically addresses the protection of personal data, are also extremely relevant to managing AI. Since users must give prior consent for the processing of their personal data, end-user systems that eventually become data sources for advanced analytics need to be adapted.
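One way such an adaptation might look is sketched below, assuming the end-user system stores a consent flag next to each record. The `CustomerRecord` fields and the `analytics_consent` flag are hypothetical names chosen for illustration; a real implementation would follow your legal team's guidance.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    customer_id: str
    email: str
    monthly_spend: float
    analytics_consent: bool  # hypothetical flag captured by the end-user system


def records_for_analytics(records):
    """Keep only records whose owner consented, passing on just the needed fields."""
    return [
        {"customer_id": r.customer_id, "monthly_spend": r.monthly_spend}
        for r in records
        if r.analytics_consent
    ]


sample = [
    CustomerRecord("c1", "a@example.com", 120.0, True),
    CustomerRecord("c2", "b@example.com", 80.0, False),
]

print(records_for_analytics(sample))
```

Only the record for `c1` reaches the analytics side, and its email address is left behind because the analysis does not need it.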

How to store my data?

Instead of following the herd and adopting every kind of data-lake-like technology, understand your use cases. Starting with “why” rather than “what” is an important factor in choosing the data repository technology for your data governance strategy.

Using a Data Lake for use cases that require rigid schemas, or a traditional database for storing unstructured data, leads to inefficiency and rising costs. Just “believing” in some technology and mentioning its jargon twice a day is no guarantee of success.

It is important that your working group has a clear view of the use cases so that it can identify the rules of use.

In addition, it is always worth studying your own legacy systems and reusing them, because it is often a matter not of revolution but of evolution in data management.

As an example, data historians usually work well for industrial time series, so the question becomes one of complementing and integrating them rather than replacing them.

Another very relevant point for repositories is the data dictionary or data catalog. Without one, your data lake can, for example, become a big data swamp.
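As a rough illustration of what a catalog buys you, the sketch below keeps a tiny in-memory catalog and lists the datasets in storage that nobody has documented. Everything here (`CatalogEntry`, `register`, `undocumented`, the dataset names) is hypothetical; real catalogs are usually dedicated tools, but the idea is the same.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    dataset: str
    owner: str
    description: str
    columns: dict  # column name -> short meaning, including units


catalog = {}


def register(entry: CatalogEntry) -> None:
    """Add a dataset to the catalog, refusing entries that lack the basics."""
    if not entry.owner or not entry.description or not entry.columns:
        raise ValueError(f"{entry.dataset}: owner, description and columns are required")
    catalog[entry.dataset] = entry


def undocumented(datasets_in_lake):
    """Return datasets that exist in storage but have no catalog entry."""
    return sorted(set(datasets_in_lake) - set(catalog))


register(CatalogEntry(
    dataset="pump_vibration",
    owner="maintenance-team",
    description="Vibration readings from pump sensors",
    columns={"sensor_id": "sensor identifier", "rms_mm_s": "RMS velocity in mm/s"},
))

print(undocumented(["pump_vibration", "tmp_export_2"]))
```

Here `tmp_export_2` is flagged because it sits in storage without an owner or a description, which is exactly how a lake starts turning into a swamp.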


Contracts

By contract we mean the agreement that governs the relationship between entities, setting the level of trust between the parties along with their rights and duties.

It is possible that the data sources you use as the foundation for your AI models are managed by a different department.

Do not ignore this fact: identify the responsible people, set up forums, empathize, understand concerns about access performance, and deal with SLAs (Service Level Agreements) and OLAs (Operational Level Agreements).

Peopleware

Your team needs at least a Data Steward, a Data Architect and a Data Quality Lead.

Data Steward: The focal point for data related problems of a particular subject.

The Data Steward is responsible for mapping the new data to be acquired, making sure the project complies with data regulations, designing ways to measure data quality, and supervising how the data is used.

Data Quality Lead: Responsible for monitoring and assuring the stability and continuity of the data storage environment.

The Data Quality Lead routinely checks the quality and completeness of the data.
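A routine completeness check can be very simple. The sketch below, in plain Python, computes the share of rows that carry a value for each required field and flags the fields that fall below a threshold. The 0.95 threshold, the field names and the sample readings are assumptions made for illustration only.

```python
def completeness(rows, required_fields):
    """Return, per required field, the fraction of rows with a non-missing value."""
    total = len(rows)
    if total == 0:
        return {name: 0.0 for name in required_fields}
    return {
        name: sum(1 for row in rows if row.get(name) is not None) / total
        for name in required_fields
    }


def failing_fields(rows, required_fields, threshold=0.95):
    """List fields whose completeness falls below the (assumed) threshold."""
    scores = completeness(rows, required_fields)
    return [name for name, score in scores.items() if score < threshold]


# Hypothetical sensor readings: one value is null and one is missing entirely.
readings = [
    {"sensor_id": "s1", "temperature": 71.2},
    {"sensor_id": "s2", "temperature": None},
    {"sensor_id": "s3", "temperature": 69.8},
    {"sensor_id": "s4"},
]

print(completeness(readings, ["sensor_id", "temperature"]))
print(failing_fields(readings, ["sensor_id", "temperature"]))
```

In this sample every row has a `sensor_id`, but only half of them have a `temperature`, so `temperature` is the field the Data Quality Lead would need to investigate.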

At this point, we hope you appreciate that Data Governance is a cornerstone of a healthy and sustainable Digital Transformation in your organization.

We hope you make good use of the information above and use it to advance digital transformation at your workplace.

Authors:
Alexandre Gonzalez, Architecture Manager at VALE
Edson Antônio, AI Manager at VALE
Translated to English by:
Rafael Barsotti, AI Analyst at VALE


