Data about data:
Data management is a crucial task, and it evidently carries enough power to destroy meaningful data or to make something valuable out of it. The industry has been trying for ages to figure out how to index data and find the right data; the information about data that can be tagged this way is eventually qualified as metadata. One author, Kevin, put it well: “It’s no longer hard to locate the answer to a given question: the hard part is finding the right question, and as questions evolve, we gain better insight into our ecosystem and our business.” The best example is Google, and how it tries to gather information about every block in the market and index it for better search.
Metadata concepts have been around for a long time; what is still developing is the right content of metadata, as well as how to analyse that information, as we enter a world of powerful data generators producing millions of bytes every second, everywhere. Data scientists around the world, and autonomous bodies, are working on data qualifiers. In simple terms, metadata is “data about data,” and if managed correctly, it is generated whenever data is created, acquired, added to, deleted from, or updated in any data store or data system within the scope of the enterprise data architecture.
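To make that last point concrete, here is a minimal sketch of generating metadata at the moment data is written, rather than reconstructing it later. The key-value store, field names, and `write_with_metadata` helper are all hypothetical illustrations, not part of any particular product.

```python
import hashlib
import os
import time

def write_with_metadata(store: dict, key: str, payload: bytes) -> dict:
    """Write a payload into a toy in-memory store and emit a metadata
    record at the same time -- "data about data" captured at the moment
    of creation or update."""
    meta = {
        "key": key,
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "timestamp": time.time(),
        # Hypothetical provenance field; falls back if USER is unset.
        "user": os.environ.get("USER") or "unknown",
        "operation": "update" if key in store else "create",
    }
    store[key] = payload
    return meta
```

Because the metadata record is produced by the same operation that changes the data, it never drifts out of sync with the store, which is the property the paragraph above describes.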
The better operated and catalogued metadata is, the faster the analytics it can provide. Parsing through every file versus indexing the right files and only then drilling into their contents is a strategic roadmap that companies build to decide where to mine. It is the role of a system integrator to find the right balance between accessing the right data and deriving meaningful insights from it. The approach can be categorised into two forms: structured versus unstructured data. The analysis then continues on the metadata itself, mining age, type, owner, last access, and other parameters that allow better insight into the data. Based on the outcome, decisions are driven by the stated objective: security, compliance, consolidation, cost reduction, insight, and the other fundamental principles of master data management. The amount of dark data organisations are able to discover after such assessments is remarkable. While this is much easier to do on unstructured, file-driven data, it gets challenging on object-based and block-based storage and on structured databases.
According to Veritas, 80% of business data is unstructured: files, documents, presentations, emails, etc. Another statistic, from a Gartner study, showed that at least 50% of data on networks is unclassified and of unknown value.
Dark data can create a positive impact if managed properly. If it is mismanaged, information security risks, along with wasted IT expenditure and storage costs, should be expected. When data is allowed to proliferate unchecked, the impact may not be felt right away, but ballooning costs, inefficiencies, and compliance challenges can very quickly become overwhelming. Most organisations underestimate these problems: “through 2020, less than 10% of organisations will find value in dark data.”
References:
Issues in Crosswalking Content Metadata Standards
The benefits of metadata and implementing a metadata management strategy
Big data or data management abuse
Data has been around since the inception of human beings, and even before; how, where, and in what form it exists is what is changing and causing the paradigm shift. Scientists, and people from many other fields, are evaluating the data our forefathers left behind, whether on paper, on walls, or in any other form, trying to find answers: analysing how our ancestors came about and survived, and what they left behind for history. Similarly, today we are generating data at a much faster speed than one could imagine, and it is not just the speed of the data; it is the number of sources, which increases day by day.
I qualify data handling into four forms: Generator (who generates the data and where it originates), Collector (where it can be stored, how, and from whom), Churning (how do I make it usable for the consumer), and Consumer (making meaningful insight out of it).
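The four forms above chain naturally into a pipeline. The sketch below is a deliberately simple illustration under assumed data (simulated sensor readings); the stage names mirror the text, but the functions themselves are hypothetical.

```python
from statistics import mean

def generator():
    """Generator: who generates the data and where it originates
    (here, simulated temperature sensor readings)."""
    yield from [{"sensor": "s1", "temp_c": 21.5},
                {"sensor": "s1", "temp_c": 22.0},
                {"sensor": "s2", "temp_c": 35.0}]

def collector(events):
    """Collector: where the raw data is stored (an in-memory list
    standing in for a real data store)."""
    return list(events)

def churn(store):
    """Churning: make the raw data usable for the consumer --
    here, grouping readings per sensor."""
    by_sensor = {}
    for event in store:
        by_sensor.setdefault(event["sensor"], []).append(event["temp_c"])
    return by_sensor

def consumer(usable):
    """Consumer: derive a meaningful insight -- the average per sensor."""
    return {sensor: mean(temps) for sensor, temps in usable.items()}

insights = consumer(churn(collector(generator())))
```

The value of naming the stages is that each can be swapped independently: the same churning and consumption logic works whether the generator is a sensor, a log file, or a customer-facing application.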
From source, speed, type, storage, and purpose, data needs to be evaluated and handled well if meaningful insights are to be derived from it. Data is power: it cannot be stressed enough that the power data possesses helps eliminate, predict, and recommend based on insights. Examples ranging from physical forces to natural catastrophes, from shopping advice to finding genes and relatives across continents, have proved the benefit and power of data.
The term #BigData (or #BiggerData) is pretty much self-explanatory: size is certainly a critical part of it, but size alone does not tell the whole story of what makes data #BigData or #BiggerData. For a slightly more comprehensive overview, deep-diving into these trends using five essential characteristics, Source, Velocity, Variety, Volume, and Value, as described by IBM, helps distinguish these data trends and the value they bring to the organisation.
The data is pouring from clouds: multiple metaphors exist in the market, from ponds, lakes, and swamps to rivers, oceans, and many others. The way to contextualise them is to identify the right source, supply it to the machine, and expect the desired output, similar to how water reaches the oceans, returns to the clouds, and pours back over us again. Identifying the right data set, doing master data management, and following a stewardship matrix allows an organisation to benefit from the power of data.