Metadata are data that describe other data (literally, "data about data"). They are the information that makes collected data recognizable. The most common examples are a file's backup date, size and author: in short, any information that lets you identify and locate the data when you need it (a document, an audio file, an image or, in theory, any kind of collected information).
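As a small illustration of the file metadata mentioned above, here is a minimal Python sketch that reads a file's name, size and last-modification date from the file system. The function name `describe_file` and the keys of the returned record are assumptions made for this example. The author is deliberately left out: it is usually stored inside the document format or by the operating system, not in the basic attributes read here.

```python
import os
from datetime import datetime, timezone


def describe_file(path):
    """Return a small metadata record for the file at `path` (illustrative helper)."""
    info = os.stat(path)  # file-system attributes: size, timestamps, ...
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),  # last modification time, in UTC
    }
```

Calling `describe_file("report.pdf")` on an existing file returns a dictionary describing that file without opening its content, which is exactly what makes such a record metadata and not data.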
Establishing a metadata management strategy is necessary, on the one hand, to keep data in good shape (see quality* and governance* below) and, on the other, to let entitled users exploit the data intelligently across the organization. The data must be accessible at any time, regardless of changes in the organization (new coworkers, new uses of the resources…). This implies that the metadata are usable by others from the start, which means they must be designed logically so that people other than their creator can identify them.
Metadata must enable data analysis while also preserving the organization's agility. Good (meta)data management cannot be separated from governance*, and it cannot ignore the famous three Vs of big data* (volume, velocity and variety; value has more recently been added as a fourth). The rise of big data* has highlighted more than ever the strategic need for businesses to take advantage of their data for marketing and growth purposes.
Metadata have historically been at the heart of archiving, which is a core activity of sectors such as:
- the music industry
- the world of publishing (print and digital)
In the age of the Internet, any entity that holds personal data is by definition concerned with metadata (see big data*). The heaviest users include, among others:
- social media (with leaders Facebook and YouTube)
- eCommerce juggernauts (Amazon, eBay…)
- Google (see also SEO and metatags)…
What makes metadata so essential in 2015?
Big data* analysis requires appropriate tools and, as noted above, metadata, even in their simplest form, are a key element of this analysis. Data only have value in a precise context, and a smart metadata management strategy facilitates the integration, sharing and collection of those vital data, which can then be interpreted and used in a business-driven way, for example.
In short, no data without metadata. No evolution without good database management.
* data quality: BI tools, data warehouses and similar systems provide the information the user needs to act and decide. This information results from aggregating data according to predefined rules. Providing “good quality data” to the user therefore requires both individual data of good quality and coherent aggregation.
* data governance: defines the rules for data management activities, ensures that these rules are implemented and complied with, keeps them evolving and evaluates their effectiveness.
* big data: the black gold of the 21st century lets organizations target their customers better and set themselves apart by offering customized services. It comprises all the information collected both offline and online that makes it possible to analyze and predict behaviours, which is essential data for marketing. The problem big data faces: individuals understandably want their sensitive or confidential data protected. One possible solution is the anonymization of data within organizations.
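
To make the anonymization idea in the big data note more concrete, here is a minimal Python sketch. Strictly speaking it shows pseudonymization, a weaker technique than full anonymization: the direct identifier is replaced by a keyed hash, so records belonging to the same person can still be linked for analysis, but the identity is not readable without the secret key. The record layout (`email`, `basket_total`), the function names and the key handling are all assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_identity(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` whose email is replaced by a pseudonym."""
    cleaned = dict(record)  # leave the original record untouched
    cleaned["customer"] = pseudonymize(cleaned.pop("email"), secret_key)
    return cleaned
```

The same email always maps to the same pseudonym under a given key, which keeps behavioural analysis possible. Whether this is sufficient protection depends on the other fields kept in the record, since combinations of seemingly harmless attributes can still identify a person.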