Sift out the data first, then move to the cloud

By 2025, people will have generated 163 zettabytes of data globally, roughly ten times today's volume. The rule for companies is therefore: data quality comes before quantity, and only then should the data move to the cloud.


The drivers of data growth will no longer be private users but companies. They generate and store data in the hope of turning it into valuable knowledge through analysis: they digitize their manufacturing and delivery processes and evaluate customer behavior and market movements, looking for clues to a competitive advantage in the additional data. In this way companies generate vast amounts of data every day, yet only a fraction of this information can actually be used profitably. What is missing are ways to organize all this unstructured information and keep track of it.
Not only does the abundance of emails, documents and image data make companies more vulnerable to dangerous security incidents, it also increases the risk of data breaches that leak personal information. The problem will intensify: according to the "2017 Veritas Data Genomics Index", the amount of unstructured data is growing by 49 percent year over year.

Moved does not mean solved
In addition, all this data must be stored, backed up and kept highly available. Many companies therefore flirt with the cloud: they do not have to invest in their own server rooms or additional staff, resources are ordered at the click of a mouse, and only the storage actually used is invoiced. According to a Veritas study, 74 percent of companies worldwide work with two cloud infrastructure providers, and 23 percent even with four or more partners. Hybrid approaches, in which data is stored both in the cloud and on-premise, are particularly popular. However, the use of cloud services raises issues that an IT manager should not underestimate: data becomes more fragmented and distributed across different locations, making it harder to keep track of it all.

Store only in safe countries
The lack of knowledge about where data is stored and what it contains may become problematic next year. Under the new General Data Protection Regulation (GDPR), companies will only be allowed to store data of EU citizens in secure countries from May 25, 2018, and will have to delete information as soon as customers exercise their "right to be forgotten". Companies that fail to comply with the requirements in time face severe penalties.
It is important to know where exactly, and in how many places in the digital network, relevant customer data actually resides. The bottom line is that a digital inventory is due, covering both on-premise infrastructures and cloud topologies.
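As a purely illustrative sketch (not a Veritas product feature), such an inventory could start with a script that walks all storage locations and records where files containing personal data appear to reside. The mount points, the one-megabyte sampling limit and the crude email-address pattern below are assumptions for demonstration only.

```python
import csv
import os
import re

# Hypothetical storage locations: on-premise shares and locally synced cloud buckets.
MOUNT_POINTS = ["/mnt/fileserver", "/mnt/cloud-bucket-eu", "/mnt/cloud-bucket-us"]

# A very rough indicator of personal data: anything that looks like an email address.
EMAIL_PATTERN = re.compile(rb"[\w.+-]+@[\w-]+\.[\w.]+")


def build_inventory(mounts, report_path="data_inventory.csv"):
    """Write a CSV listing every file, its size, and whether it seems to hold personal data."""
    with open(report_path, "w", newline="") as report:
        writer = csv.writer(report)
        writer.writerow(["path", "size_bytes", "contains_email_address"])
        for mount in mounts:
            for root, _dirs, files in os.walk(mount):
                for name in files:
                    path = os.path.join(root, name)
                    try:
                        with open(path, "rb") as fh:
                            sample = fh.read(1024 * 1024)  # inspect only the first MiB
                    except OSError:
                        continue  # unreadable file: skip rather than abort the scan
                    writer.writerow([path, os.path.getsize(path),
                                     bool(EMAIL_PATTERN.search(sample))])


if __name__ == "__main__":
    build_inventory(MOUNT_POINTS)
```

A real inventory would of course use far more reliable detectors and query cloud storage via its APIs rather than local mounts; the point here is only that the result is a single report showing where potentially regulated data lives.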

The big cleanup
It will be crucial for companies to be able to scan all their data quickly and classify it clearly by tag. This is the only way to differentiate between valuable and unimportant data. If a central, uniform technology is used for this purpose, uniform guidelines can also be applied to the entire data stock, regardless of where it is stored. This enables companies to optimally manage and protect sensitive or critical information that is often subject to strict retention periods.
Another Veritas analysis shows how much data in companies is actually worthless. German IT managers classify only 15 percent of their information as business-critical; the rest needs closer examination. Around 19 percent is so-called ROT data, falling into the category "Redundant, Obsolete, Trivial": it has no business value whatsoever and can be deleted without exception. The remaining 66 percent is called dark data and cannot be precisely classified. Here, IT must play detective and classify it, and experience shows that a large proportion of this data also falls into the ROT category.
However, companies should automate manual processes as much as possible when it comes to classification. This is because manual processes harbor sources of error, are often difficult to enforce across the entire environment, and are overall labor-intensive and inconsistent. As an automated process, classification should be a standard element in efficient information management. This allows important data to be protected and maintained according to its value. Guidelines specify exactly how long it should be archived or in which cloud it should be stored, for example. Unimportant data, on the other hand, can be consistently deleted to free up storage. These clean-up actions will have to be carried out continuously as an everyday duty, because the immense data growth does not allow any other conclusion.
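To make this concrete, here is a minimal sketch of what an automated rule set might look like. The categories mirror the business-critical/ROT/dark-data terminology above, but the file extensions, age threshold and prescribed actions are illustrative assumptions, not any vendor's actual policy engine.

```python
import os
import time
from dataclasses import dataclass

# Illustrative rules only: extensions and the age cut-off are assumptions.
ROT_EXTENSIONS = {".tmp", ".bak", ".old"}          # redundant/obsolete/trivial candidates
CRITICAL_EXTENSIONS = {".db", ".xlsx", ".docx"}    # assumed business-relevant formats
OBSOLETE_AFTER_DAYS = 5 * 365                      # assumed cut-off for "obsolete"


@dataclass
class Verdict:
    category: str   # "business-critical", "ROT" or "dark"
    action: str     # what the guideline prescribes for this category


def classify(path: str) -> Verdict:
    """Assign a file to one of the three categories and attach the policy action."""
    ext = os.path.splitext(path)[1].lower()
    age_days = (time.time() - os.path.getmtime(path)) / 86400
    if ext in ROT_EXTENSIONS or age_days > OBSOLETE_AFTER_DAYS:
        return Verdict("ROT", "delete after approval")
    if ext in CRITICAL_EXTENSIONS:
        return Verdict("business-critical", "archive per retention policy")
    return Verdict("dark", "flag for manual review")


if __name__ == "__main__":
    for root, _dirs, files in os.walk("/mnt/fileserver"):   # hypothetical mount point
        for name in files:
            path = os.path.join(root, name)
            print(path, classify(path))
```

Commercial classification tools work on content, metadata and context rather than simple filename rules, but the principle is the same: every object receives a category and a policy action, applied uniformly across on-premise and cloud storage.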

About the author: Andreas Bechter, Technical Product Manager at Veritas

