Why Data Destruction is Important for your Business
One of the biggest companies on the planet had a monumental data problem. A recent investigation from Wired found that Amazon, from its web services to its customer-facing retail and delivery businesses, had grossly mismanaged its seemingly unending trove of consumer data. In just one example of this massive data headache, Amazon InfoSec leadership discovered that “the names and American Express card numbers of up to 24 million customers had sat exposed on Amazon’s internal network, outside a ‘secure zone’ for payment data.”
At this point in mid-2017, the logistics juggernaut had not yet faced a leak, breach, or cyberattack of any magnitude, but as the company began developing measures to prevent hacks, it found it had an even more severe problem: Amazon did not actively keep track of its data flow or the storage of its users’ sensitive data. As more customers used its services and the company kept amassing user data, Amazon’s data cache “had become so sprawling, fragmented, and promiscuously shared within the company that the security division couldn’t even map all of it, much less adequately defend its borders.”
Beyond the data privacy infrastructure that the company would need to build over the next three years, Amazon would also need an internal practice for effectively wiping out unnecessary user information: a process known as data destruction. This real-world example of wide-ranging and consequential data mismanagement shows the importance and urgency of effective data destruction policies.
What Is Data Destruction?
In order to better understand the role of data destruction, it’s important to understand how data is produced and where it goes. Data collection has changed considerably over the past 25 years: it wasn’t until 1996 that record-keeping shifted from primarily paper to digital files. That shift gave people using personal computers the opportunity to store files of all types.
As technology continued to advance, hard drives became capable of storing more information, and people were able to create and save files of all sizes and types on personal computers and related devices. The most recognizable forms of data storage developed were “magnetic storage (HDD, tape), optical discs (CD, DVD, Blu-Ray), and semiconductor memories (SSD, flash drive),” according to Melvin M. Vopson, Senior Lecturer at the University of Portsmouth.
The inventory of data-capable devices and memory-based tech is constantly expanding, which multiplies the digital information stored on (and therefore collected from) devices used daily by much of the world population. From social media platforms and online shopping done on mobile devices and computers to the “smart refrigerators” popping up in home kitchens, different kinds of data are gathered for a variety of reasons and uses. The amount of data we produce, exchange, and store is also growing rapidly. Between emails, texts, social media posts, and browsing or streaming behaviors, our data production and consumption reached an estimated 33 zettabytes, or 33 trillion gigabytes, in 2018. This incomprehensible figure will only compound in the coming years, as it is projected to rise to roughly 175 zettabytes by 2025.
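To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. This sketch assumes decimal (SI) units, where one zettabyte is 10^21 bytes and one gigabyte is 10^9 bytes. It also assumes smooth compound growth between the 2018 figure and the 2025 projection, which is a simplification for illustration and not something the forecasts themselves state.

```python
# Decimal (SI) units are assumed: 1 ZB = 10**21 bytes, 1 GB = 10**9 bytes.
BYTES_PER_ZB = 10**21
BYTES_PER_GB = 10**9


def zettabytes_to_gigabytes(zettabytes: float) -> float:
    """Convert zettabytes to gigabytes using decimal units."""
    return zettabytes * BYTES_PER_ZB / BYTES_PER_GB


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two data points (a simplification)."""
    return (end / start) ** (1 / years) - 1


gigabytes_2018 = zettabytes_to_gigabytes(33)
growth = implied_annual_growth(33, 175, 2025 - 2018)

print(f"33 ZB = {gigabytes_2018:.2e} GB")
print(f"Implied growth: {growth:.1%} per year")
```

Under these assumptions, 33 zettabytes works out to 33 trillion gigabytes, and the jump to 175 zettabytes implies growth of roughly 27% per year.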
With all this data circulating across the planet, we need to consider where our information is being stored and to whom it’s accessible. To simplify, practically all data in the world can be found in three different spaces:
- Endpoints: Probably the most recognizable data storage units, endpoints consist of physical devices that include computer hard drives and mobile phones.
- The Edge: The edge can be viewed as the critical infrastructure that supports data storage and exchange. Banking and industrial servers, 5G and other telecommunication towers, and government agencies are good examples of these kinds of storage points.
- The Core: By far the largest and highest-level data storage points, the core consists of web servers and cloud storage servers. These data centers and facilities are located across the planet, with the United States hosting nearly 40% of all web and cloud storage servers and China, the United Kingdom, Germany, Japan, and Australia composing another 29% of the world’s share.
This breakdown of where data is housed, together with the rapid growth in data consumption, exchange, and storage, makes the case for equal or greater growth in a crucial counterpart: data destruction. Because our personal security information and identities live in the data we create and circulate, it’s important that the programs and platforms we use are mindful of erasing old, outdated, and unnecessary data.
One important distinction: Data destruction is not the same thing as data deletion. Simply deleting data does not ensure that it will be irretrievable or remain encrypted. On the contrary, deletion alone leaves some forms of data highly discoverable. Instead, it’s the responsibility of data science and IT professionals to ensure that personal data is fully erased using a variety of data destruction software, a practice that makes it practically impossible for “bad actors” to recover sensitive, formerly protected data.
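The difference between deleting and destroying can be illustrated with a minimal Python sketch. An ordinary delete only removes the file's directory entry, while the hypothetical `overwrite_and_delete` helper below first replaces the file's contents with random bytes. This is an illustration of the idea only, not a certified erasure method: on SSDs, journaling or copy-on-write filesystems, and cloud storage, older copies of the data can survive an in-place overwrite, which is why dedicated data destruction software exists.

```python
import os
import tempfile


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file's contents with random bytes, then remove it.

    Illustrative only: some storage media and filesystems keep older
    copies of data that an in-place overwrite does not reach.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            handle.write(os.urandom(size))
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the overwrite to the device
    os.remove(path)


# Usage: create a throwaway file with placeholder data, then destroy it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"card=0000-0000-0000-0000")
    demo_path = tmp.name

overwrite_and_delete(demo_path)
print(os.path.exists(demo_path))
```

By contrast, calling `os.remove(demo_path)` on its own would leave the original bytes on disk until the operating system happened to reuse that space.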
Why Is Data Destruction Important?
To understand the significance of data destruction, it’s helpful to look at how it affects everyday web users. Breaches, leaks, and hacks pose serious problems to users who have unknowingly given their data to organizations that don’t sanitize it properly. Effective data destruction policies help ensure that user data doesn’t become compromised. This protects real people from the threat of having their personal information gathered, analyzed, sold, or stolen against their will and then used in damaging ways, such as identity theft and, ultimately, financial loss. In other words, when proper data destruction practices are shirked, users can have their credit card numbers, Social Security numbers, birthdates, and addresses stolen.
In recent years, data scientists and researchers have begun to plan strategies and practices that ensure user data is disposed of appropriately. In a 2017 academic article titled “Cloud Based Deduplication and Self Data Destruction,” scholars proposed erasure methods centered on encryption and security measures in cloud storage spaces. Since most data is collected and exchanged through cloud services today, it makes sense that the horizon of data security centers on protecting cloud storage. Through their proposed algorithm, they aim to ensure that “data will get deleted after a specific time interval which the user has specified at the time of data storage.”
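The idea of a user-specified lifetime can be sketched as follows. This is not the algorithm from the article, which also involves encryption and deduplication in cloud storage. It is a simplified, hypothetical in-memory `ExpiringStore` that records an expiry time when data is stored and purges anything past that time. The class, its method names, and the optional `now` parameter (included so the example is deterministic) are all assumptions made for illustration.

```python
import time
from typing import Dict, Optional, Tuple


class ExpiringStore:
    """Hypothetical in-memory store: each record has a user-chosen lifetime."""

    def __init__(self) -> None:
        # key -> (value, expiry timestamp in seconds)
        self._records: Dict[str, Tuple[bytes, float]] = {}

    def put(self, key: str, value: bytes, ttl_seconds: float,
            now: Optional[float] = None) -> None:
        """Store a value along with the moment it should be destroyed."""
        now = time.time() if now is None else now
        self._records[key] = (value, now + ttl_seconds)

    def get(self, key: str, now: Optional[float] = None) -> Optional[bytes]:
        """Return the value if it exists and has not expired."""
        now = time.time() if now is None else now
        record = self._records.get(key)
        if record is None or record[1] <= now:
            return None
        return record[0]

    def purge_expired(self, now: Optional[float] = None) -> int:
        """Remove every record whose lifetime has elapsed; return the count."""
        now = time.time() if now is None else now
        expired = [key for key, (_, expires_at) in self._records.items()
                   if expires_at <= now]
        for key in expired:
            del self._records[key]
        return len(expired)


store = ExpiringStore()
store.put("session-1", b"short-lived", ttl_seconds=60, now=0)
store.put("invoice-7", b"long-lived", ttl_seconds=3600, now=0)
removed = store.purge_expired(now=120)
print(removed, store.get("session-1", now=120), store.get("invoice-7", now=120))
```

In a real system the purge step would run on a schedule and would need to reach backups and replicas as well, since dropping a record from one store is deletion and not, by itself, destruction.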
Separately, researchers are looking at additional benefits that come from polished data erasure methods. In a 2019 article in the academic journal Concurrency and Computation: Practice and Experience, the authors posited that as coding storage systems refine their repair systems, two simultaneous benefits will arise:
- Data can be retrieved more easily when it is accidentally or improperly erased.
- Data transmission, encryption, and destruction processes will become more refined as the researchers’ new method “significantly reduces the link cost of repairing data blocks, while promising similar or faster repair speed.”
As this and similar innovations continue to come forward, the constant evolution of data sanitization practices results in a greater ability to protect users and their sensitive, personal information.
The Evolving Landscape of Data Destruction Policy and Laws
Policies on data collection, analysis, sale, and erasure vary from country to country in the same way that they differ between organizations and companies. Because there’s no prevailing global mandate that guides how data is gathered, businesses across the world must abide by the laws of their home country as well as those of their end users’ countries.
The European Union’s GDPR Model
In 2016, the European Union passed critical data security legislation. Legislators had found that too many sites collected too much user data without consent and with little accountability. As a result, their central goal was to “strengthen individuals’ fundamental rights in the digital age and facilitate business by clarifying rules for companies and public bodies in the digital single market.”
Before the passing of the General Data Protection Regulation (GDPR), a travel site, for instance, would be able to collect data on specific users whenever they visited the site. From their IP addresses to their clicks on different links on the site, and sometimes even email addresses that users unknowingly provided, marketers would be able to use this data both for their own purposes and to sell to others for leads. In this example, the user never voluntarily consented to offering their data up for these purposes. Legislators in the EU found that this was a problematic and disenfranchising practice, and decided to prohibit what they considered an unlawful data collection method.
Since the law took effect in 2018, websites across Europe must offer visitors the option to accept or decline data collection. In the context of data destruction, individuals now possess the right to demand that organizations properly erase their personal data. Organizations that receive such a request must comply or face the threat of fines, litigation, and even criminal charges in some cases. The EU has labeled this aspect of the GDPR “the right to be forgotten.” This groundbreaking data destruction policy applies under these conditions:
- The user’s personal data is no longer necessary for the original reason the company or organization gathered and analyzed it.
- The user originally consented to the data collection practices but now wishes to withdraw that consent.
- The user finds that a subsequent organization collecting their data has no legitimate reason for gathering it.
- An organization collects and analyzes a user’s data illegally.
- Children’s personal data is processed by an organization.
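For teams that handle such requests, the conditions listed above can be modeled in a small Python sketch. Everything here is hypothetical: the `ErasureGround` names paraphrase the list, the `stores` dictionary stands in for real databases, and the logic ignores the exemptions and verification steps a real process would involve. It is meant only to show how one request should fan out to every place a user's data lives.

```python
from enum import Enum, auto
from typing import Dict, List, Set


class ErasureGround(Enum):
    """Illustrative names for the conditions listed above."""
    NO_LONGER_NECESSARY = auto()
    CONSENT_WITHDRAWN = auto()
    NO_LEGITIMATE_REASON = auto()
    UNLAWFUL_PROCESSING = auto()
    CHILD_DATA = auto()


def handle_erasure_request(user_id: str,
                           grounds: Set[ErasureGround],
                           stores: Dict[str, Dict[str, dict]]) -> List[str]:
    """Remove a user's records from every store if any ground applies.

    Returns the names of the stores that held data for the user.
    """
    if not grounds:
        return []  # no listed condition applies, so nothing is erased
    erased_from = []
    for store_name, records in stores.items():
        if records.pop(user_id, None) is not None:
            erased_from.append(store_name)
    return erased_from


# Hypothetical stores keyed by user ID.
stores = {
    "crm": {"u42": {"email": "user@example.com"}},
    "analytics": {"u42": {"clicks": 17}, "u7": {"clicks": 3}},
    "billing": {"u7": {"plan": "basic"}},
}
touched = handle_erasure_request("u42", {ErasureGround.CONSENT_WITHDRAWN}, stores)
print(touched)
```

Returning the list of affected stores gives the organization a simple audit trail showing where the user's data was found and removed.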
The United States’ Policy Stance on Data Destruction
The United States, in contrast, does not have an overarching federal law that protects user data in the same capacity. According to a recent article from Wirecutter, the New York Times product review site, “data collected by the vast majority of products people use every day isn’t regulated.” Instead, consumers are protected by narrower regulations that apply in specific contexts. For example, the Family Educational Rights and Privacy Act (FERPA) protects students’ educational records, which can also cover the exchange of that type of information. Similarly, the Health Insurance Portability and Accountability Act (HIPAA) covers only the communication or exchange of specifically outlined health information, which leaves much other medical and health data unprotected.
As Wirecutter states, there are three things that both website visitors and website administrators in the United States should know:
- Companies are legally allowed to share, use, or sell the data they gather, and to do so without informing users.
- There is no national mandate or legal measure that forces companies to let users know if their data has become compromised in a hack, leak, or data breach.
- Companies can sell the data you give, whether provided voluntarily or involuntarily, to third-party data brokers, who can then resell the same data, all without obtaining the user’s permission and most often without the user’s knowledge.
While none of these factors explicitly mention or directly involve data destruction, each indirectly underscores its importance. When companies and websites bear no responsibility or consequence for the gathering, analysis, and sale of user data without user permission or knowledge, they aren’t obligated to delete or destroy user data. This point is even more significant considering that data collectors and sellers aren’t required to notify users of a data breach, leak, or hack. Because there’s currently no data destruction responsibility or policy, websites and businesses can collect unlimited amounts of data and face no consequences for failing to dispose of it appropriately. While some businesses have begun to implement data destruction protocols that aim to ensure user privacy, future data scientists can become active agents of change in a landscape where data is becoming increasingly vulnerable.
In recent years, as need and solution have come together, scholars have dedicated research to planning and improving data destruction, strengthening user privacy on a legal basis through more reputable and ironclad methods of destruction and protection.
A 2019 academic article outlined the significance of data protection as a means to mitigate imminent threats to privacy. Specifically, the authors, writing in the Journal of Public Policy & Marketing, proposed an updated framework based on the existing European Union General Data Protection Regulation.
This new model “builds on parallels between property and privacy and suggests that interdependent peer protection necessitates three hierarchical steps, ‘the 3Rs’: realize, recognize, and respect.” The authors of the study have found that regulators have historically failed where private companies and other data collectors have worked best, ensuring the protection of user data at these three junctures.
Data destruction and erasure factor heavily in this new framework. In order to act ethically in the ways that they collect and analyze user data, companies and organizations across industries must ensure that sensitive data will ultimately go through erasure or wiping. If breaches, leaks, or hacks occur, users will have a much greater chance of remaining safe if their previously used data has been cleaned and erased, rendering it non-existent and therefore inaccessible.
How Data Sanitization Improves Business Operations
Just as efficient and safe data destruction practices help keep users’ data safe, businesses also benefit greatly from advanced, proactive data sanitization processes. As one critical aspect of data security, a robust internal data erasure policy can help a business stay focused on its own growth goals.
The consequences of poor or flimsy data protection policy are massive for organizations of all scales, but especially for smaller organizations. According to one widely cited estimate, 60% of small businesses in the United States close down within six months of a breach, hack, leak, or cyberattack. The reasons for closure vary, but many center on the shortcomings of data sanitization.
When a company experiences a cyberattack and has a cache of unpurged, unnecessarily stored user data, that data ultimately becomes compromised, and the company is held liable for the breached data. Hit with steep and immediate legal defense fees, many small businesses are unable to continue operating and must close permanently. Data sanitization software and related implementation policies are reliable preventative measures that protect user data and, in turn, are key to keeping businesses going in the case of a breach or attack.
Careers that Implement Data Destruction and Data Sanitization
In addition to growing and sought-after data science careers in analysis and storytelling, practically all industries across the globe now need data science professionals to ensure sensitive data is safeguarded from cyberattacks and leaks. This added layer of expertise translates into substantial job growth, salary potential, and greater demand for experienced thought leaders with degrees in fields related to data protection.
Beyond the database administrator and data analyst careers that build critical infrastructure to keep user data safe, there’s also the growing field of criminal intelligence. As just one potential industry fit, data science for law enforcement employs advanced tools and processes like data destruction to keep user data protected, businesses open, and cyberattacks from happening.
As understanding grows that proactive data destruction policies prevent potentially damaging incidents, the intersection of data science and cybersecurity is becoming more and more pronounced. Within an already broad and growing job market, specializations are becoming more precisely defined and pursued as lucrative career choices, and nuanced expertise is more in demand.