A Brief History of Data Loss Prevention

By Vina Nguyen


We’ve come a long way in this era of connectivity. In 20 years, we’ve gone from dial-up as the main mode of connection to businesses hosting their corporate data on the cloud. Let’s take a trip down memory lane to learn how the internet evolved, how data loss prevention (DLP) solutions filled the security gap, and where DLP goes from here.

2000-2010: Dial-up, Piracy, and the Birth of DLP

If you’re old enough to remember dial-up, you remember that the internet was an optional activity that meant negotiating over who got to use the home phone line. In the 2000s, broadband started taking over, and by 2010 over 2 out of 3 adults used broadband. In this era, we also witnessed the crash of the dot-com bubble, which left a few future winners standing, like the fledgling Amazon.

The potential of the internet came with the potential for illegal and criminal activity. This era also included the infamous Napster platform where, at one point, 80 million users downloaded free music illegally. Besides completely transforming the music industry, Napster showed that the internet could be used to steal intellectual property at such scale and speed that the impact was essentially irreversible. Software piracy and the hacking of prominent sites like Microsoft, Yahoo!, and CNN.com showed how vulnerable even industry leaders were to having their source code stolen or their reputation damaged, meaning substantial loss of profits and a weakened competitive edge.

Encrypted traffic was not a standard yet, which meant data in transit on the internet was almost always in the clear. Anyone could tap a network line and read the traffic moving on it. As such, the main approach for DLP here was to ensure that any sensitive data or intellectual property never saw the outside web. The main techniques included (1) blocking any actions that copy or move data to unauthorized devices and (2) monitoring network traffic with basic keyword matching. In 2007, deep content inspection was introduced: a set of techniques for proactively scanning for and identifying sensitive data in the content itself rather than only in network traffic.
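To make the second technique concrete, here is a minimal sketch of basic keyword matching on unencrypted traffic. The keyword list and the function names (`find_keywords`, `should_block`) are assumptions made for illustration, not taken from any particular DLP product.

```python
from __future__ import annotations

# Hypothetical keyword list; a real policy would be defined by the organization.
SENSITIVE_KEYWORDS = ("confidential", "internal use only", "source code")


def find_keywords(payload: str, keywords: tuple[str, ...] = SENSITIVE_KEYWORDS) -> list[str]:
    """Return every keyword that appears in the payload, ignoring case."""
    lowered = payload.lower()
    return [kw for kw in keywords if kw in lowered]


def should_block(payload: str) -> bool:
    """Flag the payload if it contains at least one sensitive keyword."""
    return bool(find_keywords(payload))
```

A check like this is cheap to run, but it only sees literal strings. It misses sensitive data that doesn't contain a listed word and flags harmless text that does, which is part of why content inspection grew more sophisticated later.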

Governments also started to take notice of cybersecurity as an increasingly strategic domain. Seeing the need for a centralized response, in 2003 the United States (US) released the National Strategy to Secure Cyberspace. In this agenda, the US outlined goals for preventing cyber attacks and protecting sensitive data for government, private sector, and military applications, among others.

2010-2020: Encryption, Compliance, and User-Centricity

To protect data in transit, HTTPS, a secure protocol for encrypting traffic between browsers and web servers, began to gain significant traction. Once offered as an opt-in option, HTTPS has become the default setting of most websites today, with 93% of web traffic using HTTPS. With increased public awareness, and with HTTPS now a business necessity, browsers also warn consumers about non-HTTPS sites. Add the shift to cloud services that this evolution made possible, and DLP solutions started to incorporate data protection in the cloud rather than only on-premises.

For businesses, DLP also became a subject of legal compliance as governments around the world passed laws to protect the privacy and data of consumers. Most notably, the European Union’s General Data Protection Regulation (GDPR) took effect in 2018, establishing the world’s highest standards on how organizations are expected to protect the personal data of its citizens, with costly financial penalties if those standards are not met. It also grants its citizens a set of rights to access, modify, restrict, and move their own personal data across any organizations processing that data.

In the midst of this changing environment of both internet technology and increased government regulation, DLP solutions broadened what counted as “data” and advanced content examination, with a focus on user-centric monitoring on both the endpoint and the network.

Content examination and data discovery moved beyond simple keywords to advanced pattern matching for greater breadth and accuracy, and metadata like file owner and location were considered for contextual awareness. Alongside parallel efforts to educate employees, the user continued to be the main focus of DLP, with features like alerts on policy violations (such as a pop-up when someone attempts to upload to an unauthorized site) and continued monitoring of network traffic and file transfers from the endpoint.
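As an illustration of pattern matching combined with metadata, the sketch below looks for candidate payment card numbers with a regular expression, confirms them with the Luhn checksum to cut down on false positives, and then consults the file’s location. The `FileInfo` type, the `approved_locations` default, and the policy itself are hypothetical and chosen only for this example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return Luhn-valid card-like numbers found in the text, separators removed."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


@dataclass
class FileInfo:
    """Hypothetical file record: content plus the metadata used for context."""
    owner: str
    location: str
    text: str


def violates_policy(info: FileInfo, approved_locations: tuple[str, ...] = ("finance-share",)) -> bool:
    """Flag card data only when it sits outside an approved location."""
    return bool(find_card_numbers(info.text)) and info.location not in approved_locations
```

The same card number is acceptable in one location and a violation in another, which is the kind of contextual awareness that metadata adds on top of content matching.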

With regulatory compliance as an added requirement for DLP, reporting and auditing for data like personally identifiable information (PII) became standard features. Meanwhile, the popularity of the cloud expanded the range of places where DLP needed to protect data.

2020+: To Data-Centricity and Beyond

Education and the focus on users following policies will go far, but ultimately, users are fallible. With remote work becoming increasingly common post-pandemic, and the onus of security on the employee, the risk of a data breach goes up. Take it from IBM’s latest Cost of a Data Breach Report, which states that when remote work was a factor, the average cost of a breach increased by 1 million dollars.

Rather than relying on error-prone users and monitoring the systems that data resides on, modern DLP solutions are moving toward data-centric approaches. Such approaches combine traditional content inspection with tracking data lineage: where data has been and where it is copied or moved over time. To achieve this, modern DLP integrates with the core infrastructure of an endpoint, such as the operating system or web browser, to monitor data without having to decrypt it at the network level. Comprehensive solutions will apply policies across endpoint, cloud, and mobile layers rather than specializing in one type of platform.
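One way to picture data lineage is as a log of events keyed by a fingerprint of the content rather than by a file name or a user. The sketch below uses a SHA-256 hash as that fingerprint. The `LineageTracker` class and its methods are hypothetical, and exact hashing is a simplifying assumption: it only follows byte-identical copies, not edited or excerpted ones.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict


class LineageTracker:
    """Hypothetical in-memory log of where a piece of content has been."""

    def __init__(self) -> None:
        # Maps a content fingerprint to an ordered list of (action, location) events.
        self._events: dict[str, list[tuple[str, str]]] = defaultdict(list)

    @staticmethod
    def fingerprint(content: bytes) -> str:
        """Identify content by its SHA-256 digest, independent of file name."""
        return hashlib.sha256(content).hexdigest()

    def record(self, content: bytes, action: str, location: str) -> None:
        """Append an event such as ("copied", "usb-drive") to the content's history."""
        self._events[self.fingerprint(content)].append((action, location))

    def history(self, content: bytes) -> list[tuple[str, str]]:
        """Return the recorded events for this content, oldest first."""
        return list(self._events.get(self.fingerprint(content), []))


tracker = LineageTracker()
report = b"Q3 revenue forecast"
tracker.record(report, "created", "finance-share")
tracker.record(report, "copied", "laptop-downloads")
tracker.record(report, "uploaded", "personal-webmail")
```

Because the history follows the content, renaming the file or moving it between devices does not reset its trail, and a policy can react to where the data originated as well as where it is now.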

Where do we go from here? For one, expect contextual heuristics powered by machine learning to detect malicious activity that policies and patterns alone can’t catch. Where consumers were once at the mercy of corporations, comprehensive data privacy laws, with the EU leading the way, now mandate obligations for handling personal data that DLP solutions will enforce. Integration across cloud and mobile devices will become more commonplace as the working environment becomes more distributed.

Time will tell, but one thing’s for sure: we aren’t in 2000 anymore.

About the author

Vina Nguyen is a B2B technical copywriter, specializing in cybersecurity, SaaS, and artificial intelligence. She aims to inspire by simplifying the complex in all things technology. Before she was a writer, Vina spent over 10 years as a computer scientist, where she analyzed software, designed cybersecurity products, and built machine learning models for both public and private organizations. Vina can be found exploring Washington, DC or at www.vinawrites.com.