
Norton’s Law

[Image: Simpsons meme for Norton's Law. The bus driver says "don't make me tap the sign," then taps the sign, which reads "all data approaches public or deleted over time."]

There’s a great xkcd about the 10,000 people hearing about a thing for the first time today, so perhaps today is your day for hearing about Norton’s Law.

In 2015, Quinn Norton wrote Norton’s Law in Hello, Future Pastebin Readers. To wit: over time, all data approaches deleted, or public. The subsequent years of data theft from government and corporate victims have convinced me that this is completely accurate. But before we discuss the present, let’s look at the past.

The Deluge tablet, which contains part of The Epic of Gilgamesh, is roughly 2,700 years old, and the story it records is older still. While very few of us can read it directly, the data it contains is very clearly public, in many translations with lots of explanatory material attached. Same for the complaint against Ea-Nasir: public in translation, even if a given copy of the original is hard to access. So, public. There was obviously lots of other writing in ancient Mesopotamia, but it has been lost. Not only did scribes clear their wax slates, the fired clay tablets weren’t totally indestructible. Some of the data didn’t get saved. The clay tablets were smashed, the papyrus got wet, the parchment was burned, the paper was shredded, the tape was unspooled, the disk was demagnetized, the crystal was cracked, the metal corroded. That stuff, it’s deleted. The Library of Alexandria? Deleted. The Homeric epics? Public. Notes on shreds of scrap? Mostly deleted of course, but once one becomes really old and rare… public. Board meeting notes from the East India Company? Public. We can keep going up to your employer’s database’s customer table.

The people responsible for a dataset are eventually going to lose interest in it (because it’s no longer important to them) or lose custody of it (because they left the organization, and the dataset is still there without an owner). When that happens it might get a new owner, or get deleted on purpose, or get forgotten. The default assumption leaves out forgetting: it assumes someone will take ownership, make decisions, and either preserve the data or actively delete it. That only happens when the dataset or its departing owner is well known within the organization. Otherwise the data falls from attention and is no longer on the list of things that are being secured. It’s forgotten, and most likely stays lost until someone goes searching.
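
If you wanted to catch datasets before they slide into that forgotten state, one option is a periodic sweep of whatever inventory you keep. Here is a minimal sketch in Python. The `Dataset` record, the `classify` and `sweep` functions, the status labels, and the one-year threshold are all hypothetical, made up for illustration rather than taken from Norton’s essay or any real tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set, Tuple


@dataclass
class Dataset:
    # Hypothetical inventory record; a real inventory will look different.
    name: str
    owner: Optional[str]   # None if nobody was ever assigned
    last_reviewed: date    # last time someone looked at it on purpose


def classify(ds: Dataset, active_staff: Set[str], today: date,
             stale_after_days: int = 365) -> str:
    """Label a dataset as 'owned', 'neglected', or 'orphaned'."""
    # Lost custody: no owner on record, or the owner has left.
    if ds.owner is None or ds.owner not in active_staff:
        return "orphaned"
    # Lost interest: the owner is still around but has stopped looking.
    if (today - ds.last_reviewed).days > stale_after_days:
        return "neglected"
    return "owned"


def sweep(inventory: List[Dataset], active_staff: Set[str],
          today: date) -> List[Tuple[str, str]]:
    """Return (name, status) for every dataset that needs a decision."""
    flagged = []
    for ds in inventory:
        status = classify(ds, active_staff, today)
        if status != "owned":
            flagged.append((ds.name, status))
    return flagged
```

A sweep like this only helps with data that made it into an inventory in the first place, which is exactly the part the forgotten data tends to skip.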

But maybe someone outside the organization is interested in that data, or thinks they might be… and that someone might ask for it, buy it, or steal it from the people who don’t care anymore. Furthermore, technology has made it easier to get copies with every decade. From the printing press to the camera to the mimeograph to the computer to the network, documents have gone from difficult to trivial to copy. Search engines have made finding them relatively trivial too. Neither innovation has changed the data-owning organization’s priorities, though. If the data owners don’t care about the original… they probably care even less about copies, authorized or otherwise.

So, during the time when its owners care about it, the data is important and at least somewhat protected, to the degree that protection of stuff is prioritized. But after that time, a window of opportunity opens. The data is no longer considered interesting, but it’s also not remembered. It only gets noticed when someone goes actively looking, and that’s only going to happen if storage cost is getting to be an issue. Since value and volume tend to be inversely related… the formerly important data is very unlikely to be a storage cost problem. It’s only saved from deletion because it’s not on an active list of things to delete, and it doesn’t get put on that list unless someone looks at it and decides it needs deleting.

That does not mean that it stays around forever, though: while no one is actively trying to clear it out, no one cares about salvaging it either. It isn’t remembered when the organization moves storage from on-prem to cloud, or from AWS to Azure. It’s not updated or converted when the organization abandons WordPerfect for Microsoft Word, or Confluence for Notion. Accidents happen. The most impactful technology change is one we can forget in the cloudy world of SaaS… physical media becomes obsolete, or fails. Magnetic media does not last. Modern printed material? It might make it through your lifetime, but probably not your kids’, unless your family is dedicated to careful archival document storage.

Wait long enough and it’ll be easy to get data from its owner (with or without permission) because they won’t care. But wait a bit longer and it’ll be harder to get with permission, because they forgot it exists. Wait too long, and it’s lost to an automatic process or technology change, and permission is irrelevant. So it’s pretty interesting to see investments in technology that might restore the multi-century lifetimes that physical documents had on parchment and clay. Does that imply more data becoming public, since it is less likely to be deleted? Well, that’s still to be determined… data needs to be recognized as useful to go onto a special medium after all, and maybe that decision will be sensibly tempered by caution.

Updates

Sensitive location data could be sold off to the highest bidder.

Glassdoor unmasking users

Beleaguered Paramount Wipes Historic MTV News and Comedy Central Archives

