
With companies and governments hungry for all the data about people, I was wondering if it is possible to gain some privacy by drowning the relevant information in a sea of random data. For example, a browser extension that keeps searching for random words and expressions in the background. Maybe sending generated emails to generated addresses. Or producing made-up location data on my phone.
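To make the browser-extension idea concrete, here is a minimal sketch in the spirit of the real TrackMeNot extension. The word list, the timing model, and all function names are my own illustrative assumptions, not a real implementation:

```python
import random

# Sketch of "chaff" query generation: fake searches built from random
# dictionary words, issued at randomized intervals. The tiny word list
# below is a stand-in for a real dictionary.
WORDLIST = [
    "weather", "recipe", "holiday", "guitar", "insurance",
    "marathon", "telescope", "gardening", "laptop", "poetry",
]

def make_chaff_query(rng, min_words=1, max_words=3):
    """Build a fake search query from random dictionary words."""
    n = rng.randint(min_words, max_words)
    return " ".join(rng.choice(WORDLIST) for _ in range(n))

def chaff_schedule(rng, count, mean_gap_s=90.0):
    """Return (delay_seconds, query) pairs. Exponentially distributed
    gaps crudely mimic bursty human behaviour better than a fixed timer."""
    return [(rng.expovariate(1.0 / mean_gap_s), make_chaff_query(rng))
            for _ in range(count)]

rng = random.Random(42)
for delay, query in chaff_schedule(rng, 3):
    print(f"wait {delay:5.1f}s then search: {query!r}")
```

A real extension would then submit each query to a search engine; the open question (and the point of this post) is whether such traffic would fool anyone.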

Would these measures help at all, or would the data-mining AIs just see right through them?

ZoltanE
  • The methods you are sketching are known as steganography. Read about that topic to learn more. –  Sep 07 '15 at 08:53

2 Answers


This was the idea behind Paranoid Linux, which began as fiction and became a real project that never reached fruition:

"Paranoid Linux is an operating system that assumes that its operator is under assault from the government (it was intended for use by Chinese and Syrian dissidents), and it does everything it can to keep your communications and documents a secret. It even throws up a bunch of "chaff" communications that are supposed to disguise the fact that you're doing anything covert. So while you're receiving a political message one character at a time, ParanoidLinux is pretending to surf the Web and fill in questionnaires and flirt in chat-rooms. Meanwhile, one in every five hundred characters you receive is your real message, a needle buried in a huge haystack. ~Cory Doctorow (Little Brother, 2008)

"When those words were written, ParanoidLinux was just a fiction. It is our goal to make this a reality. The project officially started on May 14th, and has been growing ever since. We welcome your ideas, contributions, designs, or code. You can find us on freenode's irc server in the #paranoidlinux channel. Hope to see you there!"
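The one-in-five-hundred scheme from the quote is easy to sketch. This toy version is my own illustration (not ParanoidLinux code): it hides the real message at a fixed stride inside random cover characters, and recovers it with a slice.

```python
import random
import string

# "One in every five hundred characters you receive is your real message."
STRIDE = 500

def embed(message, stride=STRIDE, rng=None):
    """Interleave each real character after stride-1 random chaff characters."""
    rng = rng or random.Random()
    out = []
    for ch in message:
        out.append("".join(rng.choice(string.ascii_lowercase)
                           for _ in range(stride - 1)))
        out.append(ch)
    return "".join(out)

def extract(stream, stride=STRIDE):
    """Real characters sit at positions stride-1, 2*stride-1, ..."""
    return stream[stride - 1 :: stride]

stream = embed("meet at noon", rng=random.Random(1))
print(extract(stream))
```

Of course, a fixed, publicly known stride is just as easy for the adversary to apply; a serious chaff scheme needs a keyed way of deciding which traffic is real.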

I think the most modern equivalent is Pirate Linux, along with the Tor Project.

BTW, Cory Doctorow's book "Little Brother" is kind of a fun read if you are a geek. If you are reading this then you probably ARE a geek :-)

The reality is that it is very difficult to hide one's true intent when high-quality clustering, classification, and anomaly/novelty detection are used. I work with people who have done this type of behavior detection for some very high-profile three-letter agencies, and with enough data it is very, very easy to detect nefarious behavior. So much so that undercover good guys sometimes clearly show up in the data sets.
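To make that concrete with a toy example (the data, the metric, and the numbers are made up purely for illustration): naively generated chaff picks words uniformly at random, which gives a much flatter frequency distribution than real human queries, which are heavily skewed towards a few topics. Even a one-line entropy measure separates the two.

```python
import math
from collections import Counter

def normalized_entropy(words):
    """Shannon entropy of the word distribution, scaled to [0, 1].
    Uniform usage scores 1.0; skewed, human-like usage scores lower."""
    counts = Counter(words)
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    max_h = math.log2(len(counts)) if len(counts) > 1 else 1.0
    return h / max_h

# Skewed, human-like history: a few topics dominate.
human = ["python"] * 40 + ["weather"] * 25 + ["news"] * 20 \
        + ["guitar", "tax", "flu"] * 5
# Chaff: every word picked uniformly, so counts come out nearly flat.
chaff = ["w%d" % (i % 50) for i in range(100)]

print(normalized_entropy(human))  # noticeably below 1.0
print(normalized_entropy(chaff))  # at or near 1.0
```

Real detectors use far richer features (timing, session structure, co-occurrence), but the principle is the same: chaff that does not match the statistics of genuine behavior sticks out rather than blending in.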

AN6U5
  • But would this tactic actually work? If humans were manually checking things, then yes, I can imagine it. However, a data-combing algorithm that can predict pregnancy from browsing history (or that image-parsing neural network at Google) probably has a very different view of the world. Frankly they seem like magic to me. :)

    ("Little Brother" looks interesting. Now that I've visited that page I'm on some kind of agency list, right?)

    – ZoltanE Aug 16 '15 at 07:57

There's an inherent problem: anybody who chooses to use a system designed primarily for extreme privacy is, by that very choice, interesting to the surveillance org.

So the surveillance org will likely learn about any such system or class of systems if it becomes significant, and develop means of identifying its traffic (and possibly filtering out the fake traffic), and then give special attention to the users sending that traffic.

You can also consider the opposite approach: instead of generating fake traffic for real users (burying the true signal in noise), auto-generate fake users whose traffic is entirely false positives. However, that suffers from the same cat-and-mouse dynamic.

The solution is for the tools used by most users (Android, iOS, telcos, Gmail, Facebook, Google Search, OEMs) to feature security capabilities client-side by default AND to not provide backdoors to surveillance orgs. That would also make it more difficult to provide useful features like recommendations and well-targeted adverts. For various reasons, in large part because most consumers do not demand it, they do not prioritise this, and that may not ever change.

Further reading: https://blog.kaspersky.com/chrome_ext_encrypt_data_leaving_browser/5063/