Is there any digital watermark technology for raw text?

Question

I need to protect data that my clients can access privately. The data is not in a document but in plain text, e.g., a JSON string. I want to protect it against being modified and then redistributed without my permission.

I researched online for a while and concluded that:

there are digital watermark methods for protecting images, audio, video or documents (PDF, MS Word or other formats),
but there are no such methods for raw text.

Is there any cryptographic way to achieve my goal?

Thank you for your question. I updated the original post. The raw text can be regarded as https://en.wikipedia.org/wiki/Plain_text, specifically a JSON string in my case. — Junwei WANG, Oct 12 '16 at 09:28
We shouldn't have to click on your StackOverflow post to get the context. Also, cross-posting is against the rules. So, please edit to add the context and make your question substantially different from the SO post in some way that directly relates to cryptography. — mikeazo, Oct 12 '16 at 13:07
@mikeazo while I agree that the question should be self-contained, I have to disagree with posting on multiple SE sites being bad by definition. From what I can tell it's totally fine to post basically the same question on multiple sites, as long as you want to answer about different aspects of a question (for example about the crypto here and about the implementation on SO). — SEJPM, Oct 12 '16 at 14:52
@SEJPM I second your cross-post argument. We live in a complex world outside theory, and sometimes questions are asked that require a multi-disciplinary approach. — Paul Uszak, Oct 12 '16 at 18:13
I don't see how this question relates to programming at all, to be fair. It's fine here or possibly at the security site, but programming it aint. — Maarten Bodewes, Oct 12 '16 at 22:07
I updated the post as @SEJPM pointed. Please help me solve my problem. — Junwei WANG, Oct 13 '16 at 02:09
@mikeazo: Could you give me a link to that rule? Please post that link to "Is cross-posting a question on multiple Stack Exchange sites permitted?" to help others who haven't seen that rule? Thank you. — David Cary, Oct 13 '16 at 14:15
@DavidCary, the rule is the most upvoted and accepted answer on that same question you linked to. — mikeazo, Oct 13 '16 at 14:18
Watermarking doesn't prevent redistribution, although it may be able to detect it. It doesn't even try to prevent modification. — dave_thompson_085, Oct 14 '16 at 00:53
@dave_thompson_085 Exactly, the watermark is hard to remove, hence, it helps you track the redistribution. — Junwei WANG, Nov 21 '17 at 18:49

score 1 · Answer 1 · answered Oct 13 '16 at 06:40

I assume the request for clarification of "Plain-Text" has to do with content, not that it's an ASCII or UTF string.

There is no "Hello World!" string I could make that you couldn't then copy and call your own, and for which I could prove 100% that it was mine to begin with. For example, anyone can just delete a PGP signature wrapper.
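To make the "delete the wrapper" point concrete, here is a minimal sketch. It uses an HMAC tag from the Python standard library as a stand-in for a PGP signature (PGP uses public-key signatures, so this is a simplification), and the key, function names and sample data are all hypothetical. It shows that the tag detects modification of the wrapped data but does nothing to stop someone from discarding the tag and keeping the data.

```python
import hashlib
import hmac
import json

# Hypothetical secret known only to the data owner (an assumption for this sketch).
OWNER_KEY = b"owner-secret-key"

def wrap(data: dict) -> str:
    """Return a JSON string carrying the payload plus an HMAC-SHA256 tag."""
    payload = json.dumps(data, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(OWNER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return json.dumps({"payload": payload, "tag": tag})

def verify(wrapped: str) -> bool:
    """Check that the payload still matches the tag."""
    obj = json.loads(wrapped)
    expected = hmac.new(OWNER_KEY, obj["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, obj["tag"])

def strip(wrapped: str) -> dict:
    """What a redistributor can do: throw the tag away and keep the data."""
    return json.loads(json.loads(wrapped)["payload"])

original = {"item": "widget", "price": 42}
wrapped = wrap(original)

# Modify the payload while leaving the tag untouched.
obj = json.loads(wrapped)
obj["payload"] = obj["payload"].replace("42", "43")
tampered = json.dumps(obj)

# Remove the wrapper entirely; nothing in the result points back to the owner.
stripped = strip(wrapped)
```

Here `verify(wrapped)` succeeds and `verify(tampered)` fails, so modification inside the wrapper is detectable. The `stripped` value is the original data with no trace of the tag, which is the weakness described above.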

However, if the content of your string is structured in a way that is uniquely and identifiably yours, then I would call that a watermark, as any change to the content of the string would change its context. Examples include using a private RSA key to encrypt your string and then encoding it back into ASCII with Ascii85, a string that contains obfuscated JavaScript, or even simply something you've published under copyright.

That should afford about the same level of watermarking you'd see in other media. The catch is that any media which can be displayed without its watermark is, by definition, media that can be stripped of the watermark, after which it no longer proves its own origin. In short: copyright it, or give it context that is destroyed when it is changed.

score 1 · Answer 2 · edited Apr 13 '17 at 12:48

Wikipedia:

A digital watermark is a kind of marker covertly embedded in a noise-tolerant signal such as ...

The important keywords here are "covertly" and "noise-tolerant", both of which cause problems when applied to texts / strings:

Covertly: While cryptography does not address the issue of covert communication or data, steganography does. Also, there was a question recently about steganography in texts, which might offer some more information on that. One problem to point out is that the carrier data needs to be much larger than the embedded data (e.g. by a factor of 50, preferably more). You also need to specify how you actually embed your watermark in the text. The main difference between steganography and digital watermarking is the goal (hiding information vs. authenticity, integrity or marking ownership), but the methods are very similar.
Noise-tolerant: This is much more difficult to determine for text than for images, audio, video, etc., where the low-order bits are considered noisy. Changing the lowest bit of a pixel in an image is a very small change, while there are no such small changes in text messages.
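As an illustration of both points, here is a rough sketch of one possible text embedding. The scheme is my own assumption for the sake of example, not an established method: it hides one bit per line of pretty-printed JSON as an optional trailing space. The function names and the sample data are hypothetical.

```python
import json

def embed(data: dict, bits: str) -> str:
    """Pretty-print data, appending a trailing space to line i when bits[i] is '1'."""
    lines = json.dumps(data, indent=2, sort_keys=True).split("\n")
    if len(bits) > len(lines):
        raise ValueError("carrier too small for the watermark")
    marked = []
    for i, line in enumerate(lines):
        mark = " " if i < len(bits) and bits[i] == "1" else ""
        marked.append(line + mark)
    return "\n".join(marked)

def extract(text: str, n_bits: int) -> str:
    """Read the first n_bits lines back as bits: trailing space means '1'."""
    lines = text.split("\n")[:n_bits]
    return "".join("1" if line.endswith(" ") else "0" for line in lines)

data = {"a": 1, "b": 2, "c": 3, "d": 4}
marked = embed(data, "1011")

# The marked text is still valid JSON with the same content.
same_content = json.loads(marked) == data

# Parsing and re-serializing keeps the data but drops the trailing spaces.
normalized = json.dumps(json.loads(marked), indent=2, sort_keys=True)
```

The carrier only holds one bit per line, which shows how much larger than the mark it has to be. The only "noise" available is whitespace that a parser ignores, so `extract(marked, 4)` recovers the mark while `extract(normalized, 4)` finds nothing: any client that parses and re-serializes the JSON removes the watermark without changing the data.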
