6

Background

When building web applications you sometimes need to pass data along with a form without showing it to the user. Database IDs are the most common case, but texts and IDs are sometimes necessary too. HTML has the hidden input field (<input type="hidden">) for this purpose, but it has a drawback: anyone with a basic understanding of what they are doing can modify the value. It would be better if such values were tamper-protected, if not outright encrypted.

Question

I need to pass a value through hostile territory. The value is most often an integer (usually less than 65'000, so mostly just two non-zero bytes, rarely three, almost never four). However, sometimes the value can also be a string (short, a few hundred characters at most). I'm OK with using different algorithms for different kinds of data.

I want to tamper-protect the value, so that it cannot be changed en route without detection. Hiding the data isn't necessary, but doing so would be a bonus. So either tamper-protection or encryption would work.

Whatever the algorithm, I want the output to be as short as possible without sacrificing too much security. This is because it's a webpage and shorter data means faster response times. Also, the HTML code becomes easier to read if there aren't lengthy random strings in it.

What algorithms would you suggest for this purpose?

Vilx-

4 Answers

8

You are writing out data and reading it back on the same server. You want to ensure that the data that you read back is the same as the data that was written out.

For this use case, symmetric cryptography seems appropriate. Have a single symmetric key that doesn't leave the server. You need to rotate the key only if the server is compromised; this will invalidate sessions in progress, which is likely to happen for other reasons in case of a server compromise as well.

You can add an HMAC tag computed over the data; with HMAC-SHA-256 (a strong, recommended choice), this adds 64 bytes if you encode the tag in hexadecimal or 44 if you encode it in base64 (or 32 if you include it in binary form, but that tends to be hard to parse). There'll probably be a few more bytes of overhead to put the string somewhere. Given the context, a web page, I don't see any point in weakening the cryptographic strength to save a few more bytes.
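As a rough illustration of the HMAC approach, here is a minimal Python sketch using only the standard library. The names `SERVER_KEY`, `sign` and `verify` are made up for this example, and generating the key at import time is an assumption made to keep the sketch self-contained; a real deployment would load a persistent key from server-side configuration.

```python
import base64
import hashlib
import hmac
import secrets

# Assumption: a single symmetric key that never leaves the server.
# It is generated here only so that the sketch runs on its own.
SERVER_KEY = secrets.token_bytes(32)


def sign(value: str, key: bytes = SERVER_KEY) -> str:
    """Return the base64-encoded HMAC-SHA-256 tag for a hidden-field value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def verify(value: str, tag: str, key: bytes = SERVER_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign(value, key)
    # Compare as bytes so that a non-ASCII tag from the client cannot raise.
    return hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8"))
```

The value and its tag would travel in two hidden fields (or one field with a separator of your choosing); on submission the server calls `verify` before trusting the value.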

If you want to encrypt the data as well, use AES-128 with an authenticated encryption mode (CCM or GCM are common choices). The overhead is a nonce (12 bytes is the usual size for GCM), 16 bytes of MAC, and a multiplication factor due to the encoding requirement, since the encrypted data can use all byte values (a 4/3 factor for base64). CCM and GCM need no padding; a padded mode such as CBC would add up to 16 more bytes. You'll need to decrypt the data before doing any processing with it; in particular, if some of that data is used on the client side, you'll either need to duplicate it or to combine authenticated encryption with a MAC of that plus the client-readable data.

Note that protecting the data in this way only authenticates it — it proves that the data was produced by your server. An attacker can still take the data from one page and send it with another page, or a later version of that page. How to avoid this depends on when the data is legitimate. You may want to include the URL or the user ID in the hashed data, for example, to avoid the data being used out of context.

  • Good point about the reuse. I'll keep that in mind. And... I guess you're right about the saving a few bytes too. I just thought that a SHA-1 seems so... unsightly. :) – Vilx- May 21 '14 at 10:12
  • The question asks for "the shortest output"; hence the result of HMAC should be at least truncated; 64 bits seems appropriate in the context. – fgrieu May 21 '14 at 11:21
  • 2
    I'd go with truncated HMAC-SHA-2 with a length somewhere between 64 and 128 bits. Truncation is fine with HMAC, but do not truncate a GCM / GHash MAC. – CodesInChaos May 21 '14 at 11:52
  • @CodesInChaos - Truncating an HMAC does not severely affect its security? – Vilx- May 21 '14 at 13:07
  • 3
    @Vilx- With HMAC the chance of accepting a forgery is $2^{-n}$ for an $n$ bit MAC. With GCM every successful forgery reveals a big part of the key whereas for HMAC the only effect is that you accept a forgery. So if you really need to go below 128 bit MACs, I recommend against GCM. Another issue with GCM is that it needs a nonce and reuse is fatal, whereas HMAC doesn't need a nonce. – CodesInChaos May 21 '14 at 13:09
  • If you encrypted with AES-128, would you need HMAC too? – Bohemian May 21 '14 at 15:02
  • @Bohemian: if you care about integrity, you need to add something that actually gives an integrity guarantee. Common encryption modes, such as CBC or CTR, do not. – poncho May 21 '14 at 15:13
  • This answer is not sufficient, because it doesn't provide freshness (it doesn't prevent replaying of old values). Also, in practice you probably want to encrypt by default, too, because if you make encryption optional it is too easy for there to be some value that was confidential but where you forgot to enable encryption. – D.W. May 22 '14 at 22:31
  • @D.W. Given the information in the question, there's a generic approach to authentication and confidentiality, but not to freshness. That's why I explain in my last paragraph that replays, and more generally uses out of context, are possible. – Gilles 'SO- stop being evil' May 22 '14 at 22:38
  • @Gilles, as you say, there are two separate issues: (1) uses out of context, and (2) replays. In my opinion, a good solution needs to solve both ("you may want to" is not enough). In my opinion, this is something that is not optional; it's mandatory, if we want to deploy this in practice and be secure. So, I think this answer would benefit from more on how to handle the practical security challenges (the hard parts of this problem are not the crypto algorithms but how to ensure they'll be used appropriately in practice). Optional security will often fall short, because it will be left disabled. – D.W. May 22 '14 at 22:42
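The truncation suggested in the comments above could look roughly like the following Python sketch. `TAG_BYTES = 8` (64 bits) is just one point in the 64 to 128 bit range mentioned there, and `SERVER_KEY`, `short_tag` and `verify_short` are hypothetical names. With an $n$-bit tag, each forgery attempt succeeds with probability $2^{-n}$.

```python
import hashlib
import hmac
import secrets

# Assumption: server-side key, generated here only to keep the sketch runnable.
SERVER_KEY = secrets.token_bytes(32)
# Assumption: 8 bytes = 64 bits; raise toward 16 bytes for a larger margin.
TAG_BYTES = 8


def short_tag(value: bytes, key: bytes = SERVER_KEY,
              n_bytes: int = TAG_BYTES) -> bytes:
    """HMAC-SHA-256 truncated to its first n_bytes bytes."""
    return hmac.new(key, value, hashlib.sha256).digest()[:n_bytes]


def verify_short(value: bytes, tag: bytes, key: bytes = SERVER_KEY,
                 n_bytes: int = TAG_BYTES) -> bool:
    """Recompute the truncated tag and compare in constant time."""
    return hmac.compare_digest(short_tag(value, key, n_bytes), tag)
```

Hex-encoded, an 8-byte tag takes 16 characters in the page instead of 64 for the full HMAC-SHA-256 output.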
3

I wrote this response while thinking I was on Information Security. Oops. Anyway, I think it may be helpful, so sorry if this is not exactly a "cryptography" POV. The point of this response is: in this case encryption or hashing is not a good solution. It has a lot of problems, because there is very little entropy and it has to work over HTTP.

Full programming-and-security POV response:

While this does not answer the question, consider avoiding "hidden" values that can be tampered with altogether.

If this is a resource ID, simply check whether the user has the right to perform operations on that resource after receiving the form. Otherwise, read on.

Give each form instance a random token (long enough that it cannot be easily guessed by bruteforcing), and put somewhere server-side the "secret" data connected with the form token. Upon receiving the form, pull up the secret data connected with the token, and then delete the token and data on the server-side.

How to do that:

  • If you are using sessions, put the sensitive data in an associative array/map/hash (or whatever name of that data structure appeals to you) inside the session.
  • Otherwise put them in a database (remember to delete old non-used entries every hour or so)
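The token scheme described above can be sketched in Python as follows. This is an in-memory stand-in for the session or database storage, not a production design; `FormTokenStore`, its method names and the one-hour lifetime are all assumptions made for the example.

```python
import secrets
import time

# Assumption: unused tokens expire after an hour, matching the cleanup hint above.
TOKEN_TTL_SECONDS = 3600


class FormTokenStore:
    """Keeps the secret data on the server, keyed by a random one-time token."""

    def __init__(self, ttl=TOKEN_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # token -> (expiry time, secret data)

    def issue(self, data):
        """Store the data and return the token to embed in the form."""
        token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        self._entries[token] = (self._clock() + self._ttl, data)
        return token

    def redeem(self, token):
        """Return the data once, then forget it; None if unknown or expired."""
        entry = self._entries.pop(token, None)
        if entry is None:
            return None
        expires_at, data = entry
        return data if self._clock() <= expires_at else None

    def purge_expired(self):
        """Drop entries whose forms were never submitted."""
        now = self._clock()
        expired = [t for t, (exp, _) in self._entries.items() if exp < now]
        for token in expired:
            del self._entries[token]
```

A form handler would call `issue` when rendering the page and `redeem` when the form comes back. Because `redeem` removes the entry, a second submission with the same token gets `None`.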

Why you should do that:

  • Protection against CSRF, which is the SQL Injection of the modern day
  • Your "tamper proof" data is quite short, so there is not a lot of entropy in it. If all you did was hash the data using a secret key, a determined attacker may be able to build a "rainbow table" of all possible values.
  • Protection against replay attacks (sending the same data twice)

Things to consider:

  • Birthday paradox, if you are using single DB to hold all the data, the tokens have to be extra-long to avoid collisions
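To get a feel for how long the tokens need to be, the standard birthday approximation $p \approx 1 - e^{-n(n-1)/2^{b+1}}$ gives the chance that any two of $n$ random $b$-bit tokens collide. The Python helper below is a hypothetical illustration of that formula, not part of any library.

```python
import math


def collision_probability(n_tokens: int, token_bits: int) -> float:
    """Approximate chance that at least two of n random b-bit tokens collide."""
    exponent = n_tokens * (n_tokens - 1) / 2 ** (token_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)
```

For example, `collision_probability(10**9, 128)` shows how small the risk is for a billion 128-bit tokens, while halving the token length changes the picture considerably.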

Last but not least, I have no idea what you are doing, but saving 32 bytes on each request (the SHA-256 output length) seems a bit pointless. You can achieve a lot more by optimizing images, removing whitespace from HTML, minifying JavaScript and CSS, using compression... 32 bytes is not that much in terms of webpages.

OhJeez
  • I wrote this response while thinking I was on Information Security. Oops. – Well, in that case: please feel very welcome to Crypto.SE… ;) – e-sushi May 21 '14 at 15:15
  • This seems like the best answer in this situation: access should be controlled by permissions, and once you have permissions in place it shouldn't really matter how the request came about - either the user is allowed to see the record or they're not. – nobody May 21 '14 at 15:25
  • This has less to do with visibility and more with tampering and the stateless nature of The Web. The current mantra is to make your webpages as stateless as possible. So, basically, when I open a document for editing in my browser, all the data should be on client side, and not server. Then, when I save the document, all the data should be posted back. The problems arise when permissions only allow a person to edit some parts of the document, but not others. Without a copy of the original document, I cannot tell on server side which fields were modified. – Vilx- May 21 '14 at 16:11
  • Storing a copy on the server side however becomes pretty tedious pretty quickly, especially since a person can open multiple browser tabs and edit several similar documents at once. It also leaks memory, because it's hard to tell when a browser tab has closed and when I can dispose of the server-side copy. Therefore a more elegant solution is to always pass all data to the client, but protect the sensitive parts. – Vilx- May 21 '14 at 16:16
  • The attacks you mention however can all be mitigated by using frequently changing keys and tokens (on opening of a document). Thank you for pointing them all out, I will definitely keep them in mind when designing my tamper-protection! – Vilx- May 21 '14 at 16:20
  • 1
    @Vilx- You cannot do everything client-side, because of CSRF and replay attacks. Without server-side state you cannot tell a replayed request from the first one. Changing keys and tokens on opening of a document is nothing other than having to update server state on each client request, which kind of defeats the purpose. You are trying to offload security to the client, which is a very bad idea. Send everything, but accept only allowed changes, and what is allowed should be based on server state. – OhJeez May 21 '14 at 17:37
  • Hmm... true. Well, how about this - I don't want to save the document on the server side, but I can generate a new key/salt/whatever when the user logs on and save that in the session. Then another key/salt/whatever is generated for each opening of the form. BOTH are used to encrypt or hash the values. That way you can't replay or CSRF anything, even within the same session. – Vilx- May 21 '14 at 21:35
  • @Vilx- then just store in the session the thing you want to encrypt and don't send it to the user. Much simpler. – OhJeez May 22 '14 at 06:23
  • @OhJeez - True, but this presents me with a memory leak. How do I know when I can remove it from the session? Also I need to differentiate between different open browser tabs. This gets pretty hairy pretty quickly. Encryption is actually simpler (if I have the proper primitives, of course). – Vilx- May 22 '14 at 07:49
  • 1. No, it does not; sessions are gc'ed after they expire. 2. You don't need to differentiate in any way. 3. Getting data from the server, encrypting it, deleting it (?) and sending it to the user, just so that the user sends it back and it gets decrypted and saved again: THAT is hairy. What if the user does not send the data back? What if he tries to open it twice? Stick to proven solutions. Don't experiment with such things if strangers on the internet can point out a few common fallacies in your experiments at first glance. – OhJeez May 22 '14 at 17:05