10

Problem Overview

I want to securely store log files so the contents are secret, and they can't be modified without detection.

The files will be encrypted using authenticated encryption (AES in GCM mode), with a random IV and symmetric key for each file. The symmetric key will be encrypted using the public part of an RSA key pair. Both the IV and encrypted symmetric key will be included in the additional authenticated data.
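
(For concreteness, a rough sketch of this scheme in Python using the "cryptography" package; the function name and key handling are purely illustrative, not the actual implementation.)

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def encrypt_log_file(contents: bytes, rsa_public_key):
        key = AESGCM.generate_key(bit_length=256)   # fresh symmetric key per file
        iv = os.urandom(12)                         # random 96-bit GCM IV per file
        wrapped_key = rsa_public_key.encrypt(       # symmetric key encrypted with the RSA public key
            key,
            padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                         algorithm=hashes.SHA256(), label=None),
        )
        aad = iv + wrapped_key                      # IV and encrypted key as the AAD
        ciphertext = AESGCM(key).encrypt(iv, contents, aad)
        return iv, wrapped_key, ciphertext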

This gives me confidentiality, integrity and authenticity - but only for each individual log file.

For example, let's say I have log files 2013-01-01.log, 2013-01-05.log and 2013-02-09.log - an attacker could delete 2013-01-05.log without detection.

I've come up with 2 possible solutions.

Possible Solution 1

The program could maintain an encrypted (and possibly RSA-signed) 'counter file', which would contain a sequence number that would be incremented every time we write a new log file. The sequence number would become part of the log filename, and would also be included in the additional authenticated data. We could therefore detect any 'gaps' from missing files.
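
(Again just a sketch; the counter file's own encryption/signature is omitted, and all names are illustrative.)

    import json, os
    from datetime import date
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def write_log(contents: bytes, key: bytes, counter_path: str = "counter.json") -> str:
        # Load and bump the sequence number (the real counter file would itself
        # be encrypted and possibly RSA-signed; that is omitted here).
        seq = 0
        if os.path.exists(counter_path):
            with open(counter_path) as f:
                seq = json.load(f)["seq"] + 1
        with open(counter_path, "w") as f:
            json.dump({"seq": seq}, f)

        # The sequence number goes into both the filename and the AAD, so a
        # deleted file leaves a detectable gap in the sequence.
        filename = f"{seq:08d}-{date.today().isoformat()}.log.enc"
        iv = os.urandom(12)
        aad = seq.to_bytes(8, "big") + filename.encode()
        with open(filename, "wb") as f:
            f.write(iv + AESGCM(key).encrypt(iv, contents, aad))
        return filename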

Possible Solution 2

The program could maintain an encrypted (and possibly RSA-signed) 'database file', which would contain the filenames of all previously written log files. We could therefore detect any 'gaps' from missing files.
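
(Sketch only, with the database file's own protection again omitted; the verification side would be something like this.)

    import json, os

    def find_missing_logs(db_path: str = "logdb.json") -> list:
        # The database file (in practice encrypted and possibly RSA-signed)
        # lists every log file ever written; any entry missing on disk
        # indicates deletion.
        with open(db_path) as f:
            names = json.load(f)
        return [name for name in names if not os.path.exists(name)]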

The Question

I'd like feedback on my 2 possible solutions - do they work, do they need some changes, have I missed anything?

Or are there better solutions to my problem?

MurrayA
  • 357
  • 4
  • 10
  • Solution 1 does not handle the case where the attacker renames the file. – Thomas Apr 24 '13 at 16:07
  • In what sense does that give you "integrity and authenticity"? (Hint: it's not the right sense.) – Ricky Demer Apr 24 '13 at 16:35
  • @RickyDemer the authentication tag provides integrity and authenticity, doesn't it? – MurrayA Apr 24 '13 at 16:41
  • @Thomas what would they rename it to though - wouldn't we still be able to detect the gap? As a preventative measure, what if the filename is included in the additional authenticated data? – MurrayA Apr 24 '13 at 16:43
  • Why would you include the IV in the ciphertext? I can't think of a formal proof that this is bad but it doesn't sound good. Also, when you say a symmetric key for each file, you mean you have as many keys as you have files? Anyway, it would make more sense to me to copy the logfiles to a secure location and then look for changes (i.e. leave the originals in place as a form of intrusion detection) – rath Apr 24 '13 at 16:48
  • @MurrayA Not if there is more than one possible date between two log files. Yes, including the filename somehow in the authenticated data would fix that. – Thomas Apr 24 '13 at 16:48
  • 1
    Why not just use HMAC(key, filename || contents)? There's no need for a unique key for every file. There's no need for encryption. You can also use something like chattr to set the immutable flag, which can only be removed on reboot. – Stephen Touset Apr 24 '13 at 16:49
  • 1
    @rath the IV isn't included in the ciphertext, it's in the additional authenticated data (i.e. in plaintext). Yes, there would be one unique symmetric key per file. – MurrayA Apr 24 '13 at 16:50
  • 2
    @StephenTouset how is that any better than just using authenticated encryption? How does it allow for detecting if someone deletes a file? – MurrayA Apr 24 '13 at 16:52
  • More cryptography is not automatically better — in fact, it is usually worse. Nowhere in your question have you indicated that secrecy of the contents of the logfiles is important, so encryption should be considered unnecessary. So I turn the question back on you: in what way is authenticated encryption better? And file deletion can be detected simply by the lack of presence of a file for a particular date. – Stephen Touset Apr 24 '13 at 16:55
  • @StephenTouset sorry, I should have said - secrecy of the contents is also important. In which case AE may be considered 'better' because it does the encryption and MAC for us in one go - so it may prevent making mistakes during implementation. BTW, it might also be normal for files to be missing for a particular date, for example if the system was down for maintenance, or if simply no data was logged during the period. – MurrayA Apr 24 '13 at 17:02
  • The last concern is trivially solved. If there are no logs for a date, you can still write an empty file. – Stephen Touset Apr 24 '13 at 17:16
  • @StephenTouset but what if the system was down for maintenance (i.e. it stopped logging for a while) - with that scheme (no 'counter' or 'database'), how could we determine whether files were deleted, or the system was down for maintenance? – MurrayA Apr 24 '13 at 17:25
  • 1
    So far, none of the proposed methods give any integrity protection after a compromise of the logger. – Ricky Demer Apr 24 '13 at 17:25
  • 1
    @RickyDemer do you have any suggestions? – MurrayA Apr 24 '13 at 17:29
  • Detecting the logger's integrity or changing settings are security issues, not cryptography issues. They have solutions such as file integrity monitoring tools, patterns for logging servers, etc. You might want to search security.stackexchange.com for more info. – John Deters Apr 26 '13 at 12:38
  • @JohnDeters while I agree in principle, both of the solutions I proposed involved cryptography – MurrayA Apr 26 '13 at 12:58
  • Sorry, I was referring only to the direction the comments have taken, not to the solutions. – John Deters Apr 26 '13 at 13:11

5 Answers

9

"Efficient, Compromise Resilient and Append-only
Cryptographic Schemes for Secure Audit Logging" ​ (PDF)
gives a publicly verifiable approach that allows fine-grained verification,
but it is in the Random Oracle Model.

The Simple Method:
The verifier and logger start with a seed for a forward-secure pseudo-random number generator. To denote a valid ending of the log, put the next $b$ bits of the PRNG's output into the log. To add a log entry, get the next $b+k$ bits of the PRNG's output, put into the log the encryption of the log entry and the MAC of that ciphertext (using the last $k$ of those $b+k$ bits as the MAC key), then erase those $b+k$ bits and the previous PRNG state.
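
As a toy illustration of the idea only (not the paper's construction), with SHA-256 ratcheting standing in for the forward-secure PRNG and HMAC as the MAC:

    import hashlib, hmac

    class ForwardSecureLogger:
        # Toy illustration: the PRNG state is ratcheted forward with a hash
        # after every entry and the old state is discarded, so a later
        # compromise cannot undetectably forge or truncate earlier entries
        # against a verifier who holds the original seed.

        def __init__(self, seed: bytes):
            self.state = seed
            self.entries = []

        def _next_block(self) -> bytes:
            out = hashlib.sha256(b"output" + self.state).digest()
            self.state = hashlib.sha256(b"ratchet" + self.state).digest()  # erase previous state
            return out

        def append(self, ciphertext: bytes) -> None:
            block = self._next_block()
            mac_key = block[16:]              # the last k bits become the MAC key
            tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
            self.entries.append((ciphertext, tag))

        def close(self) -> None:
            # A valid ending of the log is marked with the next b bits of output.
            self.entries.append((b"END", self._next_block()[:16]))

A verifier holding the seed replays the same PRNG sequence and recomputes each MAC; a truncated or altered log then fails to verify.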

4

Take a printer, and have the log file come out of the machine on paper. Ensure fire doesn't exist near the paper. Anything else will not work: if an attacker can wind back time, log files can die and you cannot tell. All techniques for assuring that time cannot be run backwards amount to doing this in some form, perhaps by sending data to another computer. But if you can do that, just send the log files there and the attacker cannot even delete them!

Watson Ladd
  • 838
  • 4
  • 10
3

Edit: You've clarified in the comments that confidentiality of the logfile's contents is important.

Given an AEAD function $C = E_k(iv, plaintext, aad)$, a safe construct is

$$ C = E_k(iv, contents, filename). $$

There is no need to include either the key or the IV in the additional authenticated data, and I would recommend against doing so. It is plausible that a particular AEAD scheme could leak the contents of the authenticated data, since this is not generally a design requirement.

This scheme will allow you to detect manipulation of the contents of a file, file renaming, and file deletion. You can detect logfile deletion simply by the lack of presence of a file for a particular date. If you have no log data for a date, simply generate an empty file and encrypt it with this scheme; the filename's inclusion in the authenticated data will prevent an attacker from being able to "replay" an encrypted empty file for other dates.
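
For example (sketched here in Python purely for illustration; any library-provided AES-GCM works the same way):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def seal_logfile(key: bytes, filename: str, contents: bytes) -> bytes:
        iv = os.urandom(12)    # unique nonce per file
        return iv + AESGCM(key).encrypt(iv, contents, filename.encode())  # filename as AAD

    def open_logfile(key: bytes, filename: str, blob: bytes) -> bytes:
        iv, ct = blob[:12], blob[12:]
        # Raises InvalidTag if the contents were modified or the file was renamed.
        return AESGCM(key).decrypt(iv, ct, filename.encode())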

You may additionally want to use something like chattr to set the immutable flag on your logfiles. This will prevent any user from modifying or deleting the files without changing the flag as root and rebooting the system first. Similarly, you should consider setting the append-only flag for live logfiles that are still being written to. This will help prevent tampering with them before they can be permanently archived.

Finally, permanent storage media are great solutions to this problem as well. If your need is great enough, burn the signed logfiles to a DVD-R periodically.

Stephen Touset
  • 11,002
  • 1
  • 38
  • 53
  • Changing the permissions on completed files is a good idea – MurrayA Apr 24 '13 at 17:05
  • 1
    I mentioned attributes, not permissions. Limiting permissions on sensitive files is simply good hygiene and should go without saying. – Stephen Touset Apr 24 '13 at 17:09
  • I'm using Windows, so there is no immutable flag; the next best thing is to change permissions after the file is closed – MurrayA Apr 24 '13 at 17:10
  • You could always run *nix on a remote logging server. This also has the advantage of requiring an attacker to break into two boxes to undetectably compromise one of your services. – Stephen Touset Apr 24 '13 at 17:14
  • 1
    Regarding not including the IV or key in the AAD, if I was doing encrypt-then-MAC rather than AE, isn't it good practice to include the IV in the MAC? So shouldn't at least the IV be included in the AAD? http://crypto.stackexchange.com/a/224/1254 – MurrayA Apr 24 '13 at 17:18
  • Yes. I don't know what "AAD" is, so it might be correct to include the IV in it. However, with AEAD, the only thing to worry about is IV generation. – Ricky Demer Apr 24 '13 at 17:35
  • @RickyDemer Additional Authenticated Data, or just 'AD' if you prefer - I mean the plaintext that is included in the MAC (GHASH) used by authenticated encryption. Assuming AES in GCM mode, I don't believe IV generation is an issue, as the only requirement is that it is non-repeating (doesn't even have to be random) http://security.stackexchange.com/a/20493/2541 – MurrayA Apr 24 '13 at 17:37
  • Yes, in this case, since you will have to keep state anyway, IV generation won't be an issue. – Ricky Demer Apr 24 '13 at 17:42
  • As with any crypto, I would strongly recommend against implementing your own AEAD scheme. Windows implementations exist — investigate them before rolling your own. – Stephen Touset Apr 24 '13 at 17:42
  • @StephenTouset I'm no cryptographer! I was planning on using the CNG implementation that ships with Windows – MurrayA Apr 24 '13 at 17:46
  • Then don't implement encrypt-then-MAC yourself. :) – Stephen Touset Apr 24 '13 at 17:53
  • @StephenTouset I didn't realise I was?! Only thing I'm considering is including the IV and encrypted key in the additional data. I believe you advocate including the IV in the MAC here: http://crypto.stackexchange.com/a/6076/1254 – MurrayA Apr 24 '13 at 17:56
  • I do. I was referencing your comment about "if I was doing encrypt-then-MAC...". It sounded like that was your plan. – Stephen Touset Apr 24 '13 at 22:50
2

I'm thinking there's a third potential solution. Each time you close a log file, you could append the name of the next new log file, timestamp it, then sign the log file. When it is time to create a new log file, you would read the previous log file, validate the signature, validate the time stamp, read the new log file name, and create it. You'd kickstart the whole thing by self-signing the first empty log file.

This permits you to decrypt each file on its own, which you probably do frequently for ordinary troubleshooting and maintenance activities. When you need to audit the log files, which is probably a less common annual activity, you would walk the chain of all files ensuring that all signatures are valid and that none are missing.
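
A rough sketch of the closing step (Ed25519 is used here only for brevity; an RSA signature works the same way, and the field names are made up):

    import time
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def close_log(private_key: Ed25519PrivateKey, path: str, next_path: str) -> None:
        # Append the next file's name and a timestamp, then a signature over
        # everything written so far, before the file is considered closed.
        with open(path, "ab") as f:
            f.write(f"\nNEXT:{next_path}\nCLOSED:{int(time.time())}\n".encode())
        with open(path, "rb") as f:
            body = f.read()
        sig = private_key.sign(body)
        with open(path, "ab") as f:
            f.write(b"SIG:" + sig.hex().encode() + b"\n")

The audit walk strips the final SIG line, verifies the signature over everything before it, then follows NEXT to the next file in the chain.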

This isn't a perfect or complete solution, of course. Off the cuff, I think a bad guy could tamper with the system clock, set it back to just after the second-most-recent log file was closed, and delete the most recent log file. But clock tampering might leave other evidence in other log files. You should still export the log files to a separate secured server as soon as you close them.

John Deters
  • 3,728
  • 15
  • 29
  • I like the relative simplicity of this. I guess using an external, secure timestamping service would alleviate the problem of bad guys changing the system clock – MurrayA Apr 26 '13 at 06:51
0

I've been toying around with a solution to this problem using blockchain tech.

Say every client that wants to make a server request has to first record that request on a blockchain. The client would then submit the same request to your server.

Your server would check that the received request also exists on the blockchain before processing it.

Responses would work the same way but in reverse. The server would publish a response on the blockchain and send a normal response to the client. The client could then check that the response from the server also exists on the blockchain.

Something like this

This is probably way overkill, but it's an interesting concept.

  • Blockchains can grow in size according to number of transactions, whereas a simple tally+digest of the last block plus metadata is all that's required in the simpler schemes. – MikeW Mar 28 '19 at 14:07