I have looked into this question before and came up very short. From what I found, I believe that a cipher with this property may simply be the Vernam cipher (OTP) in disguise. It is still an interesting question how additional resources or changes in protocol could implement an equivalent to the OTP with less effort. While it might not exactly be considered a cipher (or even a large deviation from the one-time pad), I found the following information-theoretically secure cryptosystem incredibly interesting when I stumbled upon it a couple of years back:
There is a more recent scheme (early 2000s), by Michael Rabin, called hyper-encryption, that can be shown to be information-theoretically secure under the assumption that the adversary is space-bounded (most other schemes instead assume a bound on the adversary's computation time). The scheme requires a very large, public source of random bits.
The basic premise is for the sender and recipient to use the key to methodically pick out a one-time pad from the public source of random bits (using randomness extractors). To obtain this pad, the adversary would have to store not only the ciphertext but also almost all of the random bits for later use. Their limited storage capacity prevents them from doing this.
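To make the pad selection concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not Rabin's exact construction: the public source is modelled as one block of random bits per message bit, the key is a short list of positions reused in every block, and each pad bit is the XOR of the bits found at those positions. The names `public_stream`, `make_key`, `derive_pad`, and `xor_bits` are hypothetical, and the block size and key length are far too small to mean anything in practice.

```python
import secrets


def public_stream(num_blocks: int, block_len: int) -> list[list[int]]:
    """Stand-in for the huge public random source: num_blocks blocks of random bits."""
    return [[secrets.randbits(1) for _ in range(block_len)] for _ in range(num_blocks)]


def make_key(block_len: int, k: int) -> list[int]:
    """Shared secret key: k positions, reused inside every block of the stream."""
    return [secrets.randbelow(block_len) for _ in range(k)]


def derive_pad(stream: list[list[int]], key: list[int]) -> list[int]:
    """One pad bit per block: XOR of the block's bits at the key positions."""
    pad = []
    for block in stream:
        bit = 0
        for idx in key:
            bit ^= block[idx]
        pad.append(bit)
    return pad


def xor_bits(a: list[int], b: list[int]) -> list[int]:
    """Bitwise XOR of two equal-length bit lists (one-time pad encrypt/decrypt)."""
    return [x ^ y for x, y in zip(a, b)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
stream = public_stream(num_blocks=len(message), block_len=1024)
key = make_key(block_len=1024, k=16)

ciphertext = xor_bits(message, derive_pad(stream, key))
recovered = xor_bits(ciphertext, derive_pad(stream, key))
assert recovered == message
```

Note that the honest parties only need to keep a running XOR per block while the stream goes by, so their storage is tiny. Someone without the key does not know which positions matter and would have to keep essentially everything.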
After the transmission is complete, access to the random bits is closed. The idea is that, even if the key were revealed afterwards, the adversary would not have enough information to establish anything about the plaintext. This remains true even if they had unlimited computational power, since no amount of post-analysis gets around the fact that the one-time pad depended on information that the adversary did not store. Since this security is not compromised even if the key is revealed after the transmission, the scheme is said to have perfect forward secrecy.
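A rough way to get a feel for why a late key reveal does not help is to simulate one very simple adversary strategy. This is a sketch under my own assumptions, not the actual security proof: the adversary keeps a fixed fraction of the bit positions in a block, chosen before it knows the key, and we count how often it happens to hold every position a random key of `k` positions points at. A real space-bounded adversary may store arbitrary functions of the stream, which the proof has to handle and this toy does not. The function name and parameters are hypothetical.

```python
import random


def fraction_of_keys_covered(block_len: int, k: int, stored_fraction: float,
                             trials: int, seed: int = 0) -> float:
    """Estimate how often an adversary storing a random subset of positions
    ends up holding all k positions that a later-revealed key refers to."""
    rng = random.Random(seed)
    stored_count = int(block_len * stored_fraction)
    covered = 0
    for _ in range(trials):
        # Positions the adversary chose to keep, picked without knowing the key.
        stored = set(rng.sample(range(block_len), stored_count))
        # Key positions, revealed only after the stream is gone.
        key = [rng.randrange(block_len) for _ in range(k)]
        if all(idx in stored for idx in key):
            covered += 1
    return covered / trials


# Storing half of each block, with 16 key positions per pad bit.
print(fraction_of_keys_covered(block_len=1024, k=16, stored_fraction=0.5, trials=2000))
```

For this strategy the chance of holding all `k` positions is roughly `stored_fraction ** k`, so it falls off quickly as `k` grows. Missing even one of the positions leaves the corresponding pad bit as the XOR with a uniformly random bit the adversary never saw, and no amount of computation recovers it.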