Currently we do not consider 512 bit DH secure. You should look for more secure parameters (i.e. above 1280 bits at the very least). Secure key sizes can be found at keylength.com. For DH you can e.g. look at the NIST recommendations for "Discrete Logarithm" (the underlying mathematical problem on which DH-based cryptography is built).
Using Diffie-Hellman as the basis for AES-256 is a bit misleading, as it is unlikely that your DH parameters provide the same level of security as AES-256. As you can see in the tables, DH requires a key size of about 15K bits (15360 bits in the NIST table) (!!!) to reach the same security level. Claiming a security level of 256 bits would therefore be unfair (but read the next section).
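To make the mismatch concrete, here is a small sketch that looks up the comparable strengths listed in NIST SP 800-57 Part 1 for finite-field DH. The function and table names are made up for illustration; check keylength.com for the other recommendations, which differ slightly.

```python
# Comparable strengths per NIST SP 800-57 Part 1:
# symmetric security level (bits) -> minimum finite-field DH modulus size (bits).
NIST_DH_MODULUS_BITS = {
    80: 1024,
    112: 2048,
    128: 3072,
    192: 7680,
    256: 15360,
}


def security_level_for_modulus(modulus_bits: int) -> int:
    """Return the highest listed security level that a DH modulus of this
    size reaches, or 0 if it is below every entry in the table."""
    reached = [
        level
        for level, bits in NIST_DH_MODULUS_BITS.items()
        if modulus_bits >= bits
    ]
    return max(reached, default=0)


for size in (512, 2048, 15360):
    print(size, "bit DH ->", security_level_for_modulus(size), "bit security")
```

So a 2048 bit DH group paired with AES-256 still only gives you roughly 112 bits of security overall: the weakest component decides.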
What DH generates is a shared secret. This shared secret has a lot of pseudo-randomness. As such it can be used directly, by taking the leftmost (or rightmost, but leftmost is more common) bits of the shared secret as the key. More information here (note the author of the question!). This is however not considered good practice, as you preferably do not want to have any bias in the bits of the secret (AES) key.
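As a minimal sketch of the "leftmost bits" approach: the shared secret is a number smaller than the modulus, so it is first encoded as a fixed-length big-endian byte string and then truncated. The function name `leftmost_key` is hypothetical, and the random value below is only a stand-in for a real shared secret, which would come from your DH library.

```python
import secrets


def leftmost_key(shared_secret: int, modulus_bits: int, key_bits: int = 256) -> bytes:
    """Encode the DH shared secret as a fixed-length big-endian byte string
    (as wide as the modulus) and keep only its leftmost key_bits bits."""
    if key_bits % 8 != 0 or key_bits > modulus_bits:
        raise ValueError("key_bits must be a multiple of 8 and fit in the modulus")
    secret_bytes = shared_secret.to_bytes((modulus_bits + 7) // 8, "big")
    return secret_bytes[: key_bits // 8]


# Stand-in for a real 2048 bit DH shared secret (assumption: illustration only).
modulus_bits = 2048
shared_secret = secrets.randbits(modulus_bits)

aes_key = leftmost_key(shared_secret, modulus_bits)
print(len(aes_key) * 8, "bit key")
```

Note that the leftmost bits are exactly where the bias shows up, because the secret is always smaller than the modulus.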
So commonly we use a key-based key derivation function (key-based KDF or KBKDF) to condense (or extract) the pseudo-randomness of the master secret into one or more keys. That way each bit of the key depends on each and every bit of the master secret material, which removes any possible bias in the bits of the resulting secret key.
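As one possible illustration, here is a sketch of HKDF with SHA-256 (RFC 5869), written with only the Python standard library. HKDF is my choice of KDF for the example, not necessarily the one your API uses; the function name, the `info` labels and the random stand-in for the master secret are assumptions.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(master_secret: bytes, length: int,
                salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    # Extract: condense the master secret into one pseudo-random key.
    # RFC 5869 uses a string of zero bytes when no salt is given.
    prk = hmac.new(salt or b"\x00" * hash_len, master_secret, hashlib.sha256).digest()
    # Expand: stretch the pseudo-random key into as much output as requested.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Stand-in for the encoded DH shared secret (assumption: illustration only).
master_secret = secrets.token_bytes(256)

# Different info labels give independent keys from the same master secret.
enc_key = hkdf_sha256(master_secret, 32, info=b"encryption")
mac_key = hkdf_sha256(master_secret, 32, info=b"authentication")
print(len(enc_key), len(mac_key))
```

In real code you would normally call the KDF that your cryptographic library already provides instead of writing your own.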
Quite often cryptographic APIs offer combined key agreement / key derivation functions, where a KDF is already applied to the result. Sometimes that means indicating a hash function, which is then used as the (only) primitive within the KDF. The compression functionality within a hash is, as you might imagine, perfect for condensing the pseudo-randomness.
If you do not use a KDF then you might want to use a 256 bit AES key, as it will have at least 247 bits of pseudo-randomness, as specified in the answer to my question. After all, a 128 bit key will have less than 128 bits of pseudo-randomness.