From the looks of it now each glSet has to include glBind(something) inside of it
Not exactly. It's the other way around, as described several paragraphs below.
Even if it were true, remember that GL commands from the client app to the GL server (aka driver) have a lot of dispatch overhead compared to a regular function call. Even if we assume that the DSA functions are just wrappers around existing functions, they're wrappers that live inside the GL server and hence can have (a little) less overhead.
if OpenGL still being a state-machine cannot take advantage of streamed changes applied to a single something.
GPUs aren't state machines. The GL state machine interface is an emulation that wraps DSA-like driver internals, not the other way around.
Removing one layer of wrapping - a layer that requires an excessive number of calls into the GL server - is clearly a win, even if a small one.
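To make the call-count argument concrete, here is a minimal toy model in C. It is not real OpenGL: `ToyServer` and every `toy_*` function are hypothetical names invented for this sketch. It assumes each call into the "server" costs one dispatch and simply counts them for the two styles.

```c
#include <assert.h>

/* Toy model only: none of these are real OpenGL entry points. */
enum { MAX_OBJECTS = 8 };

typedef struct {
    int bound;               /* currently selected object (the bind point) */
    int param[MAX_OBJECTS];  /* one parameter per object */
    int dispatches;          /* calls that crossed into the "server" */
} ToyServer;

/* Bind-to-edit style: select the object that later calls will modify. */
void toy_bind(ToyServer *s, int id) {
    assert(id >= 0 && id < MAX_OBJECTS);
    s->dispatches++;
    s->bound = id;
}

/* Bind-to-edit style: modify whatever happens to be bound right now. */
void toy_set_bound(ToyServer *s, int value) {
    s->dispatches++;
    s->param[s->bound] = value;
}

/* DSA style: the object is named in the call itself. */
void toy_set_direct(ToyServer *s, int id, int value) {
    assert(id >= 0 && id < MAX_OBJECTS);
    s->dispatches++;
    s->param[id] = value;
}

/* Update n objects the bind-to-edit way; returns the dispatch count. */
int update_all_bound(int n) {
    ToyServer s = {0};
    for (int i = 0; i < n; i++) {
        toy_bind(&s, i);
        toy_set_bound(&s, 42);
    }
    return s.dispatches;
}

/* Update n objects the DSA way; returns the dispatch count. */
int update_all_direct(int n) {
    ToyServer s = {0};
    for (int i = 0; i < n; i++) {
        toy_set_direct(&s, i, 42);
    }
    return s.dispatches;
}
```

In this model, updating four objects costs eight dispatches with bind-to-edit and four with direct access. Real drivers are far more complicated, so treat the ratio as an illustration of the shape of the cost, not a measurement.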
The state machine approach also doesn't make a ton of sense when dealing with multiple threads; GL is still terrible in this use case but drivers often use threads behind the scenes, and a state machine requires a lot of thread synchronization or really fancy parallel algorithms/constructs to make things work reliably.
The DSA extension continues to phrase its operation in terms of state changes because it is, after all, an extension to an existing state-based document and not an entirely new API. It had to plug into the existing GL specification's language and terminology, even though that language is pretty poorly suited to its job as a modern graphics hardware API.
Please explain the reasoning behind and advantages of the new DSA.
The biggest reason is that the old way was a pain. It made it very difficult to compose libraries that might each modify or rely on GL state. Its deep procedural state-management roots also made it hard to wrap the GL API efficiently in an object-oriented or functional style, which made bindings for non-C languages difficult and made it hard to write efficient graphics device wrappers that abstract over both OpenGL and Direct3D.
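The library-composition problem can be sketched with a toy context in C. Everything here is hypothetical (`ToyContext`, the `toy_*` calls and the `lib_*` helpers are made up for illustration and are not real OpenGL). The scenario assumes an application that binds its own texture, calls a library helper, and then keeps editing "its" texture.

```c
#include <assert.h>

/* Toy model only: not real OpenGL. */
enum { MAX_TEX = 4 };

typedef struct {
    int bound;            /* the single texture bind point */
    int filter[MAX_TEX];  /* one filter setting per texture */
} ToyContext;

void toy_bind(ToyContext *c, int tex) {
    assert(tex >= 0 && tex < MAX_TEX);
    c->bound = tex;
}

/* Bind-to-edit: modifies whichever texture is currently bound. */
void toy_set_filter(ToyContext *c, int value) {
    c->filter[c->bound] = value;
}

/* DSA style: modifies the named texture, leaves the bind point alone. */
void toy_set_filter_direct(ToyContext *c, int tex, int value) {
    assert(tex >= 0 && tex < MAX_TEX);
    c->filter[tex] = value;
}

/* Library helper that forgets it is sharing the bind point. */
void lib_careless(ToyContext *c, int lib_tex) {
    toy_bind(c, lib_tex);
    toy_set_filter(c, 7);
}

/* Library helper that saves and restores the caller's binding. */
void lib_save_restore(ToyContext *c, int lib_tex) {
    int prev = c->bound;
    toy_bind(c, lib_tex);
    toy_set_filter(c, 7);
    toy_bind(c, prev);
}

/* Library helper written against the DSA-style call. */
void lib_direct(ToyContext *c, int lib_tex) {
    toy_set_filter_direct(c, lib_tex, 7);
}

/* The app binds texture 1, calls a helper that configures texture 2,
 * then sets a filter assuming texture 1 is still bound.
 * Returns the filter value that actually ended up on texture 1. */
int app_scenario(void (*helper)(ToyContext *, int)) {
    ToyContext c = {0};
    toy_bind(&c, 1);
    helper(&c, 2);
    toy_set_filter(&c, 9);
    return c.filter[1];
}
```

With `lib_careless` the application's setting lands on the library's texture and texture 1 is left untouched; the other two helpers behave correctly. In real OpenGL the save step in `lib_save_restore` is a `glGet*` query such as `glGetIntegerv` with `GL_TEXTURE_BINDING_2D`, which is yet another call into the GL server that the DSA-style helper avoids.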
Second was the procedural state-machine API overhead, as described previously.
Third, where appropriate, the DSA functions changed semantics from the old APIs in ways that allow for improved efficiency. Things that were previously mutable were made immutable, for instance, which removes a lot of book-keeping code from the GL server. Calls by the application can be dispatched to the hardware or validated sooner (or in more parallel fashions) when the GL server doesn't have to deal with mutable objects.
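As a rough illustration of the book-keeping point, the toy model below compares an object whose storage can be respecified at any time with one whose storage is fixed once. All names (`ToyTexture`, `toy_respecify`, `toy_storage`, `toy_use`) are hypothetical, and the "validation" is just a counter. Immutable texture storage in real OpenGL (`glTexStorage2D` and its DSA counterpart `glTextureStorage2D`) is one instance of the idea, but this sketch does not call it or model its actual rules.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: not real OpenGL. */
typedef struct {
    int width, height;
    bool immutable;   /* once set, storage can never be respecified */
    bool validated;   /* cached result of the last validation */
    int validations;  /* how many times validation actually ran */
} ToyTexture;

/* Mutable-style respecification: allowed until storage is immutable.
 * Any accepted change throws away the cached validation. */
bool toy_respecify(ToyTexture *t, int w, int h) {
    if (t->immutable) {
        return false;  /* rejected, nothing to re-validate */
    }
    t->width = w;
    t->height = h;
    t->validated = false;
    return true;
}

/* Immutable-style allocation: fix the size once, validate up front. */
bool toy_storage(ToyTexture *t, int w, int h) {
    if (t->immutable) {
        return false;
    }
    t->width = w;
    t->height = h;
    t->immutable = true;
    t->validations++;
    t->validated = true;
    return true;
}

/* Called once per draw: re-validate only if something changed. */
void toy_use(ToyTexture *t) {
    if (!t->validated) {
        t->validations++;
        t->validated = true;
    }
}

/* An app that respecifies a mutable texture before every draw. */
int mutable_scenario(int frames) {
    ToyTexture t = {0};
    for (int i = 0; i < frames; i++) {
        toy_respecify(&t, 64, 64);
        toy_use(&t);
    }
    return t.validations;
}

/* The same app against immutable storage: respecification is refused. */
int immutable_scenario(int frames) {
    ToyTexture t = {0};
    toy_storage(&t, 64, 64);
    for (int i = 0; i < frames; i++) {
        toy_respecify(&t, 64, 64);
        toy_use(&t);
    }
    return t.validations;
}
```

In this sketch the mutable object is validated once per frame, while the immutable one is validated exactly once, at allocation. The interesting part is less the count than the fact that the immutable path has no "did anything change?" state left to track.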
--
Additional justification and explanation is given in the EXT_direct_state_access extension specification.
--
Hardware changes that are relevant to the API design are rather numerous.
Remember that OpenGL dates back to 1991. The target hardware wasn't consumer-grade graphics cards (those didn't exist) but big CAD workstations and the like. The hardware of that era had very different performance envelopes than today; multi-threading was rarer, memory buses and CPUs had less of a speed gap, and the GPU did little more than fixed-function triangle rendering.
More and more fixed-function features were added. Various lighting models, texture modes, etc. were all added, each needing its own piece of state. The simple state-based approach worked when you had a handful of states. As more and more states were added, the API started bursting at the seams. The API became more awkward but didn't diverge too far from how the hardware worked, since the hardware really was built around a lot of state switches.
Then, along came programmable hardware. The hardware has become more and more programmable, to the point where now, the hardware supports a little state, some user-supplied programs, and a lot of buffers. All that state from the previous era had to be emulated, just as all the fixed-function features of that era were being emulated by the drivers.
Hardware also changed to be more and more parallel. This necessitated other hardware redesigns that made graphics state changes very expensive. The hardware works in big blocks of immutable state. Because of these changes, the driver couldn't simply apply each little bit of state immediately as the user set it, but had to batch the changes automatically and implicitly apply them when needed.
Modern hardware operates even further from the classic OpenGL model. DSA is one little change that was needed some 10+ years ago (it was originally promised as part of OpenGL 3.0), similar to what D3D10 did. Many of the hardware changes above need far more than just DSA to keep OpenGL relevant, which is why still more big extensions that drastically change the OpenGL model are available. Then there's the whole new GLnext API plus D3D12, Mantle, Metal, etc., not a single one of which keeps the outmoded state machine abstraction.