I don't want to repeat what has already been written in other answers. Instead I will focus on two things: one, a comparison of features, and two, a pair of examples.
I hope you will see in the end that Operating Systems and Hypervisors are closely related and also that there is no real sharp boundary, but rather a sliding scale of "Hypervisory-ness".
Comparison of Operating Systems and Hypervisors
Similarities between Operating Systems and Hypervisors
There are some things which are similar between Operating Systems and Hypervisors: both of them provide isolation between and scheduling of "tasks" (processes in the case of the OS, VMs in the case of the Hypervisor). Both of them provide resource management, resource isolation, and resource virtualization. Both of them provide memory management. Both of them provide a storage abstraction: in the case of the OS you have a hierarchical file system; in the case of the Hypervisor you have images. You can think of images as very big files, and you can think of the way images are bound to VMs as simple directories.
Differences between Operating Systems and Hypervisors
However, there are also important differences: Operating Systems manage and schedule a large number of small processes, whereas Hypervisors manage and schedule a small number of large VMs. Operating Systems manage a large number of small files in deep hierarchies, Hypervisors manage a small number of large images in flat (often single level) "directories".
Case studies
But the similarities between the two have not gone unnoticed.
You need to be able to manage your VMs somehow: start and stop them, create, configure, and delete them, download and upload them, etc. For these management / service tasks, most bare-metal Hypervisors also include some traditional OS features, such as a console, a GUI, or a web server. All of these need to be implemented by the Hypervisor.
In addition, the Hypervisor needs to be able to use the hardware of the host platform, at the very least network interfaces and storage, but for the service features possibly also keyboard, mouse, display, etc. For this, it needs drivers. On many platforms, the Hypervisor will also be – at least partially – responsible for thermal management, for which it will again need drivers to read the temperature and fan speed sensors, control the fan controller, etc.
These are all things that an OS also needs. However, hardware vendors will typically only provide drivers for the most popular Operating Systems (e.g. Windows, macOS, Linux), not for Hypervisors (e.g. ESXi). And third-party or Open Source driver developers will also typically focus on the systems with the largest installation base, which are again the popular OSs. As a result, driver support in Hypervisors often lags behind that of the "big three" OSs in availability (it takes longer for a driver to become available at all), features, performance, stability, reliability, and quality.
Think, for example, of the rapid evolution of storage solutions in recent years, with ever-changing interfaces, protocols, APIs, and features between AHCI (SATA), SCSI (SAS, Fibre Channel, iSCSI), NVMe (M.2, U.2, U.3, EDSFF), and more "exotic" ones such as Optane.
Xen
The developers of the Xen bare-metal hypervisor have recognized this and have come up with the following solution: in Xen, there is a Domain 0 (Dom0) ("Domain" is Xen's term for a VM), which is a VM with special privileges that can, for example, access hardware directly. This Dom0 runs a normal OS with some slight modifications. (Xen itself maintains support for Linux and there is partial support for NetBSD and OpenSolaris / Illumos.)
In this way, the Xen developers don't have to develop drivers for the tens of thousands of network cards, storage controllers, etc. and they don't have to develop a console, a terminal, a shell, a GUI, or a web server. All they need to do is to develop a way to pass devices from the Dom0 to the Xen Hypervisor.
This leads to some interesting questions: can you call Xen a Hypervisor if it cannot do its job independently without the help of the Dom0 (which is a full-blown OS)? What is the Hypervisor in Xen's case? Is it Xen? Is it the combination of Xen and the Dom0? Does that mean that a Hypervisor is an OS plus something extra?
It turns out this architecture leads not only to some awkward questions but also to a somewhat awkward and complicated design, where the Dom0 runs on top of the Hypervisor (i.e. it depends on the Hypervisor's services) while at the same time providing services to the Hypervisor (i.e. the Hypervisor depends on the Dom0's services).
Linux Tasks
Let's compare this to modern (para-)virtualization on Linux.
Many Operating Systems have several different types of "Scheduler Entities" which differ mostly in their "weight" and level of isolation. Most Operating Systems have some sort of concept of Processes (more heavyweight, but better isolated) and Threads (more lightweight, but less isolation, e.g. Threads of the same Process share their memory). Some Operating Systems have even more, e.g. Windows NT has Processes, Threads, and Fibers.
Linux has gone down a different route, though. When developers asked for native Kernel-scheduled Threads to be added to Linux because Processes were too heavyweight and slow, Linux's Lead Architect Linus Torvalds decreed: if Processes are too heavy, then the solution is not to add more complexity to the kernel with an additional type of Scheduler Entity (as most other OSs have done), but to make Processes lighter, thus fixing the root cause of the problem. As a result, unlike many other Unix-like Operating Systems which have both Processes and Threads in their kernels, Linux has only a single type of Kernel-Scheduled Entity (KSE): Tasks.
Tasks are the only KSEs in Linux: on a fundamental level, there are neither Processes nor Threads. Instead, when creating a Task, you pass a set of flags which define which parts of the system are shared with the parent Task and which aren't. So, by passing the appropriate set of flags, you can create a new Process (by telling the OS that you don't want to share memory) or a new Thread (by telling the OS that you do want to share memory).
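Here is a minimal sketch using the raw clone(2) wrapper from glibc (real programs would normally use fork() or pthread_create(), which are built on top of this), showing how the same call yields either a "process-like" or a "thread-like" Task depending on the flags passed:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

/* Entry point for the new Task, regardless of which flags created it. */
static int child_fn(void *arg)
{
    printf("child Task running as: %s\n", (const char *)arg);
    return 0;
}

int main(void)
{
    char *stack1 = malloc(STACK_SIZE);
    char *stack2 = malloc(STACK_SIZE);
    if (!stack1 || !stack2) { perror("malloc"); return 1; }

    /* "Process-like" Task: share nothing, just deliver SIGCHLD on exit.
     * This is roughly what fork() does. */
    pid_t proc = clone(child_fn, stack1 + STACK_SIZE, SIGCHLD, "process-like");

    /* "Thread-like" Task: share memory, filesystem info, open files, and
     * signal handlers with the parent. This is roughly what
     * pthread_create() does under the hood (it adds a few more flags). */
    pid_t thr = clone(child_fn, stack2 + STACK_SIZE,
                      CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD,
                      "thread-like");

    if (proc == -1 || thr == -1) { perror("clone"); return 1; }

    waitpid(proc, NULL, 0);
    waitpid(thr, NULL, 0);
    return 0;
}
```

In other words, fork() and pthread_create() on Linux are essentially thin wrappers around the same underlying mechanism, differing only in which flags they pass.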
This "sharing" (and, conversely, isolation) is defined through Namespaces, which are actually much more sophisticated than a simple "share / don't share" flag mechanism. First, there can be many Namespaces, and each Namespace can contain many Tasks. Second, there are many kinds of Namespaces (memory sharing itself is controlled by a separate clone flag rather than a Namespace): there are network namespaces, mount (filesystem) namespaces, process ID namespaces, user ID namespaces, UTS namespaces for hostnames and domain names, even time namespaces (allowing different Tasks to have different system clocks), etc. Third, Namespaces can be nested.
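As a minimal sketch: the unshare(2) syscall moves the calling Task into a fresh Namespace of the requested kind; here a new UTS Namespace, so a hostname change becomes invisible to the rest of the system. The hostname used below is an arbitrary example, and the program needs CAP_SYS_ADMIN (e.g. run as root):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char name[64];

    /* Leave the parent's UTS Namespace (hostname / domain name) and get
     * a private copy of it. Requires CAP_SYS_ADMIN. */
    if (unshare(CLONE_NEWUTS) == -1) {
        perror("unshare(CLONE_NEWUTS)");
        return 1;
    }

    /* "inside-namespace" is just an example name; the change is visible
     * only to Tasks in this new Namespace, not to the rest of the system. */
    sethostname("inside-namespace", strlen("inside-namespace"));

    gethostname(name, sizeof(name));
    printf("hostname as seen by this Task: %s\n", name);
    return 0;
}
```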
Using only those two features, Tasks and Namespaces, a lot of features that exist in other OSs can be implemented:
- Processes (Tasks which share everything except instruction pointer, stack, and memory),
- Threads (Tasks which share everything except instruction pointer and stack),
- `chroot` (Processes which have as the root of their filesystem namespace a sub-node of the parent namespace), or
- FreeBSD `jail`-like functionality, or
- Solaris Zone-like functionality.
The developers of OS-level Container solutions quickly realized that a group of Processes that share most of the system (except memory) with each other but none of the system with the rest of the Processes would act as an OS-level Container. The Linux Containers Project (aka LXC / LXD) uses this approach. Similarly, Application Containers can be built this way, too. Docker uses this approach.
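As a very rough sketch of the underlying kernel mechanism (not how LXC or Docker are actually implemented; real container runtimes add image management, cgroups, pivot_root(), and much more), a container-like Task can be created by combining several Namespaces in a single clone() call. The hostname and the /bin/sh invocation below are illustrative assumptions:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static int container_fn(void *arg)
{
    (void)arg;
    /* Inside the new Namespaces: this Task is PID 1 in its own PID
     * Namespace and has its own hostname and mount table. A real
     * container runtime would also pivot_root() into an image, set up
     * cgroups, drop capabilities, etc. */
    sethostname("mini-container", strlen("mini-container"));
    printf("inside the container, my PID is %d\n", getpid()); /* prints 1 */
    execlp("/bin/sh", "sh", (char *)NULL); /* illustrative choice of shell */
    perror("execlp");
    return 1;
}

int main(void)
{
    char *stack = malloc(STACK_SIZE);
    if (!stack) { perror("malloc"); return 1; }

    /* One clone() call, several Namespaces: PID, mount, and UTS.
     * Requires CAP_SYS_ADMIN (e.g. run as root). */
    pid_t pid = clone(container_fn, stack + STACK_SIZE,
                      CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS | SIGCHLD,
                      NULL);
    if (pid == -1) { perror("clone"); return 1; }

    waitpid(pid, NULL, 0);
    return 0;
}
```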
And it didn't take long until an Israeli virtualization company named Qumranet realized: hey, wait a minute, if we have these features that allow us to build anything on a sliding scale from Threads which share everything to Containers which share nothing except the kernel … what if we extend this scale just a little bit further, so that Tasks don't even share the kernel?
We get a Hypervisor with a world-class scheduler, memory management, and network stack, drivers for tens of thousands of devices, support for dozens of architectures, and an established ecosystem for remote configuration and management. And all of that practically for free!
The result of that realization is KVM, which essentially turns the Linux kernel into a bare-metal Hypervisor, but one built out of an existing OS.
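On Linux, that Hypervisor is exposed to userspace through the /dev/kvm character device and a set of ioctl()s, on top of which userspace programs such as QEMU build complete virtual machines. A minimal sketch (assuming a Linux host with KVM enabled and permission to open /dev/kvm; error handling and the actual guest memory / vCPU setup are omitted), just to show that creating a VM is an ordinary file-descriptor operation:

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    /* The in-kernel Hypervisor is just a character device. */
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) { perror("open /dev/kvm"); return 1; }

    /* The stable KVM API version (12 for many years now). */
    int version = ioctl(kvm, KVM_GET_API_VERSION, 0);
    printf("KVM API version: %d\n", version);

    /* Create an (empty) virtual machine; the result is another fd, on
     * which guest memory and vCPUs would then be set up with further
     * ioctls (KVM_SET_USER_MEMORY_REGION, KVM_CREATE_VCPU, ...). */
    int vm = ioctl(kvm, KVM_CREATE_VM, 0);
    if (vm < 0) { perror("KVM_CREATE_VM"); return 1; }
    printf("created VM, fd = %d\n", vm);

    close(vm);
    close(kvm);
    return 0;
}
```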
Compared to Xen, this solves the "awkwardness" of having the Hypervisor depend on the privileged OS and at the same time having the privileged OS depend on the Hypervisor by effectively collapsing the two into one. It solves the driver problem of something like ESXi by simply being Linux, an OS which already has drivers for almost everything under the sun. It solves the problem of having to develop APIs, tools, servers, and infrastructure for (remote) administration, management, maintenance, and configuration, since all of those tools already exist for Linux.
Just imagine how much harder it would be to build something like the Proxmox web GUI if you first had to write (or port) your own web server, port Python (or PHP / Ruby / Node.js / Java / …), a database, etc. Or consider the fact that Proxmox disk images can be stored on half a dozen different network filesystems, that Proxmox can authenticate users against half a dozen authentication providers, etc. None of this had to be developed; it already existed as part of the Linux ecosystem.
I don't know much about Hyper-V, but I believe it has a similar architecture, using the NT kernel as the Hypervisor kernel and thus getting access to NT drivers, libraries, and applications.
Conclusion
Hypervisors share many similarities with OSs. They can be regarded as a special kind of OS, or as something different but closely related. Real-world implementations show this tight relationship, with Hypervisors relying on OSs and even OSs being used as Hypervisors.