NAS and SAN in Virtualized Environments
Traditional enterprise storage was originally designed for two different use cases:
- Storing user files (Network Attached Storage or “NAS”)
- Providing fast, reliable disk storage directly to applications (Storage Area Network or “SAN”)
Twenty years ago, before virtualization, NAS was used to hold user files, public folders, archives, etc. SAN was used for databases, ERP, and other applications that either needed larger storage or higher reliability than disks attached directly to a server (direct-attached storage or “DAS”).
These systems were in wide use when VMware and other virtualization solutions started appearing on the market. Initially, VMware was used to virtualize single servers and used DAS. The physical server was a single point of failure for the virtual machines (VMs), and because VMs could not be moved between servers, you had to follow the same rule as for physical server sizing: size the server for the biggest (peak) load. Most of the time this meant running underutilized servers, but there would be headroom for the peak load when needed. Think of retail, for example: a server needs to be sized for the holiday peak, even if in July the server is woefully underutilized. In the first wave of virtualization, that did not change.
The insight that made VMware an enterprise powerhouse was that if a running VM could be moved from one server to another within a resource pool, server resources could be optimized and utilization maximized. Networks were 100Mbps and sometimes 1Gbps, which was enough to live-migrate a VM’s CPU and memory state. What was needed was a shared storage infrastructure that multiple servers could access simultaneously. VMware decided to support both NAS and SAN: NAS with NFS, and SAN using iSCSI or Fibre Channel (FC) with its own proprietary file system, VMFS, placed on top.
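To make the sizing argument concrete, here is a minimal sketch comparing servers sized individually for their own peaks with a shared resource pool sized for the combined peak. The workload names and monthly load figures are invented for illustration, and "capacity units" are arbitrary.

```python
# Hypothetical monthly load (January..December) in arbitrary capacity units.
# Both workloads and all numbers are assumptions made up for this sketch.
retail = [20] * 10 + [80, 100]    # peaks during the holiday season
finance = [100, 80] + [20] * 10   # peaks at the start of the year


def avg_utilization(load, capacity):
    """Average fraction of the provisioned capacity that is actually used."""
    return sum(load) / (len(load) * capacity)


# Without VM mobility: each server is sized for its own peak.
dedicated_capacity = max(retail) + max(finance)

# With VM mobility: the pool only has to cover the peak of the combined load.
combined = [r + f for r, f in zip(retail, finance)]
pooled_capacity = max(combined)

dedicated_util = avg_utilization(combined, dedicated_capacity)
pooled_util = avg_utilization(combined, pooled_capacity)

print(f"dedicated: capacity={dedicated_capacity}, utilization={dedicated_util:.0%}")
print(f"pooled:    capacity={pooled_capacity}, utilization={pooled_util:.0%}")
```

With these invented numbers, the dedicated servers need 200 units of capacity and average about 32% utilization, while the pool needs 120 units and averages about 53%. The benefit depends on the workloads peaking at different times; two workloads that peak together gain nothing from pooling.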
The advantage was that most enterprises already had SAN, NAS or both, so they could immediately start using VMware on their existing storage. SAN and NAS were already reliable, had certified staff, and were well understood.
So VMware took off, and so did the use of NAS and SAN storage with virtualization. While virtualization was still at small scale, this worked well. But as virtualization began to account for a higher and higher percentage of overall storage workloads, problems became evident. In fact, VMware has reported that up to 40% of its trouble tickets are storage related.
Here are the problems:
NFS and NAS:
- Designed for thousands of users and millions or billions of files. Virtualization use cases look more like hundreds of servers and thousands of VM files, which is a different scale.
- The file server use case is usually: lock the file, read it into local memory, then save it back to the file server. The VM use case is to use the file as if it were a disk, reading and writing across the file.
- File server files are many and small, and can be nested deep in folder trees (10MB is a very large file for a file server). The virtualization use case is the opposite: a relatively small number of very big files (10GB is a small VM) in a relatively shallow directory structure.
- File servers use a lot of metadata calls (e.g., lock, read attributes, set attributes, commit, lookup) to keep multiple users from accessing the same file at the same time and to maintain security across many users with different levels of access. The VM use case is almost all read/write.
- File servers typically read 100% of a file when it is opened and write 100% of it when it is closed. VMs read and write constantly across the VM’s disk files.
Most NAS systems are optimized for the file server use case rather than for VMs, which can result in poor performance, overhead from unnecessary metadata calls, etc.
NAS, however, is file-centric, which maps well to the structure of VM objects (vDisks, config, metadata, memory, etc.).
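The difference between the two access patterns described above can be sketched with ordinary local files. This is only an illustration of the I/O shape, not of NFS itself: the file names, the 4 KiB block size, and the function names are all assumptions made for the example.

```python
import os
import random
import tempfile

BLOCK = 4096  # assumed block size for the illustration


def file_server_style(path, new_contents):
    """Read the whole file once, then write the whole file back once."""
    with open(path, "rb") as f:
        f.read()                      # 100% read when the file is opened
    with open(path, "wb") as f:
        f.write(new_contents)         # 100% write when the file is saved


def vm_style(path, n_ops, seed=0):
    """Treat the file as a disk: small in-place I/O at random block offsets."""
    rng = random.Random(seed)
    blocks = os.path.getsize(path) // BLOCK
    touched = set()
    with open(path, "r+b") as f:
        for _ in range(n_ops):
            block = rng.randrange(blocks)
            f.seek(block * BLOCK)
            f.read(BLOCK)                          # read one block
            f.seek(block * BLOCK)
            f.write(bytes([block % 256]) * BLOCK)  # overwrite it in place
            touched.add(block)
    return touched


with tempfile.TemporaryDirectory() as tmp:
    doc = os.path.join(tmp, "report.doc")     # hypothetical user file
    vdisk = os.path.join(tmp, "vm01.vmdk")    # hypothetical virtual disk file
    with open(doc, "wb") as f:
        f.write(b"x" * 10_000)
    with open(vdisk, "wb") as f:
        f.write(bytes(64 * BLOCK))

    file_server_style(doc, b"y" * 12_000)
    doc_size = os.path.getsize(doc)

    touched = vm_style(vdisk, n_ops=100)
    vdisk_size = os.path.getsize(vdisk)

print(f"user file rewritten whole: {doc_size} bytes")
print(f"vdisk size unchanged: {vdisk_size} bytes, {len(touched)} distinct blocks touched")
```

The user file is transferred in full twice and is otherwise idle, while the virtual disk file stays open and receives many small reads and writes scattered across its length, with its size never changing.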
SAN challenges include:
- Designed for one application per LUN. The VM use case puts many VMs in a single LUN.
- Designed for transporting blocks quickly to disk in a FIFO manner, assuming all blocks belong to the same application. In a VM environment, the SAN loses visibility into which blocks belong to which VM, which can result in collisions and contention.
- Storage services operate at the LUN level (snap, replication, clone, etc.), which makes sense with a “one app: one LUN” paradigm. With many VMs in a single LUN, those services apply to every VM in the LUN at once rather than to an individual VM.
SAN benefits include raw speed and a design built for random read/write across the LUN, which is similar to how vDisks are accessed.
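The loss of per-VM visibility listed among the SAN challenges above can be illustrated with a small simulation. It is a sketch under simplifying assumptions: three hypothetical VMs each issue perfectly sequential reads in their own region of a shared LUN, and their requests are assumed to arrive in strict round-robin order. Real arrival patterns are messier, but the effect is the same in kind.

```python
def sequential_fraction(lbas):
    """Fraction of requests whose address immediately follows the previous one."""
    if len(lbas) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)


# Hypothetical VMs, each reading 8 consecutive blocks from its own region.
# VM names and block addresses are invented for this sketch.
vm_streams = {
    "vm-a": [0 + i for i in range(8)],
    "vm-b": [1000 + i for i in range(8)],
    "vm-c": [2000 + i for i in range(8)],
}

# The LUN sees one FIFO queue in arrival order, with no VM label attached.
# Round-robin arrival is an assumption of this sketch.
lun_queue = [lba for batch in zip(*vm_streams.values()) for lba in batch]

per_vm = {name: sequential_fraction(s) for name, s in vm_streams.items()}
at_lun = sequential_fraction(lun_queue)

for name, frac in per_vm.items():
    print(f"{name}: {frac:.0%} sequential as issued")
print(f"shared LUN queue: {at_lun:.0%} sequential as received")
```

Each VM's stream is fully sequential from its own point of view, yet the interleaved queue the LUN receives contains no sequential pairs at all. Because the array cannot tell which block belongs to which VM, it cannot untangle the streams to schedule or prioritize them per VM.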