N10-007 Compare and contrast technologies that support cloud and virtualization

Virtualization

Virtualization is widely adopted because of the cost savings and performance it provides. It can be implemented through open source solutions (such as Xen and VirtualBox) as well as proprietary solutions (such as VMware), allowing you to take a single physical device and make it appear to users as if it were a number of stand-alone entities.

Virtual Servers

Just as workstations can be virtualized, so can servers: a single server can host multiple logical machines. When one server performs the functions of many, the savings in hardware, utilities, and infrastructure can add up quickly.

Tales of security woes that can occur with attackers jumping out of one virtual machine and accessing another have been exaggerated. Although such threats are possible, most software solutions include sufficient protection to reduce the possibility to a small one.

Virtual Switches

As the name implies, a virtual switch works the same as a physical switch but allows multiple switches to exist on the same host (thus saving the implementation of additional hardware). Virtual switches are regularly used with VLAN implementations.
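
The behavior described above can be sketched in a few lines of Python. This is a toy model, not how any particular hypervisor implements its virtual switch: the class name `VirtualSwitch`, its methods, and the shortened MAC addresses are all hypothetical. It shows two points from the text: several switches can exist on one host as ordinary software objects, and each switch learns addresses and floods unknown destinations only within the sender's VLAN.

```python
from collections import defaultdict


class VirtualSwitch:
    """Toy learning switch: one MAC table per VLAN, all held in host memory."""

    def __init__(self, name):
        self.name = name
        self.port_vlan = {}                 # port -> VLAN ID (access ports only)
        self.mac_table = defaultdict(dict)  # VLAN ID -> {MAC address: port}

    def add_port(self, port, vlan):
        self.port_vlan[port] = vlan

    def receive(self, in_port, src_mac, dst_mac):
        """Return the sorted list of ports the frame is sent out of."""
        vlan = self.port_vlan[in_port]
        # Learn which port the sender is on, as a physical switch would.
        self.mac_table[vlan][src_mac] = in_port
        known_port = self.mac_table[vlan].get(dst_mac)
        if known_port is not None:
            return [] if known_port == in_port else [known_port]
        # Unknown destination: flood, but only inside the sender's VLAN.
        return sorted(p for p, v in self.port_vlan.items()
                      if v == vlan and p != in_port)


# Two independent switches on the same host are just two objects.
vs_a = VirtualSwitch("vswitch-a")
vs_b = VirtualSwitch("vswitch-b")
for port, vlan in [(1, 10), (2, 10), (3, 20)]:
    vs_a.add_port(port, vlan)

first = vs_a.receive(1, "aa:aa", "bb:bb")  # destination unknown: flood VLAN 10
reply = vs_a.receive(2, "bb:bb", "aa:aa")  # aa:aa was learned on port 1
print(first, reply)
```

The first frame is flooded only to port 2, because port 3 belongs to a different VLAN. The reply goes straight to port 1 because the switch has already learned where that address lives.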

Storage area network

A storage area network (SAN) is any high-performance network whose primary purpose is to enable storage devices to communicate with computer systems and with each other.

We think that the most interesting things about this definition are what it doesn’t say:

  • It doesn’t say that a SAN’s only purpose is communication between computers and storage. Many organizations operate perfectly viable SANs that carry occasional administrative and other application traffic.
  • It doesn’t say that a SAN uses Fibre Channel or Ethernet or any other specific interconnect technology. A growing number of network technologies have architectural and physical properties that make them suitable for use in SANs.
  • It doesn’t say what kind of storage devices are interconnected. Disk and tape drives, RAID subsystems, robotic libraries, and file servers are all being used productively in SAN environments today. One of the exciting aspects of SAN technology is that it is encouraging the development of new kinds of storage devices that provide new benefits to users. Some of these will undoubtedly fail in the market, but those that succeed will make lasting improvements in the way digital information is stored and processed.

iSCSI

iSCSI is an IP-based standard for linking data storage devices over a network and transferring data by carrying SCSI commands over IP networks. iSCSI supports a Gigabit Ethernet interface at the physical layer, which allows systems supporting iSCSI interfaces to connect directly to standard Gigabit Ethernet switches and/or IP routers. When an operating system receives a request it generates the SCSI command and then sends an IP packet over an Ethernet connection. At the receiving end, the SCSI commands are separated from the request, and the SCSI commands and data are sent to the SCSI controller and then to the SCSI storage device. iSCSI will also return a response to the request using the same protocol. iSCSI is important to SAN technology because it enables a SAN to be deployed in a LAN, WAN or MAN.
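
The request path described above (build a SCSI command, wrap it for transport, unwrap it at the receiving end) can be illustrated with a simplified sketch. The 10-byte READ(10) command layout is the standard SCSI one, but the 3-byte wrapper here is a made-up stand-in, NOT the real iSCSI PDU header, and the function names are hypothetical. In a real deployment the wrapped message would travel over a TCP/IP connection rather than stay in memory.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) opcode


def build_read10_cdb(lba, blocks):
    """Build the 10-byte SCSI READ(10) command descriptor block (CDB)."""
    # opcode, flags, 32-bit LBA, group number, 16-bit transfer length, control
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, blocks, 0)


def wrap_for_ip(cdb, lun):
    """Initiator side: prefix the CDB with a toy header (not the iSCSI format)."""
    return struct.pack(">HB", lun, len(cdb)) + cdb


def unwrap_at_target(message):
    """Target side: strip the toy header and recover the SCSI command fields."""
    lun, cdb_len = struct.unpack(">HB", message[:3])
    cdb = message[3:3 + cdb_len]
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lun": lun, "opcode": opcode, "lba": lba, "blocks": blocks}


message = wrap_for_ip(build_read10_cdb(lba=2048, blocks=8), lun=0)
command = unwrap_at_target(message)
print(len(message), command)
```

The point of the sketch is the layering: the SCSI command itself is unchanged by the trip, and only the wrapper around it is specific to the transport.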

Jumbo frame

In computer networking, jumbo frames are Ethernet frames with more than 1,500 bytes of payload, the limit set by the standard maximum transmission unit (MTU). Conventionally, jumbo frames can carry up to 9,000 bytes of payload, but variations exist and some care must be taken when using the term. Many, but not all, Gigabit Ethernet switches and Gigabit Ethernet network interface cards support jumbo frames, while most Fast Ethernet switches and Fast Ethernet network interface cards support only standard-sized frames.

Using a larger MTU value (jumbo frames) can significantly speed up your network transfers.

As one example, a 30+ GB transfer using 4k jumbo frames averaged 26.6 MB/s. The slowest drive in the link was a Western Digital WD3200JB (circa 2004, EIDE, 320 GB, ATA100, 7200 RPM, 8 MB cache). Burst speeds (for example, on 300-400 MB files) through the network were about 80% of the drive-to-drive burst speed, which is not too bad (approximately 30 MB/s over the network versus approximately 38 MB/s drive-to-drive).
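
One reason a larger MTU can help is that fewer frames are needed for the same data, so the fixed per-frame cost is paid less often. The sketch below works through that arithmetic. It counts only Ethernet-level overhead and ignores IP and TCP headers. Reading "4k" as 4,000 bytes is an assumption, and the function names are hypothetical.

```python
# Per-frame Ethernet overhead in bytes: preamble and start delimiter (8),
# header (14), frame check sequence (4), and interframe gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12


def frames_needed(total_bytes, mtu):
    """Number of frames required to carry total_bytes of payload."""
    return -(-total_bytes // mtu)  # integer ceiling division


def wire_efficiency(mtu):
    """Fraction of bytes on the wire that are payload rather than overhead."""
    return mtu / (mtu + ETHERNET_OVERHEAD)


transfer = 30 * 10**9  # a 30 GB transfer, as in the example above
for mtu in (1500, 4000, 9000):
    frames = frames_needed(transfer, mtu)
    percent = round(wire_efficiency(mtu) * 100, 2)
    print(f"MTU {mtu}: {frames} frames, {percent}% payload on the wire")
```

Going from a 1,500-byte to a 9,000-byte MTU cuts the frame count for this transfer from 20,000,000 to 3,333,334. The gain in wire efficiency is only a couple of percentage points, so much of the practical benefit likely comes from the hosts and switches handling far fewer frames.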

Fibre Channel

Fibre Channel-based networks share many similarities with other networks, but differ considerably in their absence of topology dependencies. Networks based on Token Ring, Ethernet, and FDDI are topology dependent and cannot share the same media because they have different rules for communication. The only way they can interoperate is through bridges and routers. Each uses its own media-dependent data encoding methods, clock speeds, header format, and frame length restrictions. Fibre Channel-based networks support three types of topologies: point-to-point, loop (arbitrated), and star (switched). These can be stand-alone or interconnected to form a fabric.

Network attached storage

Local area networks (LANs) made it easier to share data files among groups of desktop microcomputers. It soon became clear that LANs would be a significant step toward distributed, client/server systems.

Large-scale client/server systems were then constructed, tying sizable numbers of LANs together through wide area networks (WANs). The idea was to leverage cheap microcomputers and cheap disk storage to replace expensive (but reliable) central computers.

Network attached storage (NAS) describes technology in which an integrated storage system, such as a disk array or tape device, connects directly to a messaging network through a LAN interface, such as Ethernet, using messaging communications protocols like TCP/IP. The storage system functions as a server in a client/server relationship. It has a processor and an operating system or micro-kernel, and it processes file I/O protocols such as Network File System (NFS) to manage the transfer of data between itself and its clients.

Cloud concepts

Cloud computing is typically classified in two ways:

  • Location of the cloud computing
  • Type of services offered

Location of the cloud

Based on location, clouds are typically classified in the following four ways:

  • Public cloud: The computing infrastructure is hosted by the cloud vendor at the vendor’s premises. The customer has no visibility into or control over where the computing infrastructure is hosted. The computing infrastructure is shared among multiple organizations.
  • Private cloud: The computing infrastructure is dedicated to a particular organization and not shared with other organizations. Some experts do not consider private clouds to be real examples of cloud computing. Private clouds are generally more expensive and more secure than public clouds.

Private clouds are of two types: on-premise private clouds and externally hosted private clouds. Externally hosted private clouds are also used exclusively by one organization, but are hosted by a third party specializing in cloud infrastructure. Externally hosted private clouds are cheaper than on-premise private clouds.

  • Hybrid cloud: Organizations may host critical applications on private clouds and applications with relatively fewer security concerns on the public cloud. Using private and public clouds together is called hybrid cloud. A related term is cloud bursting. In cloud bursting, organizations use their own computing infrastructure for normal usage, but access the cloud using services like Salesforce cloud computing for high or peak load requirements. This ensures that a sudden increase in computing requirements is handled gracefully.
  • Community cloud: Computing infrastructure is shared among organizations of the same community. For example, all government organizations within the state of California may share computing infrastructure on the cloud to manage data related to citizens residing in California.
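
The cloud bursting idea from the hybrid cloud entry above can be sketched as a simple placement rule: fill the organization's own infrastructure first and send only the overflow to the public cloud. This is a minimal illustration, not a real scheduler. The function name and the abstract "capacity units" (think of them as virtual machine slots) are hypothetical.

```python
def place_workload(requested_units, private_capacity):
    """Split a workload between private infrastructure and a public cloud.

    Both arguments are abstract capacity units (hypothetical, e.g. VM slots).
    """
    on_private = min(requested_units, private_capacity)
    burst_to_public = requested_units - on_private
    return {"private": on_private, "public": burst_to_public}


normal_day = place_workload(requested_units=60, private_capacity=100)
peak_day = place_workload(requested_units=140, private_capacity=100)
print(normal_day, peak_day)
```

On the normal day everything stays on the private side. On the peak day the 40 units that exceed local capacity are the only part that bursts to the public cloud.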

Classification based upon service provided

Based upon the services offered, clouds are classified in the following ways:

  • Infrastructure as a service (IaaS) involves offering hardware-related services using the principles of cloud computing. These could include some kind of storage services (database or disk storage) or virtual servers. Leading IaaS offerings include Amazon EC2, Amazon S3, Rackspace Cloud Servers, and Flexiscale.
  • Platform as a service (PaaS) involves offering a development platform on the cloud. Platforms provided by different vendors are typically not compatible. Typical players in PaaS are Google App Engine, Microsoft Azure, and Salesforce.com’s Force.com.
  • Software as a service (SaaS) includes a complete software offering on the cloud. Users can access a software application hosted by the cloud vendor on a pay-per-use basis. This is a well-established sector. The pioneer in this field has been Salesforce.com’s offering in the online Customer Relationship Management (CRM) space. Other examples are online email providers like Google’s Gmail and Microsoft’s Hotmail, Google Docs, and Microsoft’s online version of Office called BPOS (Business Productivity Online Standard Suite).

The above classification is well accepted in the industry. David Linthicum describes a more granular classification on the basis of service provided. These are listed below:

1. Storage-as-a-service
2. Database-as-a-service
3. Information-as-a-service
4. Process-as-a-service
5. Application-as-a-service
6. Platform-as-a-service
7. Integration-as-a-service
8. Security-as-a-service
9. Management/Governance-as-a-service
10. Testing-as-a-service
11. Infrastructure-as-a-service