yoyoclouds

just another cloudy day….

HDFS Architecture

HDFS is a block-structured file system: individual files are broken into blocks of a fixed size. These blocks are stored across a cluster of one or more machines with data storage capacity. Individual machines in the cluster are referred to as DataNodes. A file can be made of several blocks, which are not necessarily stored on the same machine; the target machines that hold each block are chosen randomly on a block-by-block basis. Thus access to a file may require the cooperation of multiple machines, but HDFS supports file sizes far larger than a single-machine DFS could; individual files can require more space than a single hard drive could hold.

If several machines must be involved in the serving of a file, then a file could be rendered unavailable by the loss of any one of those machines. HDFS combats this problem by replicating each block across a number of machines (3, by default).

Most block-structured file systems use a block size on the order of 4 or 8 KB. By contrast, the default block size in HDFS is 64 MB — orders of magnitude larger. This allows HDFS to decrease the amount of metadata storage required per file (the list of blocks per file is smaller when individual blocks are larger).
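
To get a feel for the difference, consider a hypothetical 1 TB file. The quick calculation below is purely illustrative, but it shows how many block entries the file system would have to track at a conventional 4 KB block size versus HDFS’s 64 MB default.

```python
# Illustrative arithmetic: metadata entries needed to track a 1 TB file.
TB = 1024 ** 4

for label, block_size in [("4 KB", 4 * 1024), ("64 MB", 64 * 1024 ** 2)]:
    print(label, "blocks ->", TB // block_size, "block entries")

# 4 KB blocks  -> 268,435,456 block entries
# 64 MB blocks ->      16,384 block entries
```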

Furthermore, it allows for fast streaming reads of data, by keeping large amounts of data sequentially laid out on the disk. The consequence of this decision is that HDFS expects to have very large files, and expects them to be read sequentially. Unlike file systems such as NTFS or EXT, which see many very small files, HDFS expects to store a modest number of very large files: hundreds of megabytes or gigabytes each. After all, a 100 MB file is not even two full blocks. Files on your computer may also frequently be accessed “randomly,” with applications cherry-picking small amounts of information from several different locations in a file which are not sequentially laid out. By contrast, HDFS expects a program to read a block start-to-finish.

This makes it particularly useful to the MapReduce style of programming described in Module 4. That having been said, attempting to use HDFS as a general-purpose distributed file system for a diverse set of applications will be suboptimal.

Because HDFS stores files as a set of large blocks across several machines, these files are not part of the ordinary file system. Typing ls on a machine running a DataNode daemon will display the contents of the ordinary Linux file system being used to host the Hadoop services — but it will not include any of the files stored inside the HDFS. This is because HDFS runs in a separate namespace, isolated from the contents of your local files. The files inside HDFS (or more accurately: the blocks that make them up) are stored in a particular directory managed by the DataNode service, but the files are named only with block IDs. You cannot interact with HDFS-stored files using ordinary Linux file modification tools (e.g., ls, cp, mv). However, HDFS does come with its own utilities for file management, which act very similarly to these familiar tools. A later section in this tutorial will introduce you to these commands and their operation.
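
As a small taste of those utilities, the sketch below drives the hadoop fs command-line front end from Python. This is only a hedged illustration: the paths are placeholders, and it assumes the hadoop binary is on your PATH.

```python
# Minimal sketch: calling the HDFS shell utilities ("hadoop fs ...") from
# Python. Paths are placeholders; assumes the hadoop binary is on $PATH.
import subprocess

def hdfs(*args):
    """Run one 'hadoop fs' sub-command and return its raw output."""
    return subprocess.check_output(["hadoop", "fs"] + list(args))

print(hdfs("-ls", "/user/someone"))             # analogous to ls
hdfs("-put", "notes.txt", "/user/someone/")     # copy a local file into HDFS
print(hdfs("-cat", "/user/someone/notes.txt"))  # analogous to cat
hdfs("-rm", "/user/someone/notes.txt")          # remove the HDFS copy
```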

It is important for this file system to store its metadata reliably. Furthermore, while the file data is accessed in a write-once, read-many model, the metadata structures (e.g., the names of files and directories) can be modified by a large number of clients concurrently. It is important that this information never becomes desynchronized. Therefore, it is all handled by a single machine, called the NameNode. The NameNode stores all the metadata for the file system. Because of the relatively low amount of metadata per file (it only tracks file names, permissions, and the locations of each block of each file), all of this information can be stored in the main memory of the NameNode machine, allowing fast access to the metadata.

To open a file, a client contacts the NameNode and retrieves a list of locations for the blocks that comprise the file. These locations identify the DataNodes which hold each block. Clients then read file data directly from the DataNode servers, possibly in parallel. The NameNode is not directly involved in this bulk data transfer, keeping its overhead to a minimum.
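
The same two-step pattern can be seen over WebHDFS, the REST gateway that later Hadoop releases expose (it is not part of the Hadoop version this tutorial targets, so treat the sketch below as an assumption, with placeholder hostnames, user name and path): the NameNode answers a read request with a redirect to a DataNode, and the bulk data then flows straight from that DataNode.

```python
# Minimal sketch of the HDFS read path over WebHDFS (assumes WebHDFS is
# enabled; hostname, port, user, and file path are placeholders).
import requests

NAMENODE = "http://namenode.example.com:50070"
HDFS_PATH = "/user/someone/logs/part-00000"

# Step 1: the NameNode returns metadata only -- an HTTP 307 redirect whose
# Location header points at a DataNode that holds the first block.
resp = requests.get(
    NAMENODE + "/webhdfs/v1" + HDFS_PATH,
    params={"op": "OPEN", "user.name": "someone"},
    allow_redirects=False,
)
datanode_url = resp.headers["Location"]

# Step 2: the bulk data is streamed directly from the DataNode; the
# NameNode is not involved in this transfer.
data = requests.get(datanode_url).content
print(len(data), "bytes read from", datanode_url)
```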

Of course, NameNode information must be preserved even if the NameNode machine fails; there are multiple redundant systems that allow the NameNode to preserve the file system’s metadata even if the NameNode itself crashes irrecoverably. NameNode failure is more severe for the cluster than DataNode failure. While individual DataNodes may crash and the entire cluster will continue to operate, the loss of the NameNode will render the cluster inaccessible until it is manually restored. Fortunately, as the NameNode’s involvement is relatively minimal, the odds of it failing are considerably lower than the odds of an arbitrary DataNode failing at any given point in time.


HDFS Introduction

HDFS, the Hadoop Distributed File System, is a distributed file system designed to hold very large amounts of data (terabytes or even petabytes), and to provide high-throughput access to this information. Files are stored in a redundant fashion across multiple machines to ensure their durability in the face of failures and their high availability to highly parallel applications. This module introduces the design of this distributed file system and instructions on how to operate it.

A distributed file system is designed to hold a large amount of data and provide access to this data to many clients distributed across a network. There are a number of distributed file systems that solve this problem in different ways.

NFS, the Network File System, is the most ubiquitous distributed file system. It is one of the oldest still in use. While its design is straightforward, it is also very constrained. NFS provides remote access to a single logical volume stored on a single machine. An NFS server makes a portion of its local file system visible to external clients. The clients can then mount this remote file system directly into their own Linux file system, and interact with it as though it were part of the local drive.

One of the primary advantages of this model is its transparency. Clients do not need to be particularly aware that they are working on files stored remotely. The existing standard library methods like open(), close(), fread(), etc. will work on files hosted over NFS.
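
The same transparency holds for any language’s standard file APIs. In the short Python sketch below, the mount point is purely hypothetical; the point is simply that a file living on an NFS mount is read with the same call used for a local file.

```python
# Minimal sketch of NFS transparency: ordinary file APIs work unchanged on
# a path that happens to live under an NFS mount (mount point is hypothetical).
with open("/mnt/nfs_share/report.txt") as f:   # served by a remote NFS server
    print(f.readline())                        # read it like any local file
```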

But as a distributed file system, NFS is limited in its power. The files in an NFS volume all reside on a single machine. This means that it will only store as much information as can be stored on one machine, and it does not provide any reliability guarantees (e.g., replicating the files to other servers) if that machine goes down. Finally, as all the data is stored on a single machine, all the clients must go to this machine to retrieve their data. This can overload the server if a large number of clients must be handled. Clients must also always copy the data to their local machines before they can operate on it.
______________________________________________________________________________________________
HDFS is designed to be robust to a number of the problems that other DFSs, such as NFS, are vulnerable to. In particular:

* HDFS is designed to store a very large amount of information (terabytes or petabytes). This requires spreading the data across a large number of machines. It also supports much larger file sizes than NFS.

* HDFS should store data reliably. If individual machines in the cluster malfunction, data should still be available.

* HDFS should provide fast, scalable access to this information. It should be possible to serve a larger number of clients by simply adding more machines to the cluster.

* HDFS should integrate well with Hadoop MapReduce, allowing data to be read and computed upon locally when possible.
But while HDFS is very scalable, its high-performance design also restricts it to a particular class of applications; it is not as general-purpose as NFS. A number of additional design decisions and trade-offs were made with HDFS. In particular:

* Applications that use HDFS are assumed to perform long sequential streaming reads from files. HDFS is optimized to provide streaming read performance; this comes at the expense of random seek times to arbitrary positions in files.

* Data will be written to the HDFS once and then read several times; updates to files after they have already been closed are not supported. (An extension to Hadoop will provide support for appending new data to the ends of files; it is scheduled to be included in Hadoop 0.19 but is not available yet.)

* Due to the large size of files, and the sequential nature of reads, the system does not provide a mechanism for local caching of data. The overhead of caching is great enough that data should simply be re-read from the HDFS source.

* Individual machines are assumed to fail on a frequent basis, both permanently and intermittently. The cluster must be able to withstand the complete failure of several machines, possibly many happening at the same time (e.g., if an entire rack fails). While performance may degrade in proportion to the number of machines lost, the system as a whole should not become overly slow, nor should information be lost. Data replication strategies combat this problem.

The design of HDFS is based on the design of GFS, the Google File System. Its design was described in a paper published by Google.

What’s new in VMware vSphere 5?

vSphere 5 New Features

As with any major upgrade, improvements to speed, stability and scalability are some of the important “new” features.  Virtual machines can be larger in vSphere 5: virtual machines with up to one terabyte of memory and up to 32 virtual CPUs are now supported. VMware says that vSphere 5 VMs can handle over one million IOPS. vSphere 5 is also the first version of vSphere to be developed completely on ESXi, which is independent of any other operating system. ESXi’s only purpose is to run virtual machines, which means it has a very thin, optimized footprint of less than 100 MB.

However, vSphere 5 is about more than just bigger and faster.  VMware says there are over 200 new features in the latest edition of vSphere.

vSphere 5 supports three new automated functions that form the backbone of an intelligent policy management system. The idea is that administrators can configure policies that enable a “set it and forget it” approach to managing virtualization in the datacenter.

The first of these new features is Auto-Deploy. Auto-Deploy automatically deploys servers on the fly, using a PXE boot to turn on the server, install an image and then add the system’s resources into an existing pool. Auto-Deploy allows 40 servers to be deployed in 10 minutes instead of 20 hours. Once the servers are running, the Auto-Deploy policy can also automatically patch the installations.

Profile-Driven Storage groups storage according to user-defined policies. When provisioning resources, administrators select the level of service required and vSphere automatically chooses the available resources that best correspond to the selected level.

Finally, Storage DRS automatically manages the placement and balancing of VMs across storage resources according to the storage policy of each virtual machine, eliminating the need for an administrator to monitor and reallocate resources to maintain the necessary level of service.

 

VMware vSphere Storage Appliance

In an effort to reach the small and medium sized business (SMB) market, VMware also announced the VMware vSphere Storage Appliance.

One of the complexities for SMBs looking to capitalize on the power of VMware server virtualization is the implementation and management of shared storage. The VMware vSphere Storage Appliance seeks to address this concern.  Using the appliance, customers can take advantage of features like High Availability and vMotion without having to implement their own shared storage infrastructure. Instead, the vSphere Storage Appliance integrates with vSphere to pool the internal server storage and present it as shared storage, creating a virtual pool of storage without the need for external storage. Since the physical storage is spread across numerous physical servers, the storage pool acts and responds in the same manner as an array of external physical storage.

The vSphere Storage Appliance can take full advantage of the new vSphere 5 features, allowing SMBs without complex shared storage configurations to also benefit from the intelligent policy management that drives the hallmark “set it and forget it” features of vSphere 5.

Network I/O control, virtual firewall updates


As with storage management, vSphere 5 users will reportedly be able to establish networking resource pools according to pre-defined rules. The new version will also enable multi-tenancy deployment, and will bridge physical and virtual QoS by complying with a new IEEE 802.1 VLAN tagging standard.

New vSphere Pricing Model

With vSphere 5, VMware introduces a new pricing model that is no longer based on the physical server hardware. Instead, vSphere 5 is priced according to the amount of virtual resources allocated. There are now just three pricing tiers: Standard, Enterprise and Enterprise Plus.

The Standard tier allows up to 24 GB of memory allocated across all virtual machines and up to eight virtual CPUs per virtual machine. Enterprise allows 32 GB of virtual memory across all virtual machines and up to eight virtual CPUs per VM. Enterprise Plus allows up to 48 GB across all virtual machines and up to 32 virtual CPUs per VM.

The new pricing model hasn’t been warmly received, in part because existing VMware customers purchased hardware to maximize value under the vSphere 4 licensing model, where pricing was based on the number of sockets and cores in the server. In particular, the vSphere 4 Enterprise Plus license allows for unlimited memory. Companies purchased servers with huge amounts of memory tied to just one or two six-core processors, meaning they needed just one or two Enterprise Plus licenses per server.

_____________________________________________________________________________________________

VMware vSphere 5.0

  • ESXi Convergence – No more ESX, only ESXi
  • New VM Hardware: Version 8 – new hardware support (VS5 still supports VM Hardware 4 & 7 as well if you still want to migrate VMs to the old hosts)
    • 3D graphics Support for Windows Aero
    • Support for USB 3.0 devices
  • Platform Enhancements
    • 32 vCPUs per VM
    • 1TB of RAM per VM
    • 3D Graphics Support
    • Client-connected USB devices
    • USB 3.0 Devices
    • Smart-card Readers for VM Console Access
    • EFI BIOS
    • UI for Multi-core vCPUs
    • VM BIOS boot order config API and PowerCLI Interface
  • vSphere Auto Deploy – mechanism for having hosts deploy quickly when needed
  • Support for Apple Products – Support for running OSX 10.6 Server (Snow Leopard) on Apple Xserve hardware.
  • Storage DRS – Just like DRS does for CPU and Memory, now for storage
    • Initial Placement – Places new VMs on the storage with the most space and least latency
    • Load Balancing – migrates VMs if the storage cluster (group of datastores) gets too full or the latency goes too high
    • Datastore Maintenance Mode – allows you to evacuate VMs from a datastore to work on it (does not support Templates or non-registered VMs yet…)
    • Affinity & Anti-Affinity – Allows you to make sure a group of VMs do not end up on the same datastore (for performance or Business Continuity reasons) or VMs that should always be on the same datastore.  Can be at the VM or down to the individual VMDK level.
    • Support for scheduled disabling of Storage DRS – perhaps during backups for instance.
  • Profile-Driven Storage – Creating pools of storage in Tiers and selecting the correct tier for a given VM.  vSphere will make sure the VM stays on the correct tier (pool) of storage.
  • vSphere File System – VMFS5 is now available.
    • Support for a single extent datastore up to 64TB
    • Support for >2TB Physical Raw Disk Mappings
    • Better VAAI (vStorage APIs for Array Integration) Locking with more tasks
    • Space reclamation on thin provisioned LUNs
    • Unified block size (1MB)
    • Sub-blocks for space efficiency (8KB vs. 64KB in VS4)
  • VAAI now a T10 standard – All 3 primitives (Write Same, ATS and Full Copy) are now T10 standard compliant.
    • Also now added support for VAAI NAS Primitives, including Full File Clone (to have the NAS do the copy of the vmdk files for vSphere) and Reserve Space (to have the NAS create thick vmdk files on NAS storage)
  • VAAI Thin Provisioning – Having the storage do the thin provisioning and then vSphere telling the storage which blocks can be reclaimed to shrink the space used on the storage
  • Storage vMotion Enhancements
    • Now supports storage  vMotion with VMs that have snapshots
    • Now supports moving linked clones
    • Now supports Storage DRS (mentioned above)
    • Now uses mirroring to migrate vs. change block tracking in VS4.  Results in faster migration times and greater migration success.
  • Storage IO Control for NAS – allows you to throttle the storage performance of “badly-behaving” VMs and prevents them from stealing storage bandwidth from high-priority VMs.  (Support for iSCSI and FC was added in VS4.)
  • Support for VASA (vStorage APIs for Storage Awareness) – Allows storage to integrate more tightly with vCenter for management.  Provides a mechanism for storage arrays to report their capabilities, topology and current state.  Also helps Storage DRS make more educated decisions when moving VMs.
  • Support for Software FCoE Adapters – Requires a compatible NIC and allows you to run FCoE over that NIC without the need for a CNA Adapter.
  • vMotion Enhancements
    • Support for multiple NICs.  Up to 4 x 10GbE or 16 x 1GbE NICs
    • Single vMotion can span multiple NICs (this is huge for 1GbE shops)
    • Allows for higher number of concurrent vMotions
    • SDPS Support (Slow Down During Page Send) – throttles busy VMs to reduce timeouts and improve success.
    • Ensures less than 1 second switchover in almost all cases
    • Support for higher latency networks (up to ~10ms)
    • Improved error reporting – better, more detailed logging
    • Improved Resource Pool Integration – now puts VMs in the proper resource pool
  • Distributed Resource Scheduling/Dynamic Power Management Enhancements
    • Support for “Agent VMs” – These are VMs that work per host (currently mostly VMware services – vShield Edge, App, Endpoint, etc.).  DRS will not migrate these VMs
    • “Agents” do not need to be migrated for maintenance mode
  • Resource pool enhancements – now more consistent for clustered vs. non-clustered hosts.  You can no longer modify resource pool settings on the host itself when it is managed by vCenter, though changes are allowed if the host gets disconnected from vCenter
  • Support for LLDP Network Protocol – Standards based vendor-neutral discovery protocol
  • Support for NetFlow – Allows collection of IP traffic information to send to collectors (CA, NetScout, etc) to provide bandwidth statistics, irregularities, etc.  Provides complete visibility to traffic between VMs or VM to outside.
  • Network I/O Control (NETIOC) – allows creation of network resource pools, QoS Tagging, Shares and Limits to traffic types, Guaranteed Service Levels for certain traffic types
  • Support for QoS (802.1p) tagging – provides the ability to QoS-tag any traffic flowing out of the vSphere infrastructure.
  • Network Performance Improvements
    • Multiple VMs receiving multicast traffic from the same source will see improved throughput and CPU efficiency
    • VMkernel NICs will see higher throughput with small messages and better IOPs scaling for iSCSI traffic
  • Command Line Enhancements
    • Remote commands and local commands will now be the same (new esxcli commands are not backwards compatible)
    • Output from commands can now be formatted automatically (xml, CSV, etc)
  • ESXi 5.0 Firewall Enhancements
    • New engine not based on iptables
    • New engine is service-oriented and is a stateless firewall
    • Users can restrict specific services based on IP address and Subnet Mask
    • Firewall has host-profile support
  • Support for Image Builder – can now create customized ESXi CDs with the drivers and OEM add-ins that you need (like slip-streaming for Windows CDs).  Can also be used for PXE installs.
  • Host Profiles Enhancements
    • Allows use of an answer file to complete the profile for an automated deployment
    • Greatly expands the config options, including iSCSI, FCoE, Native Multipathing, Device Claiming, Kernel Module Settings & more
  • High Availability Enhancements
    • No more Primary/Secondary concept, one host is elected master and all others are slaves
    • Can now use storage-level communications – hosts can use “heartbeat datastores” in the event that network communication is lost between the hosts.
    • HA Protected state is now reported on a per-VM basis.  Certain operations, for instance power-on, no longer wait for confirmation of protection before running. The result is that VMs power on faster.
    • HA Logging has been consolidated into one log file
    • HA now pushes the HA Agent to all hosts in a cluster instead of one at a time.  Result: reduces config time for HA to ~1 minute instead of ~1 minute per host in the cluster.
    • HA User Interface now shows who the Master is, VMs Protected and Un-protected, any configuration issues, datastore heartbeat configuration and better controls on failover hosts.
  • vCenter Web Interface – Admins can now use a robust web interface to control the infrastructure instead of the GUI client.
    • Includes VM Management functions (Provisioning, Edit VM, Power Controls, Snaps, Migrations)
    • Can view all objects (hosts, clusters, datastores, folders, etc.)
    • Basic Health Monitoring
    • View the VM Console
    • Search Capabilities
    • vApp Management functions (Provisioning, editing, power operations)
  • vCenter Server Appliance – Customers no longer need a Windows license to run vCenter.  vCenter can come as a self-contained appliance
    • 64-bit appliance running SLES 11
    • Distributed as 3.6 GB; deployment range is 5 GB to 80 GB of storage
    • Included database for 5 hosts or 50 VMs (same as SQL Express in VS4)
    • Support for Oracle as the full DB (Twitter said that DB2 was also supported but I cannot confirm that in my materials)
    • Authentication through AD and NIS
    • Web-based configuration
    • Supports the vSphere Web Client
    • It does not support: Linked Mode vCenters, IPv6, SQL, or vCenter Heartbeat (HA is provided through vSphere HA)

The Eucalyptus Open Source Private Cloud

Eucalyptus is a Linux-based open-source software architecture that implements efficiency-enhancing private and hybrid clouds within an enterprise’s existing IT infrastructure.

Eucalyptus is an acronym for “Elastic Utility Computing Architecture for Linking Your Programs to Useful Systems”.

A Eucalyptus private cloud is deployed across an enterprise’s “on‐premise” datacenter infrastructure and is accessed by users over enterprise intranet. Thus sensitive data remains entirely secure from external intrusion behind the enterprise firewall.

Initially developed to support the high performance computing (HPC) research of Professor Rich Wolski’s research group at the University of California, Santa Barbara, Eucalyptus is engineered according to design principles that ensure compatibility with existing Linux-based data center installations. Eucalyptus can be deployed without modification on all major Linux OS distributions, including Ubuntu, RHEL, CentOS, and Debian. And Ubuntu distributions now include the Eucalyptus software core as the key component of the Ubuntu Enterprise Cloud.

Technology

Eucalyptus was designed from the ground up to be easy to install and as non-intrusive as possible. The software framework is highly modular, with industry-standard, language-agnostic communication. Eucalyptus is also unique in providing a virtual network overlay that both isolates the network traffic of different users and allows two or more clusters to appear to belong to the same Local Area Network (LAN). The external interface to Eucalyptus can also be leveraged to become compatible with multiple public clouds (Amazon EC2, Sun Cloud, etc.).

Eucalyptus Components

Each Eucalyptus service component exposes a well-defined language agnostic API in the form of a WSDL document containing both the operations that the service can perform and the input/output data structures. Inter-service authentication is handled via standard WS-Security mechanisms. There are five high-level components, each with its own Web-service interface, that comprise a Eucalyptus installation (Fig a). A brief description of the components within the Eucalyptus system follows.

CLOUD CONTROLLER

Cloud Controller (CLC) is the entry-point into the cloud for administrators, developers, project managers, and end users. The CLC is responsible for querying the node managers for information about resources, making high level scheduling decisions, and implementing them by making requests to cluster controllers. The CLC, as shown in Figure 1, is also the interface to the management platform. In essence, the CLC is responsible for exposing and managing the underlying virtualized resources (servers, network, and storage) via a well-defined industry standard API (Amazon EC2) and a Web-based user interface.

Functions:

  1. Monitor the availability of resources on various components of the cloud infrastructure, including hypervisor nodes that are used to actually provision the instances and the cluster controllers that manage the hypervisor nodes
  2. Resource arbitration – Deciding which clusters will be used for provisioning the instances
  3. Monitoring the running instances

In short, CLC has a comprehensive knowledge of the availability and usage of resources in the cloud and the state of the cloud.
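
Because the CLC exposes the Amazon EC2 API, any EC2-capable client library can query it. The sketch below uses the Python Boto library (the same library the euca2ools, described later, are built on) to connect to a hypothetical CLC endpoint and list the cloud’s availability zones and running instances; the hostname, credentials and port shown are placeholders, not values from a real installation.

```python
# Minimal sketch: querying a Eucalyptus Cloud Controller through its
# EC2-compatible API with Boto. Endpoint and credentials are placeholders.
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    is_secure=False,                       # assume a plain-HTTP front end
    port=8773,                             # default Eucalyptus Web-services port
    path="/services/Eucalyptus",
    region=RegionInfo(name="eucalyptus", endpoint="clc.example.com"),
)

# Availability zones map to Eucalyptus clusters (Cluster Controllers).
for zone in conn.get_all_zones():
    print(zone.name, zone.state)

# Reservations group the instances the CLC is currently tracking.
for reservation in conn.get_all_instances():
    for instance in reservation.instances:
        print(instance.id, instance.state, instance.instance_type)
```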

CLUSTER CONTROLLER 

Cluster Controller (CC) generally executes on a cluster front-end machine, or any machine that has network connectivity to both the nodes running NCs and the machine running the CLC. The CC gathers information about a set of VMs and schedules VM execution on specific NCs. The CC also manages the virtual instance network and participates in the enforcement of SLAs as directed by the CLC. All nodes served by a single CC must be in the same broadcast domain (Ethernet).

Functions:

  1. To receive requests from CLC to deploy instances
  2. To decide which NCs to use for deploying the instances on
  3. To control the virtual network available to the instances
  4. To collect information about the NCs registered with it and report it to the CLC

NODE CONTROLLER

Node Controller (NC) is executed on every node that is designated for hosting VM instances. A UEC node is a VT enabled server capable of running KVM as the hypervisor. UEC automatically installs KVM when the user chooses to install the UEC node. The VMs running on the hypervisor and controlled by UEC are called instances. Eucalyptus supports other hypervisors like Xen apart from KVM, but Canonical has chosen KVM as the preferred hypervisor for UEC.

Node Controller runs on each node and controls the life cycle of instances running on the node. The NC interacts with the OS and the hypervisor running on the node on one side and the CC on the other side.

The NC queries the operating system running on the node to discover the node’s physical resources – the number of cores, the size of memory, the available disk space – and to learn about the state of VM instances running on the node, and propagates this data up to the CC.

Functions:

  1. Collection of data related to the resource availability and utilization on the Node and reporting the data to CC
  2. Instance life cycle management

STORAGE CONTROLLER

Storage Controller (SC) implements block-accessed network storage (e.g. Amazon Elastic Block Storage — EBS) and is capable of interfacing with various storage systems (NFS, iSCSI, etc.). An elastic block store is a Linux block device that can be attached to a virtual machine but sends disk traffic across the locally attached network to a remote storage location. An EBS volume cannot be shared across instances but does allow a snap-shot to be created and stored in a central storage system such as Walrus, the Eucalyptus storage service.

Functions:

  1. Creation of persistent EBS devices
  2. Providing the block storage over AoE or iSCSI protocol to the instances
  3. Allowing creation of snapshots of volumes.

WALRUS

Walrus (put/get storage) allows users to store persistent data, organized as eventually-consistent buckets and objects. It allows users to create, delete, and list buckets; put, get, and delete objects; and set access control policies. Walrus is interface-compatible with Amazon’s S3, and supports the Amazon Machine Image (AMI) image-management interface, thus providing a mechanism for storing and accessing both the virtual machine images and user data. Walrus (also known as WS3) is a file-level storage system, as compared to the block-level storage of the Storage Controller.

To use Walrus to manage Eucalyptus VM images, you can use Amazon’s tools to store, register, and delete them from Walrus. Other third-party tools can also be used to interact with Walrus directly.
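
Because Walrus is interface-compatible with S3, generic S3 client libraries also work against it. The sketch below uses Boto’s S3 support against a hypothetical Walrus endpoint — the hostname, credentials, bucket and file names are placeholders — to create a bucket and upload an object such as an image manifest.

```python
# Minimal sketch: storing an object in Walrus through its S3-compatible API.
# Endpoint, credentials, bucket, and file names are placeholders.
import boto
from boto.s3.connection import OrdinaryCallingFormat

walrus = boto.connect_s3(
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    is_secure=False,
    host="walrus.example.com",
    port=8773,
    path="/services/Walrus",
    calling_format=OrdinaryCallingFormat(),  # path-style bucket addressing
)

bucket = walrus.create_bucket("my-images")
key = bucket.new_key("myimage.img.manifest.xml")
key.set_contents_from_filename("myimage.img.manifest.xml")  # upload the local file
print([k.name for k in bucket.list()])                      # list bucket contents
```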

Third party tools for interacting with Walrus

  1. s3curl is a command-line tool that wraps curl for use against the S3 API.
    http://open.eucalyptus.com/wiki/s3curl
  2. s3cmd is a tool that allows command line access to storage that supports the S3 API.
    http://open.eucalyptus.com/wiki/s3cmd
  3. s3fs is a tool that allows users to access S3 buckets as local directories.
    http://open.eucalyptus.com/wiki/s3fs

MANAGEMENT PLATFORM

Management Platform provides an interface to various Eucalyptus services and modules. These features can include VM management, storage management, user/group management, accounting, monitoring, SLA definition and enforcement, cloud-bursting, provisioning, etc.

EUCA2OOLS

Euca2ools are command-line tools for interacting with Web services that export a REST/Query-based API compatible with Amazon EC2 and S3 services. The tools can be used with both Amazon’s services and with installations of the Eucalyptus open-source cloud-computing infrastructure. The tools were inspired by command-line tools distributed by Amazon (api-tools and ami-tools) and largely accept the same options and environment variables. However, these tools were implemented from scratch in Python, relying on the Boto library and M2Crypto toolkit.

Features:

  1. Query of availability zones (i.e. clusters in Eucalyptus)
  2. SSH key management (add, list, delete)
  3. VM management (start, list, stop, reboot, get console output)
  4. Security group management
  5. Volume and snapshot management (attach, list, detach, create, bundle, delete)
  6. Image management (bundle, upload, register, list, deregister)
  7. IP address management (allocate, associate, list, release)
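
Several of the operations listed above can be reproduced directly with Boto, the library euca2ools are built on. The sketch below is a hedged illustration: the endpoint, credentials, image ID and key name are placeholders, and it simply walks through key creation, launching an instance, associating an address and terminating the instance.

```python
# Minimal sketch of euca2ools-style operations issued through Boto.
# Endpoint, credentials, image ID, and key name are placeholders.
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(
    aws_access_key_id="ACCESS_KEY", aws_secret_access_key="SECRET_KEY",
    is_secure=False, port=8773, path="/services/Eucalyptus",
    region=RegionInfo(name="eucalyptus", endpoint="clc.example.com"),
)

print(conn.get_all_zones())                        # availability zones (clusters)
conn.create_key_pair("mykey")                      # SSH key management
res = conn.run_instances("emi-12345678",           # VM management
                         key_name="mykey", instance_type="m1.small")
instance = res.instances[0]
addr = conn.allocate_address()                     # IP address management
conn.associate_address(instance.id, addr.public_ip)
conn.terminate_instances([instance.id])
```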

KEY BENEFITS

  • Build and manage self-service heterogeneous on-premise IaaS clouds using either existing infrastructure or dedicated compute, network and storage resources
  • Support high-availability IaaS for the most demanding cloud deployments
  • Gain precise control of private cloud resources via enterprise-ready user and group identity management along with resource quotas
  • Dynamic resource pooling with built-in elasticity allows organizations to scale up and down virtual compute, network and storage resources
  • Robust storage integration enables IT to easily connect and manage existing storage systems from within Eucalyptus clouds
  • Build hybrid clouds between on-premise Eucalyptus clouds and AWS and AWS-compatible public clouds
  • Run Eucalyptus or Amazon Machine Images as virtual cloud instances on Eucalyptus and AWS-compatible clouds
  • Leverage vibrant AWS ecosystem and management tools to manage Eucalyptus IaaS clouds
