0 - Abstract
1 - Introduction
2 - Related work / literature survey
3 - System model
4 - Implementation (algorithm)
5 - Experimental results and evaluation
Multi-Objective VM Scheduling in OpenStack Private Cloud
Cloud computing is an emerging distributed computing platform. It is an Internet-based approach to distributing and accessing resources dynamically in response to user requests. OpenStack is an open-source cloud computing platform that helps organizations deploy their own private or public cloud and manage all of its resources. Determining effective techniques for managing the cloud infrastructure and the resources of a datacentre has recently become important. Because the cloud depends mainly on virtualization technology, an effective virtual machine scheduling strategy is essential. VM scheduling can be defined as the assignment of a group of virtual machines (VMs) to a group of physical machines (PMs). Since the compute node (host) provides all the resources a VM uses, such as RAM, disk and computing power, the choice of the host on which an instance is launched is crucial. In this paper we propose a model that uses logistic regression to classify whether a host is overloaded or not overloaded, taking into account parameters such as RAM utilization, CPU utilization and the current workload of the host. The model detects overloaded hosts by dynamically generating rules from historical data of the hosts and the datacentre, updating its association functions to adapt to changes in the different types of workloads running in the cloud provider's datacentre. Experimental results on dynamic workloads show that the proposed algorithm significantly outperforms the existing strategy.
OpenStack is an open-source cloud computing software platform that has seen rapid growth in developer attention and acceptance. It is essentially a set of services working in unison to operate a cloud infrastructure. The OpenStack compute scheduler (Nova) decides the host on which a VM is to be launched. It assigns the request to the node that has the most resources available to accommodate it. It does not consider energy efficiency, and this leads to poor resource utilization. The scheduler's placement decision does not depend on the CPU load; rather, hosts with a large amount of CPU and memory are chosen, which leads to resource wastage.
Also, owing to the unexpected and significant increase in requests and data movement, cloud environments are becoming more complex. This creates a greater need for control of the underlying physical infrastructure. An important part of controlling such environments is the scheduling of virtual resources. Different kinds of virtual machines are required depending on the workload, and these services are provided according to the Service Level Agreement (SLA). Managing a large number of VM requests requires efficient resource scheduling algorithms.
Various strategies have been implemented for scheduling resources, but the key concern is that the present strategies are weak and lead to unproductive use of resources. Our aim is to provide an innovative solution to this problem. Scheduling user requests for virtual machines (instances) according to a number of factors, including the type of customers served and the characteristics of the underlying physical infrastructure, places the instances effectively with respect to resource utilization while maintaining SLA guarantees.
In a cloud computing environment, allocating resources with the least energy consumption and with efficient utilization of the existing resources is a complex task. Cloud resources are allocated to users through virtual machines (VMs). An efficient VM scheduling method is therefore necessary to make the best use of the resources and improve system performance.
In this paper, a logistic regression module is used to monitor the available resources and analyse them against the historical data of all hosts in order to identify overloaded hosts, reduce energy consumption and decrease the number of virtual machine migrations. This improves resource utilization in cloud datacentres and provides a quality of service that supports user satisfaction. Consolidating virtual machines by monitoring the hosts and detecting overloaded ones against a fixed threshold also reduces power consumption.
The rest of the paper is arranged as follows. Section 2 reviews related work on scheduling and resource management that aims to minimize resource wastage. Section 3 presents the system model and its layers in detail. Section 4 gives a complete description of the proposed algorithm. Section 5 describes the power model, the setup and the experimental results. Finally, conclusions are drawn in Section 6.
In this section, we discuss related work that proposes different strategies for managing resources so as to reduce energy and power consumption. We consider a fixed threshold to decide whether hosts in the data centre are overloaded or underloaded.
VM placement is the process of mapping instances (virtual machines) to hosts (physical machines). Because the whole framework is virtualized, VM scheduling becomes a core issue of the system. A lot of research has addressed this problem with many different strategies. Scheduling must take into account time savings, energy consumption, power consumption and load balancing.
The main objective of the OpenStack open-source project is to build a massively scalable cloud, with horizontal scalability and elasticity in mind. OpenStack consists of six main services, as shown in Table 1. We concentrate mainly on the Compute service and how it affects the other services. Compute provides all the resources, such as RAM, disk, VCPUs and computing power, to the virtual machine, together with the associated metadata in the image. Dashboard provides a web front end to Compute, whereas Neutron provides virtual networking for Compute. Block Storage provides storage volumes for Compute. The Image service can store the actual virtual disk files in Object Storage, and all the services authenticate with Identity.
In OpenStack-based clouds, scheduling is performed by the Nova service, which acts as an intermediate layer above a freely chosen hypervisor such as KVM, Xen, VMware or Hyper-V. Nova offers extensive APIs to perform and manage resource scheduling and has a pluggable design that allows the Nova scheduler to place VMs according to a chosen scheduling strategy. Scheduling is based on each host's current state, which is sent to the Nova scheduler at regular intervals as specified by the scheduling strategy.
Nova is the compute service of OpenStack and handles requests for storage and processing: when a request is generated, the Nova scheduler decides on which compute host the newly created VM is placed [11]. The scheduler can be configured in many ways. A compute host is a physical machine that runs the Nova compute service.
By default, the Nova scheduler offers three types of scheduler: Filter, Chance and Simple.
Filter Scheduler: 14 different types of filters are available and can be configured, and OpenStack allows the administrator to set the filters of their choice. First, hosts that do not meet the requirements of the filters are filtered out; the remaining hosts are then weighed.
Chance Scheduler: randomly chooses any available host, without considering any of its characteristics.
Simple Scheduler: finds the host with the least load.
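The three scheduler types described above can be illustrated with a minimal Python sketch. This is not Nova's code: the host records, their field names (free_ram_mb, load) and the function names are hypothetical, and the filter scheduler is reduced to a single RAM filter followed by a free-RAM weigher.

```python
import random

# Hypothetical host records; the field names are illustrative, not Nova's data model.
HOSTS = [
    {"name": "compute1", "free_ram_mb": 2048, "load": 0.70},
    {"name": "compute2", "free_ram_mb": 8192, "load": 0.20},
    {"name": "compute3", "free_ram_mb": 512, "load": 0.05},
]


def filter_schedule(hosts, required_ram_mb):
    """Filter, then weigh: drop hosts that cannot fit the request,
    then pick the remaining host with the most free RAM."""
    candidates = [h for h in hosts if h["free_ram_mb"] >= required_ram_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h["free_ram_mb"])


def chance_schedule(hosts, rng=random):
    """Pick any available host at random, ignoring its characteristics."""
    return rng.choice(hosts) if hosts else None


def simple_schedule(hosts):
    """Pick the host with the least load."""
    return min(hosts, key=lambda h: h["load"]) if hosts else None


print(filter_schedule(HOSTS, 1024)["name"])
print(simple_schedule(HOSTS)["name"])
```

With these sample hosts, the filter sketch discards compute3 for a 1024 MB request and prefers compute2 because it has the most free RAM, while the simple sketch picks compute3 because it carries the least load.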
Table 1. OpenStack services
Service     Function
Nova        Provides computing
Neutron     Provides networking
Keystone    Provides identity
Swift       Provides object storage
Cinder      Provides block storage
Horizon     Provides dashboard (GUI)
Default scheduling algorithms used by OpenStack
1) Worst Fit
The worst-fit algorithm allocates the largest free memory space that can hold the requested flavor, so that the portion left over is as large as possible. It is the reverse of best fit. The algorithm first searches the complete list of free memory spaces and selects the largest one that is sufficient.
The algorithm below shows an implementation of the worst-fit algorithm.
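As a minimal sketch of worst fit, assuming that each host is described only by its free RAM in megabytes and that the requested flavor is reduced to a RAM size, the selection could look as follows in Python. The function name and the host names are hypothetical.

```python
def worst_fit(free_ram_mb, requested_mb):
    """Return the name of the host with the largest free RAM that can
    hold the request, or None if no host is large enough.

    free_ram_mb: dict mapping host name -> free RAM in MB (hypothetical input).
    """
    chosen = None
    for host, free in free_ram_mb.items():
        if free < requested_mb:
            continue  # host cannot hold the requested flavor
        if chosen is None or free > free_ram_mb[chosen]:
            chosen = host  # largest sufficient space seen so far
    return chosen


print(worst_fit({"host-a": 2048, "host-b": 8192, "host-c": 4096}, 2048))
```

The whole list is scanned once, which matches the description of searching the complete list of free memory spaces before choosing.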
2) Round Robin
The Round Robin (RR) algorithm is among the most basic and widely used algorithms for scheduling resources. Time is sliced into small units, and every active process is placed in a circular queue. The algorithm aims to distribute the whole load equally. In an RR queue every task receives an identical execution time and the tasks are executed in turn, so that none of them sits idle. For VM placement, the scheduler assigns a virtual machine to a node and then moves on to place the next virtual machine on the next node. The algorithm visits every node in turn; once the last node has been used it returns to the first node, and the same process is repeated for further requests.
The algorithm below shows an implementation of the Round Robin algorithm.
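A minimal Python sketch of round-robin placement, assuming that every node can accept every VM (no capacity check) and using hypothetical VM and node names:

```python
from itertools import cycle


def round_robin(vms, nodes):
    """Assign each VM to the next node in turn, wrapping around to the
    first node after the last one. Capacity is not checked in this sketch."""
    if not nodes:
        raise ValueError("at least one node is required")
    next_node = cycle(nodes)  # endless iterator: n1, n2, ..., n1, n2, ...
    return {vm: next(next_node) for vm in vms}


placement = round_robin(["vm1", "vm2", "vm3", "vm4"], ["node1", "node2", "node3"])
print(placement)
```

Here vm4 lands on node1 again, which illustrates the wrap-around to the first node.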
Efficient Virtual Machine Scheduling Algorithm
In [2], the authors discuss the difficulties of scheduling virtual resources and virtual machines in clouds and propose the Efficient Virtual Machine Scheduling Algorithm (EVMSA), which improves the utilization of resources such as CPU, memory and disk. The authors describe the basic algorithms implemented in the OpenStack nova-scheduler (filter scheduler) and note that, in OpenStack, nova-scheduler maps nova-API calls to the appropriate OpenStack components. It runs as a daemon named nova-schedule and picks a compute, network or volume server from a set of available resources according to the scheduling algorithm in place. Factors such as load, memory, distance of the availability zone and CPU architecture matter significantly in the scheduler's decisions. Thus, nova-schedule ought to be remodelled to take into account the current state of the whole cloud infrastructure and to apply a more complex algorithm that guarantees efficient utilization.
Energy-aware Scheduling for Infrastructure Clouds
Over the years, energy has become a significant cost factor for data centres. When scheduling instances, the provider must select the physical machine on which each virtual machine is placed. Energy consumption is the main factor driving the placement decision. The authors performed a simulation to measure the difference in energy consumption caused solely by the virtual machine scheduler. An extensive (re-)evaluation of the energy-saving potential of timed instances was performed. In view of the simulation parameters, the original optimizing scheduler was adapted to achieve better outcomes in the new environment.
Fig 2. CPU Utilisation vs. Power Consumption
Fig 2 presents an experimental comparison of CPU utilisation (in percent) with power consumption. It shows that the relationship is linear, so the CPU utilisation percentage can be used to approximate power consumption when actual sensor data are not available.
When the CPU utilization in percent is available, the power consumption (in watts) of a node can be calculated using the following equation:
\[ P = CPU_{idle} + \left( CPU_{full\ load} - CPU_{idle} \right) \times \frac{CPU(\%)}{100 \times Cores} \]
where
P = power consumption (watts)
CPU_{idle} = power consumption in the idle CPU state (watts)
CPU_{full load} = power consumption in the full-load CPU state (watts)
CPU(%) = CPU utilization in percent
Cores = number of CPU cores installed.
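The linear power model can be checked with a small worked example. The Python sketch below assumes that the utilization term is CPU(%) divided by 100 × Cores, that is, that CPU(%) is summed over all cores and ranges from 0 to 100 × Cores. The wattages and the function name are hypothetical values chosen only for illustration.

```python
def estimate_power_watts(cpu_percent, idle_watts, full_load_watts, cores):
    """Estimate node power from CPU utilization with the linear model.

    Assumption: cpu_percent is summed over all cores (0 .. 100 * cores),
    so dividing by 100 * cores gives a utilization between 0 and 1.
    """
    utilization = cpu_percent / (100.0 * cores)
    return idle_watts + (full_load_watts - idle_watts) * utilization


# Hypothetical node: 100 W idle, 200 W at full load, 4 cores, 200 % total CPU.
print(estimate_power_watts(200, idle_watts=100, full_load_watts=200, cores=4))
```

Under this assumption a node at half of its total CPU capacity draws the midpoint between idle and full-load power, a node at 0 % draws the idle power, and a node at 100 × Cores percent draws the full-load power.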
The existing strategy of weighing hosts only by free RAM leads to non-optimal resource usage. Different weighing metrics should be defined for different workloads. The existing scheduler does not consider the workload characteristics of the requested instance, so there is a need for a new strategy that considers all of these characteristics when scheduling an instance.
PROPOSED SYSTEM MODULE
This section describes the proposed system for scheduling instances with an appropriate algorithm. It also describes the advantages and the requirement specifications, along with the pseudocode of the algorithms used.
The proposed system consists mainly of two modules. The first monitors the resources and periodically generates training data. The second classifies hosts as overloaded or not overloaded from the historical data, using RAM utilization, CPU utilization and workload as parameters. In (1), the cloud users submit their resource requirements according to their needs; the system then uses the historical data of the hosts to gather statistics about the physical machines, together with the current status of each host. The linear and logistic regression models are implemented as a decision-maker module, which is in charge of estimating and predicting whether a host should be considered overloaded. The module also updates the present state of all the physical machines and then optimizes the scheduling strategy. The decision takes into account parameters such as RAM utilization, CPU utilization and current workload.
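The overload classification step can be sketched in plain Python. This is a minimal illustration, not the implementation evaluated in this paper: the training history is synthetic, the three features are assumed to be RAM utilization, CPU utilization and workload scaled to the range 0 to 1, and the learning rate, epoch count and 0.5 decision threshold are arbitrary choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent.
    samples: list of feature tuples; labels: 1 = overloaded, 0 = not overloaded."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def is_overloaded(features, weights, bias, threshold=0.5):
    """Classify one host from (ram_util, cpu_util, workload)."""
    z = sum(w * xi for w, xi in zip(weights, features)) + bias
    return sigmoid(z) >= threshold


# Synthetic history (assumption): (ram_util, cpu_util, workload) per observation.
HISTORY = [
    (0.20, 0.10, 0.15), (0.30, 0.25, 0.20), (0.40, 0.35, 0.30), (0.35, 0.20, 0.25),
    (0.85, 0.90, 0.80), (0.90, 0.80, 0.95), (0.75, 0.95, 0.85), (0.95, 0.85, 0.90),
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

WEIGHTS, BIAS = train_logistic(HISTORY, LABELS)
print(is_overloaded((0.9, 0.9, 0.9), WEIGHTS, BIAS))
print(is_overloaded((0.2, 0.2, 0.2), WEIGHTS, BIAS))
```

Retraining on freshly collected monitoring data at regular intervals would correspond to the periodic generation of training data described above.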
After the overloaded hosts have been filtered out, the best-fit algorithm is applied to the remaining hosts. Best fit allocates the smallest free memory space that is just sufficient to launch the instance with the requested flavor. The algorithm first searches the complete list of free memory spaces and selects the smallest one that is sufficient, that is, the space closest in size to the requested flavor.
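The combination of overload filtering and best fit can be sketched as follows. This is a minimal Python illustration under stated assumptions: each host is a hypothetical record with a free_ram_mb value and an overloaded flag that some classifier has already set, and the requested flavor is reduced to a RAM size.

```python
def best_fit(hosts, requested_mb):
    """Among hosts that are not overloaded and can hold the request,
    return the one with the smallest free RAM; None if there is none."""
    candidates = [
        h for h in hosts
        if not h["overloaded"] and h["free_ram_mb"] >= requested_mb
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h["free_ram_mb"])


# Hypothetical hosts; the overloaded flag stands in for the classifier's decision.
hosts = [
    {"name": "host-a", "free_ram_mb": 2048, "overloaded": False},
    {"name": "host-b", "free_ram_mb": 8192, "overloaded": False},
    {"name": "host-c", "free_ram_mb": 2560, "overloaded": True},
    {"name": "host-d", "free_ram_mb": 4096, "overloaded": False},
]
print(best_fit(hosts, 2500)["name"])
```

For a 2500 MB request, host-c would be the tightest fit but is excluded as overloaded, host-a is too small, and host-d is preferred over host-b because it leaves less memory unused.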
Figure 3.1 shows the process of scheduling and launching the instances requested by the user. Initially, the user requests an instance of a particular flavor. The hosts are then filtered using the decision provided by the linear and logistic regression module. The accessible hosts are passed as input to the filters, and the hosts that cannot fulfil the specified filters are filtered out. After these hosts have been removed, best fit is applied according to the instance flavor requested by the user. The requested flavor is compared with the currently available memory, CPU utilization and workload monitored on each of the available hosts, and the best-fit scheduling algorithm then selects the host.
[1] http://se.inf.tu-dresden.de/pubs/papers/knauth2012scheduling.pdf
[2] OpenStack. https://www.openstack.org/software/. Online; accessed 02-Sept-2017.