This tutorial on Big Data and cloud computing explains how the two fit together. It covers what cloud storage is, Big Data in the cloud, the characteristics of cloud computing, cloud service and deployment models, cloud hosting and data storage, cloud computing companies and service providers, cloud infrastructure, and the advantages of and issues with cloud computing.
Introduction to Big Data and Cloud Computing
Cloud computing is the use of resources such as hardware and software that are delivered as a service over the Internet. It is built on a virtualization framework.
Cloud computing provides resources on demand, whether storage, compute, or something else. The cloud follows a pay-per-use model: you pay for the amount of resources you use.
Because the service charges only for the computing resources you use, it is valuable when you need capacity you do not own. For instance, if you want to give a client a demo on a cluster of more than 100 machines and do not have that many machines available, you can rent them from the cloud for the duration of the demo.
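To make the pay-per-use idea concrete, the sketch below estimates what a short demo on a rented cluster might cost. The function name `usage_cost` and the hourly rate are hypothetical values chosen for illustration, not any provider's real pricing.

```python
def usage_cost(machines: int, hours: float, rate_per_machine_hour: float) -> float:
    """Cost under a pay-per-use model: you pay only for machine-hours consumed."""
    if machines < 0 or hours < 0 or rate_per_machine_hour < 0:
        raise ValueError("inputs must be non-negative")
    return machines * hours * rate_per_machine_hour


# Hypothetical numbers: a 100-machine cluster rented for a 3-hour client demo
# at an assumed rate of 0.10 currency units per machine-hour.
demo_cost = usage_cost(machines=100, hours=3, rate_per_machine_hour=0.10)
print(f"Demo cost: {demo_cost:.2f}")
```

Once the demo ends and the machines are released, the cost stops accruing, which is the contrast with buying 100 machines outright.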
Cloud computing plays a significant role in the Big Data world by providing scalable, expandable infrastructure that makes practical Big Data implementations possible.
Cloud Computing and Big Data
In cloud computing, data is held in data centers and delivered to end users from there. Providers also typically offer automatic backup and recovery to support business continuity. Users generally do not know the exact physical location of the resources offered to them. All they need is a client device such as a desktop, laptop, or phone, and an Internet connection.
Cloud services are accessed through three service models:
1) Software as a Service (SaaS), e.g. Salesforce.com, Dropbox, Google Drive.
2) Platform as a Service (PaaS).
3) Infrastructure as a Service (IaaS).
Features of Cloud Computing
Let us discuss a few features of cloud computing:
a. Scalability
Scalability is achieved through distributed computing: capacity grows by adding machines and spreading the workload across them.
b. Elasticity
Users consume, and pay for, only the resources they actually need at a given moment. In cloud computing, elasticity is the degree to which a system can adapt to workload changes autonomically, so that at any time the available resources match current demand as closely as possible.
c. Resource Pooling
The same resources can be used by several organizations. The provider's resources are pooled to serve multiple customers through a multi-tenant model, with different resources dynamically assigned and reassigned according to customer demand.
d. Self-service
Users get an easy-to-use interface through which they can choose the services they need. A customer can provision computing capabilities, such as server time and network storage, on their own as needed, without requiring human interaction with the provider.
e. Low Costs
You are charged only for the computing resources you use, so you need not buy costly infrastructure. Pricing on a utility computing basis is usage-based, and fewer in-house IT skills are required for implementation.
f. Fault Tolerance
The cloud system can recover when one of its components fails to respond.
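Elasticity, as described under item b above, can be sketched as a function that resizes capacity to track demand. This is a minimal illustration rather than any provider's autoscaling algorithm; `instances_needed`, `capacity_per_instance`, `min_instances`, and `max_instances` are assumed names and parameters.

```python
import math


def instances_needed(demand: float, capacity_per_instance: float,
                     min_instances: int = 1, max_instances: int = 100) -> int:
    """Return the smallest instance count whose total capacity covers demand,
    clamped to the allowed range."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    required = math.ceil(demand / capacity_per_instance)
    return max(min_instances, min(max_instances, required))


# Hypothetical workload in requests per second over five intervals,
# with each instance assumed to handle 100 requests per second.
workload = [120, 480, 950, 300, 40]
plan = [instances_needed(d, capacity_per_instance=100) for d in workload]
print(plan)  # [2, 5, 10, 3, 1]
```

The plan grows with the spike and shrinks afterwards, so the resources in use (and billed) follow the demand curve instead of being sized for the peak.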
Cloud Deployment Models
There are two main types of cloud deployment model:
· Public cloud – When the services are offered over a network that is open for public use, it is known as a public cloud.
· Private cloud – It operates solely for a single organization, whether managed internally or by a third party, and hosted either internally or externally.
Cloud Delivery Models
Cloud services are classified as follows:
1) Infrastructure as a Service (IaaS): The complete infrastructure is provided to you. Maintenance tasks are handled by the cloud provider, and you use the infrastructure as your requirements dictate. IaaS can be used in both public and private clouds.
Examples of IaaS resources are virtual machines, load balancers, and network-attached storage (NAS).
2) Platform as a Service (PaaS): PaaS provides object storage, queuing, databases, runtimes, and so on, all obtained directly from the cloud provider. Configuring and using them is our responsibility: the provider supplies the resources, but connectivity to our database and similar tasks are up to us. Examples of PaaS are Windows Azure and Google App Engine.
3) Software as a Service (SaaS), e.g. Salesforce.com, Dropbox, Google Drive: In SaaS, we have no responsibility for the infrastructure or the platform. Users simply use an application that runs in the cloud. Setting up the infrastructure is the responsibility of the service provider. For SaaS to work, the infrastructure (IaaS) and the platform (PaaS) must be in place.
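The way responsibility shifts from customer to provider across the three models can be sketched as a split of a layered stack. The layer names and the cut points below are a simplification assumed for illustration; real offerings draw the boundaries differently.

```python
from typing import Dict, List

# Layers from the bottom of the stack to the top (an assumed simplification).
LAYERS = ["hardware", "virtualization", "operating system", "runtime", "application"]

# Assumed number of layers, counted from the bottom, managed by the provider.
PROVIDER_MANAGED = {"IaaS": 2, "PaaS": 4, "SaaS": 5}


def responsibilities(model: str) -> Dict[str, List[str]]:
    """Split the stack into provider-managed and customer-managed layers."""
    cut = PROVIDER_MANAGED[model]
    return {"provider": LAYERS[:cut], "customer": LAYERS[cut:]}


for model in PROVIDER_MANAGED:
    print(model, responsibilities(model))
```

Under these assumptions, the customer-managed list shrinks from three layers in IaaS to none in SaaS, matching the description above that SaaS users only use the application.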
Cloud for Big Data
Below are a few examples of how cloud applications are used for Big Data:
IaaS in a public cloud: Using a cloud provider’s infrastructure for Big Data services gives access to nearly limitless storage and compute power. Enterprise customers can use IaaS to build cost-effective, easily scalable IT solutions in which the cloud provider bears the complexity and cost of managing the underlying hardware. If the scale of a customer’s operations fluctuates, or the business is looking to expand, it can tap into cloud resources as and when needed rather than purchasing, installing, and integrating hardware itself.
PaaS in a private cloud: PaaS providers are starting to incorporate Big Data technologies such as Hadoop and MapReduce into their PaaS offerings, which removes the complexity of managing individual software and hardware elements. For instance, web developers can use separate PaaS environments at each stage: development, testing, and finally hosting their websites. Businesses that develop their own internal software can also use Platform as a Service, particularly to create separate, secure development and testing environments.
SaaS in a hybrid cloud: Many organizations feel the need to analyze the voice of the customer, particularly on social media. SaaS vendors provide the platform for the analysis as well as the social media data. Office software is the best example of businesses using SaaS. Tasks related to accounting, sales, invoicing, and planning can all be done through SaaS. A business may choose one piece of software that performs all of these tasks or several that each perform a different one. The software is subscribed to over the Internet and then accessed online from any computer in the office using a username and password. If required, the business can switch to software that better meets its requirements. Everyone who needs access to a particular piece of software can be set up as a user, whether that is one or two people or every employee in an organization that employs hundreds.
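The MapReduce model mentioned in the PaaS example above can be illustrated with a word count in plain Python. This is a single-process sketch of the map, shuffle, and reduce phases, not Hadoop's API; the function names are illustrative, and a real framework would run the phases in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


lines = ["big data in the cloud", "cloud storage for big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts["cloud"], counts["big"], counts["storage"])  # 2 2 1
```

Because each line is mapped independently and each key is reduced independently, the work can be spread over as many machines as the cloud makes available.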
Providers in the Big Data Cloud Market
Cloud computing companies come in all shapes and sizes. Every large software vendor either already has offerings in the cloud space or is in the process of launching them. In addition, many startups have interesting products in the cloud space. Below is a list of major cloud computing providers. Cloud providers include Google, Citrix, Netmagic, Red Hat, Rackspace, and others. Amazon is the leading cloud provider among them. Microsoft also offers cloud services under the name Azure.
IaaS cloud computing companies:
· Amazon’s offerings include S3 (data storage/file system), SimpleDB (non-relational database), and EC2 (computing servers).
· Rackspace offers Cloud Drive (data storage/file system), Cloud Sites (website hosting on the cloud), and Cloud Servers (computing servers).
· IBM’s offerings include Smart Business Storage Cloud and Computing on Demand (CoD).
· AT&T offers Synaptic Storage and Synaptic Compute as a service.
Platform as a Service cloud computing companies:
· Google’s App Engine is a development platform for applications written in Python and Java.
· Microsoft Azure offers a development platform based on .NET.
Software as a Service companies:
· Google offers a suite that includes Google Docs, Gmail, Google Calendar, and Picasa.
· IBM offers LotusLive iNotes, a web-based email service that provides messaging and calendaring capabilities to business users.
· Zoho offers online products similar to the Microsoft Office suite.
Issues in Using Cloud Services
A few important issues with cloud services are listed below:
a. Data Security
Companies must confirm that their agreement with the cloud service provider guarantees data security. Handing private data over to a third party concerns some people. Corporate executives might hesitate to take advantage of a cloud computing system because they can’t keep their company’s information under lock and key.
b. Performance
Cloud performance factors must be specified in the agreement and quantified wherever possible. Exceptions must be clearly noted. The Service-Level Agreement (SLA) should clearly state all the terms and conditions between the service user and the service provider to guarantee proper performance.
c. Compliance
Cloud services must be compatible with the compliance requirements of the business. Some companies are also concerned about regulatory issues. Market observers say that around 50 percent of people worry that they will be locked in to one cloud storage provider.
d. Legal Issues
Organizations must confirm that the physical location of the cloud’s resources does not raise any legal issues. The cloud presents a number of legal challenges around the privacy of data stored in different locations, which further increases the risk of confidentiality and privacy breaches.
e. Costs
Organizations should be aware of all the expenses involved in using the cloud and should use services carefully, because the cloud bills on a pay-per-use basis and costs accrue with every resource the company consumes.
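Quantifying performance terms, as the Performance item above recommends, often starts with translating an availability percentage into permitted downtime. The function name, the 99.9 percent target, and the 30-day period below are assumed example values, not taken from any provider's agreement.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted per period at a given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# An assumed 99.9 percent availability target over a 30-day period.
print(round(allowed_downtime_minutes(99.9), 1))  # 43.2
```

Stating the target as minutes per month makes it easier to check an SLA against what the business can actually tolerate.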