This book is written as a textbook on cloud computing for educational programs at colleges. It takes an immersive "hands-on approach", giving the reader the guidance needed to develop working code for real-world cloud applications.
It is organised into three main parts. Part I covers technologies that form the foundations of cloud computing, including virtualization, load balancing, scalability and elasticity, deployment, and replication. Part II introduces the reader to the design and programming aspects of cloud computing. It provides case studies on the design and implementation of several cloud applications in areas such as image processing, live streaming and social network analytics. Part III introduces the reader to specialised aspects of cloud computing, including cloud application benchmarking, cloud security, multimedia applications and big data analytics. It provides case studies in areas such as IT, healthcare, transportation, networking and education.
The book contains hundreds of figures and tested code samples that serve to provide a rigorous, "no hype" guide to cloud computing. Review questions and exercises are provided at the end of each chapter. The focus of the book is on getting the reader firmly on track to developing robust cloud applications on their own. Thus, readers can use the exercises to develop their own applications on cloud platforms, such as those from Amazon Web Services, Google Cloud, and Microsoft's Windows Azure. Additional support is available at the book's website: www.cloudcomputingbook.info
Arshdeep Bahga is a research scientist at Georgia Institute of Technology. His research interests include cloud computing and big data analytics. Arshdeep has authored several scientific publications in peer-reviewed journals in the areas of cloud computing and big data.
Vijay Madisetti is a professor of computer engineering at Georgia Institute of Technology. He is a Fellow of the IEEE and received the 2006 Terman Medal from the American Society for Engineering Education and HP Corporation.
Part I Introduction and Concepts
1 Introduction to Cloud Computing
  1.1 Introduction
    1.1.1 Definition of Cloud Computing
  1.2 Characteristics of Cloud Computing
  1.3 Cloud Models
    1.3.1 Service Models
    1.3.2 Deployment Models
  1.4 Cloud Services Examples
    1.4.1 IaaS: Amazon EC2, Google Compute Engine, Azure VMs
    1.4.2 PaaS: Google App Engine
    1.4.3 SaaS: Salesforce
  1.5 Cloud-based Services & Applications
    1.5.1 Cloud Computing for Healthcare
    1.5.2 Cloud Computing for Energy Systems
    1.5.3 Cloud Computing for Transportation Systems
    1.5.4 Cloud Computing for Manufacturing Industry
    1.5.5 Cloud Computing for Government
    1.5.6 Cloud Computing for Education
    1.5.7 Cloud Computing for Mobile Communication
2 Cloud Concepts & Technologies
  2.1 Virtualization
  2.2 Load Balancing
  2.3 Scalability & Elasticity
  2.4 Deployment
  2.5 Replication
  2.6 Monitoring
  2.7 Software Defined Networking
  2.8 Network Function Virtualization
  2.9 MapReduce
  2.10 Identity and Access Management
  2.11 Service Level Agreements
  2.12 Billing
3 Cloud Services & Platforms
  3.1 Compute Services
    3.1.1 Amazon Elastic Compute Cloud
    3.1.2 Google Compute Engine
    3.1.3 Windows Azure Virtual Machines
  3.2 Storage Services
    3.2.1 Amazon Simple Storage Service
    3.2.2 Google Cloud Storage
    3.2.3 Windows Azure Storage
  3.3 Database Services
    3.3.1 Amazon Relational Data Store
    3.3.2 Amazon DynamoDB
    3.3.3 Google Cloud SQL
    3.3.4 Google Cloud Datastore
    3.3.5 Windows Azure SQL Database
    3.3.6 Windows Azure Table Service
  3.4 Application Services
    3.4.1 Application Runtimes & Frameworks
    3.4.2 Queuing Services
    3.4.3 Email Services
    3.4.4 Notification Services
    3.4.5 Media Services
  3.5 Content Delivery Services
    3.5.1 Amazon CloudFront
    3.5.2 Windows Azure Content Delivery Network
  3.6 Analytics Services
    3.6.1 Amazon Elastic MapReduce
    3.6.2 Google MapReduce Service
    3.6.3 Google BigQuery
    3.6.4 Windows Azure HDInsight
  3.7 Deployment & Management Services
    3.7.1 Amazon Elastic Beanstalk
    3.7.2 Amazon CloudFormation
  3.8 Identity & Access Management Services
    3.8.1 Amazon Identity & Access Management
    3.8.2 Windows Azure Active Directory
  3.9 Open Source Private Cloud Software
    3.9.1 CloudStack
    3.9.2 Eucalyptus
    3.9.3 OpenStack
4 Hadoop & MapReduce
  4.1 Apache Hadoop
  4.2 Hadoop MapReduce Job Execution
    4.2.1 NameNode
    4.2.2 Secondary NameNode
    4.2.3 JobTracker
    4.2.4 TaskTracker
    4.2.5 DataNode
    4.2.6 MapReduce Job Execution Workflow
  4.3 Hadoop Schedulers
    4.3.1 FIFO
    4.3.2 Fair Scheduler
    4.3.3 Capacity Scheduler
  4.4 Hadoop Cluster Setup
    4.4.1 Install Java
    4.4.2 Install Hadoop
    4.4.3 Networking
    4.4.4 Configure Hadoop
    4.4.5 Starting and Stopping Hadoop Cluster
Part II Developing for Cloud
5 Cloud Application Design
  5.1 Introduction
  5.2 Design Considerations for Cloud Applications
    5.2.1 Scalability
    5.2.2 Reliability & Availability
    5.2.3 Security
    5.2.4 Maintenance & Upgradation
    5.2.5 Performance
  5.3 Reference Architectures for Cloud Applications
  5.4 Cloud Application Design Methodologies
    5.4.1 Service Oriented Architecture
    5.4.2 Cloud Component Model
    5.4.3 IaaS, PaaS and SaaS Services for Cloud Applications
    5.4.4 Model View Controller
    5.4.5 RESTful Web Services
  5.5 Data Storage Approaches
    5.5.1 Relational (SQL) Approach
    5.5.2 Non-Relational (No-SQL) Approach
6 Python Basics
  6.1 Introduction
  6.2 Installing Python
  6.3 Python Data Types & Data Structures
    6.3.1 Numbers
    6.3.2 Strings
    6.3.3 Lists
    6.3.4 Tuples
    6.3.5 Dictionaries
    6.3.6 Type Conversions
  6.4 Control Flow
    6.4.1 if
    6.4.2 for
    6.4.3 while
    6.4.4 range
    6.4.5 break/continue
    6.4.6 pass
  6.5 Functions
  6.6 Modules
  6.7 Packages
  6.8 File Handling
  6.9 Date/Time Operations
  6.10 Classes
7 Python for Cloud
  7.1 Python for Amazon Web Services
    7.1.1 Amazon EC2
    7.1.2 Amazon AutoScaling
    7.1.3 Amazon S3
    7.1.4 Amazon RDS
    7.1.5 Amazon DynamoDB
    7.1.6 Amazon SQS
    7.1.7 Amazon EMR
  7.2 Python for Google Cloud Platform
    7.2.1 Google Compute Engine
    7.2.2 Google Cloud Storage
    7.2.3 Google Cloud SQL
    7.2.4 Google BigQuery
    7.2.5 Google Cloud Datastore
    7.2.6 Google App Engine
  7.3 Python for Windows Azure
    7.3.1 Azure Cloud Service
    7.3.2 Azure Virtual Machines
    7.3.3 Azure Storage
  7.4 Python for MapReduce
  7.5 Python Packages of Interest
    7.5.1 JSON
    7.5.2 XML
    7.5.3 HTTPLib & URLLib
    7.5.4 SMTPLib
    7.5.5 NumPy
    7.5.6 Scikit-learn
  7.6 Python Web Application Framework - Django
    7.6.1 Django Architecture
    7.6.2 Starting Development with Django
    7.6.3 Django Case Study - Blogging App
  7.7 Designing a RESTful Web API
8 Cloud Application Development in Python
  8.1 Design Approaches
    8.1.1 Design Methodology for IaaS Service Model
    8.1.2 Design Methodology for PaaS Service Model
  8.2 Image Processing App
  8.3 Document Storage App
  8.4 MapReduce App
  8.5 Social Media Analytics App
Part III Advanced Topics
9 Big Data Analytics
  9.1 Introduction
  9.2 Clustering Big Data
    9.2.1 k-Means Clustering
    9.2.2 DBSCAN Clustering
    9.2.3 Parallelizing Clustering Algorithms Using MapReduce
  9.3 Classification of Big Data
    9.3.1 Naive Bayes
    9.3.2 Decision Trees
    9.3.3 Random Forest
    9.3.4 Support Vector Machine
  9.4 Recommendation Systems
10 Multimedia Cloud
  10.1 Introduction
  10.2 Case Study: Live Video Streaming App
  10.3 Streaming Protocols
    10.3.1 RTMP Streaming
    10.3.2 HTTP Live Streaming
    10.3.3 HTTP Dynamic Streaming
  10.4 Case Study: Video Transcoding App
11 Cloud Application Benchmarking & Tuning
  11.1 Introduction
    11.1.1 Trace Collection/Generation
    11.1.2 Workload Modeling
    11.1.3 Workload Specification
    11.1.4 Synthetic Workload Generation
    11.1.5 User Emulation vs. Aggregate Workloads
  11.2 Workload Characteristics
  11.3 Application Performance Metrics
  11.4 Design Considerations for a Benchmarking Methodology
  11.5 Benchmarking Tools
    11.5.1 Types of Tests
  11.6 Deployment Prototyping
  11.7 Load Testing & Bottleneck Detection Case Study
  11.8 Hadoop Benchmarking Case Study
12 Cloud Security
  12.1 Introduction
  12.2 CSA Cloud Security Architecture
  12.3 Authentication
    12.3.1 Single Sign-on (SSO)
  12.4 Authorization
  12.5 Identity & Access Management
  12.6 Data Security
    12.6.1 Securing Data at Rest
    12.6.2 Securing Data in Motion
  12.7 Key Management
  12.8 Auditing
13 Cloud for Industry, Healthcare & Education
  13.1 Cloud Computing for Healthcare
  13.2 Cloud Computing for Energy Systems
  13.3 Cloud Computing for Transportation Systems
  13.4 Cloud Computing for Manufacturing Industry
  13.5 Cloud Computing for Education
Appendix-A: Setting up Ubuntu VM
Appendix-B: Setting up Django
Bibliography
Index