Weekly Plan

Overview of the schedule

Schedule Section 2 (HPC-ABDS Technologies)
Week Topic Due
Week 1 Gaining Access to FutureSystems and Core Technologies 03/20
Week 2 The Basics of OpenStack 03/27
Week 3 Cloudmesh - Cloud Management Software 04/03
Week 4 IT Operations - Automation and Orchestration 04/20
Week 5 Virtual Clusters I (First Appearance of Hadoop) 04/29
Week 6 Virtual Clusters II (Composite Cluster with Sub-Clusters) 05/14
Week 7 Other Technologies 05/13

Week 1

Gaining Access to FutureSystems and Core Technologies

This week, you will learn how to gain access to the FutureSystems resources. Some of the lessons are prepared for beginners and cover the basics of the Linux operating system and of the collaboration tools, i.e. GitHub, Google Hangout, and remote desktop. The page also covers Unix shell scripting, SSH, and other utilities, with various exercises. Please watch the video lessons and read the online materials on this page.

Collaboration Tools
Topic Video Text Lab sessions Study Material By Lab Session Homework
Overview and Introduction 2 mins 10 mins   03/23 N/A
Google
  • Google+, Hangout, Remote Desktop
4 mins 15 mins   03/23 N/A
Shell Access 3 mins 10 mins   03/23 N/A
GitHub 18 mins 30 mins 10 mins 03/23 04/03
System Access to FutureSystems
Topic Video Text Lab sessions Study Material By Lab Session Homework
ssh-keygen 4 mins 10 mins see (a) below 03/23 04/03 see (a) below
Account Creation 12 mins 10 mins see (a) below 03/23 04/03 see (a) below
Remote Login 6 mins 10 mins see (a) below 03/23 04/03 see (a) below
PuTTY for Windows 11 mins 10 mins see (a) below 03/23 04/03 see (a) below
  • (a) Create an account on the FutureSystems Portal, upload your SSH key, and log in to india. Depending on your OS, you may or may not need to use PuTTY. Please identify a location from which you can log in via ssh. Such a location may be outside of your office.
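The key generation and login steps in (a) can be sketched as follows. This is a minimal sketch under stated assumptions, not the course's official procedure: it only builds the `ssh-keygen` and `ssh` command lines so that they can be inspected before being run by hand. The user name `albert`, the e-mail comment, the key path, and the two helper functions are hypothetical; the host name india.futuresystems.org is the one mentioned on this page. Uploading the public key still happens through the FutureSystems Portal.

```python
import shlex
from pathlib import Path


def keygen_command(key_path, comment):
    """Build an ssh-keygen call that creates an RSA key pair at key_path."""
    return ["ssh-keygen", "-t", "rsa", "-b", "4096",
            "-C", comment, "-f", str(key_path)]


def login_command(user, host, key_path):
    """Build an ssh call that logs in to host as user with the given private key."""
    return ["ssh", "-i", str(key_path), f"{user}@{host}"]


# Hypothetical values, for illustration only.
key = Path.home() / ".ssh" / "id_rsa"
commands = (
    keygen_command(key, "albert@example.org"),
    login_command("albert", "india.futuresystems.org", key),
)
for cmd in commands:
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; on Windows the same two steps would typically be done with PuTTY and its key generator instead.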
Linux Basics
Topic Video Text Lab sessions Study Material By Lab Session Homework
Overview and Introduction 4 mins 5 mins   03/23  
Shell Scripting 15 mins 30 mins 5 mins, 5 mins, 10 mins, 10 mins 03/23 04/03 all 4 Labs
Editors
  • Emacs, vi, and nano
5 mins 30 mins see (b) below 03/23 04/03 see (b) below
Python
  • virtualenv, PyPI
27 mins 1 hour 30 mins 03/23 04/03
Package Managers
  • yum, apt-get, and brew
3 mins 10 mins see (c) below 03/23 04/03 see (c) below
Advanced SSH
  • SSH Config and Tunnel
3 mins 20 mins 5 mins, 5 mins 03/23 04/03 both Labs
Modules 3 mins 10 mins 5 mins 03/23 04/03
  • (b) Find an editor that you will use for your programming. For advanced Python programming we recommend PyCharm; however, you can probably only use it on your local computer. One way to use it is to edit Python code locally, check the code into GitHub, and check it out in your VM or your login on india.futuresystems.org. This is how many of us work.
  • (c) Locate a package and install it on the VM that you started with OpenStack. Provide verification that the package was installed (a log). Do not forget to delete the VM after you are done. Which package manager is used on Ubuntu?
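For (c), one way to find out which package manager a machine offers, and to produce a verification log, is sketched below. It is a minimal sketch under stated assumptions: the helper names are hypothetical, the package name `emacs` is an arbitrary example, and the query commands (`dpkg -s`, `rpm -q`, `brew list`) are the usual companions of apt-get, yum, and brew rather than anything prescribed by the lesson.

```python
import shutil

# Package managers named in this lesson, each paired with a command that
# reports whether a given package is installed.
QUERY_COMMANDS = {
    "apt-get": ["dpkg", "-s"],
    "yum": ["rpm", "-q"],
    "brew": ["brew", "list"],
}


def detect_package_manager(which=shutil.which):
    """Return the first known package manager found on PATH, or None."""
    for name in QUERY_COMMANDS:
        if which(name) is not None:
            return name
    return None


def verification_command(manager, package):
    """Build the command whose output can serve as an installation log."""
    return QUERY_COMMANDS[manager] + [package]


manager = detect_package_manager()
if manager is None:
    print("no known package manager found")
else:
    # "emacs" is only an example package name.
    print(manager, "->", " ".join(verification_command(manager, "emacs")))
```

Running this inside the VM shows which of the three tools is present there; the printed query command can then be run and its output saved as the requested log.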
Length of the lessons in Week 1
  • Total of video lessons: 2 hours
  • Total of study materials: 4 hours and 30 minutes
  • Total of lab sessions: 1 hour and 30 minutes

Week 2

Introduction to OpenStack and Public Clouds

OpenStack is an open-source cloud computing software platform and a community-driven project. You can use OpenStack to build a cloud infrastructure in your public or private network, or you can simply use the cloud software for your services. This week's lessons are designed to let you try OpenStack and to give you confidence in, and an understanding of, IaaS cloud platforms. There are tutorial lessons that explore the OpenStack web dashboard (Horizon) and compute engine (Nova), as well as public clouds, e.g. Amazon EC2 and Microsoft Azure.

Basics of OpenStack
Topic Video Text Lab sessions Study Material By Lab Session Homework
Introduction and Overview 12 mins 10 mins   03/30  
OpenStack for Beginners 27 mins     03/30  
– Compute Engine (Nova)   1 hour 30 mins 03/30 04/10
– Web Dashboard (Horizon)   15 mins 15 mins 03/30 04/10
Storage (Swift) 3 mins 10 mins   03/30  
Network (Neutron) 3 mins 10 mins   03/30  
Introduction to OpenStack Juno Release 2 mins 10 mins   03/30  
Other IaaS Platforms - Public Commercial Clouds
Topic Video Text Lab sessions Study Material By Lab Session Homework
Amazon Web Services (AWS) 16 mins 30 mins 45 mins (optional, not required) 03/30  
Microsoft Azure 29 mins 50 mins 10 mins (optional, not required) 03/30  
Additional (optional) Further Study Materials
Topic Video Text Lab sessions Study Material By Lab Session Homework
OpenStack for Beginners
  • Compute Engine (Nova)
  2 hours 50 mins Not due Not due
Other IaaS Platforms
  • Public Commercial Clouds
    • Microsoft Azure
    50 mins Not due Not due
Length of the lessons in Week 2
  • Total of video lessons: 1 hour and 30 minutes
  • Total of study materials: 3 hours and 15 minutes
  • Total of lab sessions: 1 hour and 40 minutes

Week 3

Cloudmesh - Cloud Management Software

Cloudmesh is cloud resource management software written in Python. It automates launching multiple VM instances across different cloud platforms, including Amazon EC2, Microsoft Azure Virtual Machine, HP Cloud, OpenStack, and Eucalyptus. The Cloudmesh web interface helps users and administrators manage their cloud resources and builds on technologies such as Apache Libcloud, Celery, IPython, Flask, Fabric, Docopt, YAML, MongoDB, and Sphinx. Command-line tools and REST APIs are also supported.

Basics of Cloudmesh
Topic Video Text Lab sessions Study Material By Lab Session Homework
Introduction and Overview 29 mins 30 mins   04/06 Not due
Cloudmesh for Beginners
Topic Video Text Lab sessions Study Material By Lab Session Homework
Installation on a local machine (optional) 18 mins 30 mins (not required, only read the text and watch the video) 04/06 N/A
Installation on a virtual machine OpenStack 33 mins 30 mins follow the text and video 04/06 04/17
Command Line Tools (CLI) 12 mins 30 mins use the previously created VM and follow text and video; use cm help and review man pages 04/06 04/17
Web Interface (GUI) 16 mins 30 mins Exercise 4: 20 mins (optional) 04/06 04/17
Python APIs 15 mins 30 mins Exercise 1 (10 mins), Exercise 2 (10 mins) 04/06 04/17
IPython on Cloudmesh (optional) 15 mins 20 mins (not required, only read text and watch video) 04/06 N/A
Advanced Cloudmesh
Topic Video Text Lab sessions Study Material By Lab Session Homework
Adding new Commands via a Python Package 5 mins 5 mins 1 hour 04/06 04/17
Virtual Clusters with Cloudmesh
  • SSH Connections between nodes, Host Configuration
5 mins 20 mins see text and video 04/06 04/17
Length of the lessons in Week 3
  • Total of video lessons: 2 hours and 33 minutes
  • Total of study materials: 4 hours and 15 minutes
  • Total of lab sessions: 1 hour and 30 minutes

Week 4

This week, you will learn about open-source configuration management (CM) software as part of IT automation and orchestration. We focus on Ansible and OpenStack Heat for system configuration and management, and introduce Salt, Puppet, Chef, and Juju so that you can explore other tools as well. By comparing the features of these tools, you will see which one suits your system environment and learn basic CM techniques. A few lab sessions provide hands-on experience with deploying and configuring applications on IT infrastructure.

IT Operations - Automation and Orchestration

DevOps Tools
Topic Video Text Lab sessions Study Material By Lab Session Homework
Ansible 17 mins 1.5 hours 30 mins 04/21 04/24
SaltStack   1.5 hours 10 mins (optional)    
Puppet   1 hour 20 mins (optional)    
Chef 35 mins 1 hour 30 mins (optional) 04/21  
OpenStack Heat 20 mins 1 hour 1 hour 04/21 04/24
Ubuntu Juju   30 mins 10 mins (optional)    
Length of the lessons in Week 4
  • Total of video lessons: 1 hour and 12 minutes
  • Total of study materials: 2.5 hours
  • Total of lab sessions: 1 hour and 30 minutes
Additional (optional) Lessons
  • Total of optional study materials: 4 hours
  • Total of optional lab sessions: 1 hour and 10 minutes

Week 5

This week, you will learn the basics of virtual clusters. Typically, analyzing large data sets containing unstructured data types requires distributed computing resources that deliver high performance, scalability, and availability. With virtualization technology, cluster computing can be more flexible, effective, and cost-efficient in terms of resource utilization. Three basic tutorials cover deploying a virtual cluster, a Hadoop cluster, and a MongoDB sharded cluster; they give you a chance to gain experience in setting up virtual clusters manually and in configuring software with Cloudmesh. Advanced topics of virtual clusters are discussed in Week 6.

Virtual Clusters I

First Appearance of Hadoop

Virtual Clusters I
Topic Video Text Lab sessions Study Material By Lab Session Homework
Introduction and Overview 4 mins   see video 04/29  
Dynamic Deployment of Arbitrary X Software on Virtual Cluster 4 mins   see video 04/29  
Deploying Virtual Cluster with Cloudmesh 22 mins 30 mins 10 mins (optional) 04/29  
Deploying Hadoop Cluster   45 mins 20 mins (optional) 04/29  
Deploying Hadoop Cluster with Cloudmesh   30 mins see text 04/29  
Hadoop Example: Word Count 33 mins 1 hour see video and text 04/29  
Deploying MongoDB Sharded Cluster 4 mins 1 hour see video and text 04/29  
``cluster`` Cloudmesh Command for Virtual Clusters
  • SSH Connections between nodes, Host Configuration
5 mins 20 mins (repeated practice) 20 mins 04/29 05/01
Length of the lessons in Week 5
  • Total of video lessons: 1 hour and 12 minutes
  • Total of study materials: 4 hours and 05 minutes
  • Total of lab sessions: 50 minutes
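The totals above can be checked by adding up the table columns. The following is a small sketch of that arithmetic; the helper `format_minutes` is hypothetical, the minute values are copied from the Virtual Clusters I table, entries such as "see video" are left out, "1 hour" is counted as 60 minutes, and optional lab sessions are included in the lab total.

```python
def format_minutes(total):
    """Render a minute count in the style used for the weekly totals."""
    hours, minutes = divmod(total, 60)
    if hours == 0:
        return f"{minutes} minutes"
    unit = "hour" if hours == 1 else "hours"
    return f"{hours} {unit} and {minutes:02d} minutes"


# Minutes per row, taken from the Virtual Clusters I table.
video = [4, 4, 22, 33, 4, 5]
text = [30, 45, 30, 60, 60, 20]
labs = [10, 20, 20]  # includes the two optional lab sessions

for label, column in (("video lessons", video),
                      ("study materials", text),
                      ("lab sessions", labs)):
    print(f"Total of {label}: {format_minutes(sum(column))}")
```

The three printed lines reproduce the Week 5 totals listed above.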

Week 6

Virtual Clusters II: Composite Cluster with Sub-Clusters

Virtual Clusters II
Topic Video Text Lab sessions Study Material By Lab Session Homework
Composite Cluster with Sub-Clusters (Not taught in this class)
  • Introduction and Overview
  • Creating a Cross Resource Virtual Cluster
Not taught in this class Not taught in this class      
Apache Hadoop YARN 34 mins 1 hour   05/14  
Apache ZooKeeper 40 mins 1 hour   05/14  
Open MPI Virtual Cluster
  • Introduction and Overview
  • HPC Stack - MPI
  • Cloudmesh HPC (Not taught in this class)
  1 hour   05/14  
HPC Queuing System (optional) 8 mins (optional) 1 hour (optional)   05/14  
MongoDB Virtual Cluster (repeated lesson)
  • Introduction and Overview
  • Sharded MongoDB
4 mins 1 hour   05/14  
Length of the lessons in Week 6
  • Total of video lessons: 1 hour and 26 minutes
  • Total of study materials: 5 hours

Week 7

Other Technologies (under preparation)

Other Technologies
Topic Video Text Lab sessions Study Material By Lab Session Homework
Docker Basics   1 hour   05/21  
VM Software - Vagrant Not yet available 30 mins   05/13 05/15
Hadoop MRv2   1 hour      
Hadoop MRv2 with Cloudmesh ``launcher``   30 mins      
Apache ZooKeeper (repeated lesson) 40 mins 1 hour   05/21  
Apache Big Data Stack (ABDS)
  • Apache ZooKeeper
  • Apache Storm
  • Apache Mesos
  • Apache HBase
  • Apache Spark
  • Apache Pig
  • Apache Hive
Not yet available Not yet available   05/13 05/15
Glossary Not yet available Not yet available   05/13 05/15