I524 Lectures


This page is under construction, but most lectures are already available. All tracks will change considerably. If you want to work ahead, start with the theory track.


At the end of the page, you will find a link to unreleased lectures.

Based on our experience with residential and online classes, for the first time we will not require you to watch the class videos at a particular time once they are released. The danger is that you do not watch them at all and cheat yourself out of the lessons this class offers. It also means you must assemble your own schedule for watching the videos and manage it on GitHub as part of a README.md file in your git repository. You need to complete the technology track, the communications track, and the theory track.

Theory Track:
Some lectures are designed to introduce you to a number of technologies. These lectures are more theoretical in nature and do not require much hands-on activity, so you can start them at any time.
Collaboration Track:
These lectures provide the tools for you to collaborate with your peers and with instructors.
Systems Track:
These lectures cover topics that are fundamental to executing your project.
Technology Track:
These lectures have strong technology content and introduce you to a selected number of technologies used in the class. You are expected to use them as part of the project. Instead of slowing you down with graded homework, we expect you to learn these technologies and reuse them in the project. Starting the project two weeks before the semester ends would be a big mistake; you will not succeed. You must start your project in the first month of the course. Progress is reported monthly: the report is updated and a snapshot is taken every month. We will monitor your progress and include it in the discussion grade. Residential students have no reason not to provide a monthly update. For online students, a valid update would be: “I changed my company and could not work on the project due to moving”. This will give you some points if submitted on time. However, if you submit nothing, we will not issue any points.


Lectures - Theory Track

Theory Track
Topic Description Resources Length Available
Overview Course Overview Slides   Jan 1
  Class Overview - Part 1 Video 11:29 Jan 1
  Class Overview - Part 2 Video 04:10 Jan 1
  Class Overview - Part 3 Video 12:41 Jan 1
Web Page Course Web Page Web Page   Jan 1
  Class Web Page - Part 1 Video 11:25 Jan 1
  Class Web Page - Part 2 Video 17:31 Jan 1
TechList.1 TechList.1 Web Page Web Page   Jan 1
  TechList.1 Homework Video 40:08 Jan 14
Introduction Course Introduction Slides   Jan 14
  Introduction Video 0:13:59 Jan 14
  Introduction - Real World Big Data Video 0:15:28 Jan 14
  Introduction - Basic Trends and Jobs Video 0:10:57 Jan 14
Access Patterns Data Access Patterns and Introduction to using HPC-ABDS Slides   Jan 14
  Introduction to HPC-ABDS Software and Access Patterns Video - Resource 1 0:27:45 Jan 14
  Science Examples (Data Access Patterns) Video - Resource 2 0:18:38 Jan 14
  Remaining General Access Patterns Video 0:11:26 Jan 14
  Summary of HPC-ABDS Layers 1 - 6 Video 0:14:32 Jan 14
  Summary of HPC-ABDS Layers 7 - 13 Video 0:30:52 Jan 14
  Summary of HPC-ABDS Layers 14 - 17 Video 0:28:02 Jan 14
  Final Part Summary of Stack Video 0:20:20 Jan 14
Application Structure Big Data Application Structure Slides   Jan 14
  NIST Big Data Sub Groups Video 0:23:25 Jan 14
  Big Data Patterns - Sources of Parallelism Video 0:23:51 Jan 14
  First and Second Set of Features Video 0:18:26 Jan 14
  Machine Learning Aspect of Second Feature Set and the Third Set Video 0:18:38 Jan 14
Application Aspects Aspects of Big Data Applications Slides   Jan 14
  Other sources of use cases and Classical Databases/SQL Solutions Video 0:16:50 Jan 14
  SQL Solutions - Machine Learning Example - and MapReduce Video 0:18:49 Jan 14
  Clouds vs HPC - Data Intensive vs. Simulation Problems Video 0:20:26 Jan 14
Applications Big Data Applications and Generalizing their Structure Slides   Jan 14
  NIST UseCases and Image Based Applications Examples I Video 0:25:20 Jan 14
  Image Based Applications II Video 0:15:23 Jan 14
  Internet of Things Based Applications Video 0:25:25 Jan 14
  Big Data Patterns - the Ogres and their Facets I Video 0:22:44 Jan 14
  Facets of the Big Data Ogres II Video 0:15:09 Jan 14
Other More of Software Stack Video 0:24:00 Jan 14

Lectures - Collaboration Track

Collaboration Track
Topic Description Resources Length Available
Organization Lessons vs Lectures Web Page   Jan 1
Piazza Information about Piazza PDF   Jan 1
Web Page Contributing to the Web Page Web Page   Jan 1
GitHub Overview and Introduction Web Page   Jan 1
  Install Instructions Web Page   Jan 1
  config Video 2:47 Jan 1
  fork Video 1:41 Jan 1
  checkout Video 3:11 Jan 1
  pull Video 4:26 Jan 1
  branch Video 2:25 Jan 1
  merge Video 4:50 Jan 1
  rebase Video 4:20 Jan 1
  GUI Video 3:47 Jan 1
  Windows - unsupported Video 1:25 Jan 1
Paper How to write a paper by Simon Peyton Jones Video 34:24 Jan 1
  LaTeX - Overview of LaTeX Resources Web Page   Jan 1
  (optional) ShareLaTeX Video 8:49 Jan 1
  JabRef Video 14:41 Jan 1
  BibTeX Web Page   Jan 1
  Report Format Web Page - Git - PDF   Jan 1
  Advice based on paper 1 submissions Web Page   Mar 30
RST (Draft) reStructuredText Web Page   Jan 1

Lectures - Systems Track

Systems Track
Topic Description Resources Length Available
Ubuntu Development OS for the class Web Page   Jan 7
VirtualBox VirtualBox for class Web Page   Jan 7
  Installation of Ubuntu in VirtualBox Video   Jan 7
  Installation of guest additions in VirtualBox Video   Jan 7
Shell Linux Shell Video | Web Page   Jan 7
Python Introduction to Python Web page   Jan 7
  Python for Big Data Web Page   Jan 7
  Python CMD Web Page   Jan 7
  Python CMD5 Web Page   Mar 30
  Cloudmesh.rest Web Page   Mar 31
  (Optional) Python pyenv Web Page   Feb 24
  (Draft - Advanced) Python Fingerprint example Web page   Jan 7
  PyCharm Video   Jan 7
Refcards Reference cards Web Page   Jan 7
Emacs (Optional) Useful Emacs commands Web Page   Jan 7
Cloudmesh Client Installation of Cloudmesh Client Video | Web Page   Feb 14
  Setting up a cluster and Hadoop / Pig / Spark with Cloudmesh Video | Web Page   Mar 20
  Use only one public IP address Web Page   Apr 11
Ansible Starting Point Video | Web Page   Feb 14
  Roles and Others Video | Web Page   Feb 14
  Ansible Galaxy Web Page   Feb 14

Unreleased Lectures

A list of unreleased lectures that we are currently working on is available here: Unreleased Lectures