Big Data video edition

Video description

"Transcends individual tools or platforms. Required reading for anyone working with big data systems."
Jonathan Esterhazy, Groupon

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this Video Editions book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

Inside:

Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills

This Video Editions book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

"A comprehensive, example-driven tour of the Lambda Architecture with its originator as your guide."
Mark Fisher, Pivotal

"Contains wisdom that can only be gathered after tackling many big data projects. A must-read."
Pere Ferrera Bertran, Datasalt

"The de facto guide to streamlining your data pipeline in batch and near-real time."
Alex Holmes, author of "Hadoop in Practice"

NARRATED BY MARK THOMAS AND CHRIS PENICK

A NEW PARADIGM FOR BIG DATA

Chapter 1. A new paradigm for Big Data

Chapter 1. Scaling with a traditional database

Chapter 1. NoSQL is not a panacea

Chapter 1. The problems with fully incremental architectures

Chapter 1. Lambda Architecture

Chapter 1. Batch and serving layers satisfy almost all properties

Chapter 1. Recent trends in technology

PART 1 BATCH LAYER

Chapter 2. Data model for Big Data

Chapter 2. Data is raw

Chapter 2. Data is immutable

Chapter 2. The fact-based model for representing data

Chapter 2. Graph schemas

Chapter 3. Data model for Big Data: Illustration

Chapter 3. Tying everything together into data objects

Chapter 4. Data storage on the batch layer

Chapter 4. Storing a master dataset with a distributed filesystem

Chapter 5. Data storage on the batch layer: Illustration

Chapter 5. Data storage in the batch layer with Pail

Chapter 5. Storing the master dataset for SuperWebAnalytics.com

Chapter 6. Batch layer

Chapter 6. Recomputation algorithms vs. incremental algorithms

Chapter 6. Scalability in the batch layer

Chapter 6. Low-level nature of MapReduce

Chapter 6. Pipe diagrams: a higher-level way of thinking about batch computation

Chapter 7. Batch layer: Illustration

Chapter 7. An introduction to JCascalog

Chapter 7. Grouping and aggregators

Chapter 7. Composition

Chapter 8. An example batch layer: Architecture and algorithms

Chapter 8. Workflow overview

Chapter 8. Deduplicate pageviews

Chapter 9. An example batch layer: Implementation

Chapter 9. URL normalization

PART 2 SERVING LAYER

Chapter 10. Serving layer

Chapter 10. The serving layer solution to the normalization/denormalization problem

Chapter 10. Designing a serving layer for SuperWebAnalytics.com

Chapter 10. Contrasting with a fully incremental solution

Chapter 10. Comparing to the Lambda Architecture solution

Chapter 11. Serving layer: Illustration

Chapter 11. Building the serving layer for SuperWebAnalytics.com

PART 3 SPEED LAYER

Chapter 12. Realtime views

Chapter 12. Storing realtime views

Chapter 12. Challenges of incremental computation

Chapter 12. Asynchronous versus synchronous updates

Chapter 13. Realtime views: Illustration

Chapter 14. Queuing and stream processing

Chapter 14. Stream processing

Chapter 14. Higher-level, one-at-a-time stream processing

Chapter 14. Guaranteeing message processing

Chapter 14. SuperWebAnalytics.com speed layer

Chapter 14. Topology structure

Chapter 15. Queuing and stream processing: Illustration

Chapter 15. Implementing the SuperWebAnalytics.com uniques-over-time speed layer

Chapter 16. Micro-batch stream processing

Chapter 16. Micro-batch processing topologies

Chapter 16. Core concepts of micro-batch stream processing

Chapter 16. Extending pipe diagrams for micro-batch processing

Chapter 16. Bounce-rate analysis

Chapter 16. Another look at the bounce-rate-analysis example

Chapter 17. Micro-batch stream processing: Illustration

Chapter 17. Finishing the SuperWebAnalytics.com speed layer

Chapter 17. Fully fault-tolerant, in-memory, micro-batch processing

Chapter 18. Lambda Architecture in depth

Chapter 18. Batch and serving layers

Chapter 18. Incremental batch processing - part 1

Chapter 18. Incremental batch processing - part 2

Chapter 18. Measuring and optimizing batch layer resource usage

Chapter 18. Speed layer

Self paced
