Who Uses Hadoop?


An Introduction to Hadoop

Hello and welcome to An Introduction to Hadoop

Data Everywhere

“Every two days now we create as much information as we did from the dawn of civilization up until 2003”

Eric Schmidt, then CEO of Google, Aug 4, 2010

Read this quote. Schmidt put that amount of data at something like 5 exabytes.

The Hadoop Project

Originally based on papers published by Google in 2003 and 2004

Hadoop started in 2006 at Yahoo!

  • Top-level Apache Software Foundation project
  • Large, active user base, user groups
  • Very active development, strong development team

One way to do that analysis is through Hadoop

Who Uses Hadoop?

Rackspace for log processing. Netflix for recommendations. LinkedIn for its social graph. StumbleUpon (SU) for page recommendations.

Hadoop Components

  • Storage (HDFS): self-healing, high-bandwidth clustered storage
  • Processing (MapReduce): fault-tolerant distributed processing

HDFS Basics

HDFS is a filesystem written in Java

  • Sits on top of a native filesystem
  • Provides redundant storage for massive amounts of data
  • Runs on cheap(ish), unreliable commodity computers

Let’s talk about HDFS

HDFS Data

  • Data is split into blocks and stored on multiple nodes in the cluster
  • Each block is usually 64 MB or 128 MB (configurable)
  • Each block is replicated multiple times (configurable)
  • Replicas are stored on different data nodes
  • Works best with large files, 100 MB or more
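
The block splitting and replication described on this slide can be sketched in a few lines of Python. This is only an illustration, not how HDFS is implemented: the function names, the node names, the replication factor of 3, and the round-robin placement are all assumptions made for the example, and real HDFS uses its own placement policy.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the sizes (in MB) of the blocks a file would be split into.

    The last block holds only the leftover data, so it can be smaller.
    """
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is a simplification chosen for this sketch.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


blocks = split_into_blocks(300)  # a hypothetical 300 MB file
placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
for block_id, holders in placement.items():
    print(f"block {block_id} ({blocks[block_id]} MB) -> {holders}")
```

Because every block lives on several different nodes, losing one machine still leaves copies of each of its blocks elsewhere in the cluster.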

What is MapReduce?

MapReduce is a method for distributing a task across multiple nodes

Automatic parallelization and distribution

  • Each node processes the data stored on that node (the processing goes to the data, unlike databases, where the data is brought to the query engine)
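
To make the idea concrete, here is a minimal single-machine sketch of the MapReduce pattern, using word counting as the task. It is not Hadoop's API: the function names are made up for illustration, and each inner list in `partitions` merely stands in for the data held on one node.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def run_job(partitions):
    """Run map over every partition, then shuffle and reduce the results."""
    mapped = []
    for partition in partitions:  # each partition stands in for one node's data
        for line in partition:
            mapped.extend(map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = run_job([["hadoop stores data"], ["hadoop processes data"]])
print(counts)
```

In a real cluster the map calls for each partition would run in parallel on the node that already holds that data, and only the much smaller intermediate pairs would travel over the network.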
