Book tags: Hadoop, Big Data, BigData, Computer Science, Distributed Systems, hadoop, Machine Learning, O'Reilly
Published on 2024-05-03
Hadoop: The Definitive Guide pdf epub mobi txt e-book download 2024
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service
Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net and IBM's developerWorks, and has spoken at several conferences, including at ApacheCon 2008 on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.
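2016 No. 4. Accessible yet thorough, with the principles explained very clearly. The core is the Hadoop Fundamentals and MapReduce parts, but the later Related Projects part is also concise and keeps the focus on what matters. For example, the Flume chapter mentions some key points that even the tutorial on the official Flume site does not cover.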
Great.
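Finished reading it; this was my first exposure to big data. The book's coverage is quite comprehensive: the first part explains the principles, the middle introduces Hadoop-based projects in detail, and the end gives concrete application examples. There are many places I do not yet fully understand, so I will need to read further.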
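First, the translation is very poor: many sentences are translated carelessly and do not read smoothly, so you often have to stop, parse the sentence, and work out the meaning slowly. Second, the book was translated by many people, a lot of whom do not even understand code. One code snippet left me baffled, so I checked the original source code, and sure enough, there were five errors in four lines. The haphazard code indentation also shows that this was translated by people who have never written code, and...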
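I logged in specifically to leave this comment. The translation is just too awful. I really, strongly recommend that anyone with good English reading skills read the original edition rather than waste money on this one. Besides the errors in the text, even the figures contain mistakes: for example, in the figure on page 260 the last two years should be 1901, but here they are 1900. I am truly speechless. A great book translated like this would infuriate the author. zsbd zsbd zsbd...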
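The official site of Cobub Razor, an app data analytics tool, has an article on choosing and using Hadoop YARN schedulers. I think it is well written and recommend it: http://www.cobub.com/the-selection-and-use-of-hadoop-yarn-scheduler/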