Book tags: data mining, computer science, machine learning, Data, Coursera, CS, data analysis, software engineering
Published 2025-03-03
Mining of Massive Datasets pdf epub mobi txt e-book download 2025
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction.
Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. Problems he investigates are motivated by large scale data, the Web and on-line media. This research has won several awards including a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.
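It took me six months of on-and-off reading to finish. The ideas around hashing and approximation were a real eye-opener. My first pass was rather rushed; this book is worth rereading, with plenty of hands-on practice.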
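The book has a great many bugs, and there is nowhere to report them. It is extremely painful to read, and I forget each part as soon as I move on to the next; perhaps that is simply the nature of the algorithms in it. Bottom line: at least the latest research results of the past 15 years or so are strung together in a way that even undergraduates can follow.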
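The writing flows well. Many reviewers below complain about the translation, so I recommend the original English edition. Paired with the online course it is quite accessible, with plenty of examples, and it works for self-study too. The steps are written out in detail, so if you have the means you can code along directly. It is not obscure, and as a beginner I like it a lot.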
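I have seen people say this is an introductory Stanford course, so in principle it should not be too hard. As a beginner, though, I cannot say much good about the translation: after more than 50 pages I was completely lost, and many sentences are truly obscure. As an introductory text, the book covers essentially all the major categories of data mining; experts who want to study one area in depth need not bother with it.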
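I see many people saying this book is just an outline or a table of contents, with little substance and shallow coverage. That is exactly right. The book is the lecture notes for Stanford's CS246 course (MMDS), with accompanying slides and homework, so read it alongside the course; there is also a companion course on Coursera. For more detail, see http://www.mmds.org/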
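The speed at which large-scale data is exploding today is astonishing, and its applications are ever broader: from traditional retail to the complex world of business, it shows up everywhere. So what are the typical characteristics of big data? Many data types, huge data volume, low value density, and fast processing. This book focuses precisely on mining very large-scale data, and...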