Big Data Processing with Apache Spark

by Srini Penchikala, reviewed by Charles Humble on Feb 23, 2018

About the Author

Srini Penchikala currently works as a Software Architect at a financial services organization in Austin, Texas. He has over 20 years of experience in software architecture, design, and development. Srini is currently authoring a book on NoSQL database patterns. He is also the co-author of "Spring Roo in Action" from Manning Publications. He has presented at conferences such as JavaOne, SEI Architecture Technology Conference (SATURN), IT Architect Conference (ITARC), No Fluff Just Stuff, NoSQL Now, and Project World Conference. Srini has also published several articles on software architecture, security and risk management, and NoSQL databases on websites such as InfoQ, The ServerSide, O'Reilly Network (ONJava), DevX Java, java.net, and JavaWorld. He is a Lead Editor for the NoSQL Databases community at InfoQ.

Apache Spark is an open-source big-data processing framework built around speed, ease of use, and sophisticated analytics.

Spark has several advantages compared to other big-data and MapReduce technologies like Hadoop and Storm. It provides a comprehensive, unified framework with which to manage big-data processing requirements for datasets that are diverse in nature (text data, graph data, etc.) and that come from a variety of sources (batch versus real-time streaming data).

Spark enables applications in HDFS clusters to run up to a hundred times faster in memory, and ten times faster even when running on disk, than they would with Hadoop MapReduce.

In this mini-book, the reader will learn about the Apache Spark framework and will develop Spark programs for use cases in big-data analysis. The book covers all the libraries that are part of the Spark ecosystem: Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX.

Table of Contents:

  • Part 1: Overview        
  • Part 2: Spark SQL                
  • Part 3: Spark Streaming       
  • Part 4: Spark Machine Learning           
  • Part 5: spark.ml Data Pipelines        
  • Part 6: Graph Data Analytics with Spark GraphX     
  • Part 7: Emerging Trends in Data Science 
