Spark GraphX in Action

Michael S. Malak

Feb 2017, Paperback
ISBN13: 9781617292521
ISBN10: 1617292524

While graphs are often the most natural way to represent the connections among data, the complexity of large graphs makes them conceptually difficult and computationally expensive to explore, query, and analyze. GraphX, a graph processing API for the Apache Spark analytics engine, makes it possible to explore and interpret large-scale graph data efficiently, at near-real-time speeds. GraphX builds on Spark's in-memory distributed framework to offer the speed and capacity needed for analyzing social media data, performing complex textual analysis, running machine learning algorithms, and much more.

Spark GraphX in Action starts with an overview of Apache Spark and the GraphX graph processing API. This example-based tutorial explains how to configure GraphX and use it interactively. It offers a clear introduction to graph elements, which are needed to build big data graphs. Then, it explores the problems and possibilities of graph algorithm implementations. Along the way, it details practical techniques for enhancing applications and applying machine learning algorithms to graph data.

KEY FEATURES

  • Example-based tutorial
  • Quickly gets readers started with GraphX
  • Allows readers to go beyond the standard API

AUDIENCE

Readers should be comfortable reading Scala code. Experience with graph data and Apache Spark is helpful, but not required.

ABOUT THE TECHNOLOGY

Graphs have been slowly building in the popular consciousness for the past two decades, from "the six degrees of Kevin Bacon" in the 1990s to post-9/11 calls to "connect the dots." Since 2013, graphs have exploded into the popular realm with Facebook's "graph search," and the three-vertex graph icon is now the universal symbol for "sharing" on social media. GraphX is the graph processing module of Apache Spark, the distributed in-memory computing framework.
