Posts

Showing posts from July, 2016

Spark Getting Started - Develop Using Eclipse Locally

This article will help you jump-start Spark development on your Windows PC or laptop without a fully functional Hadoop cluster installed. I use a Windows 10 machine with 8 GB of RAM and 128 GB of storage. These days I try to isolate development in separate environments using Docker containers or Bluemix containers. Still, I sometimes fall back to developing on my local machine before deploying the code to a cluster. This post covers setting up Spark, with Eclipse as the IDE, for local development with bare-minimum prerequisites. At the time of writing, Spark 1.5.1 is available, and that is the version I am using. Follow the instructions below to set up Spark on your machine.

Hadoop installation on Windows

1. Assuming your OS is Windows, download and install Hadoop on Windows. This may not be a fully functional Hadoop cluster, but we only care about some libraries that Spark will need later. Download Hadoop-2.6.0.tar.gz.
2. You don't need to install, all you...

IBM BigInsights - BigSheets (Excel-like interface to HDFS files and tables)

BigSheets is a browser-based tool, included in the BigInsights data scientist or data analyst package, for analyzing and visualizing big data. BigSheets uses a spreadsheet-like interface that can model, filter, combine, and chart data collected from multiple sources, such as applications running in a big data environment. Because BigSheets is a service installed on the big data cluster, just like other services (Hive, HBase, etc.), the user does not need to worry about connectivity.

In this demo we will see how to:

- Create a master workbook from an existing file in HDFS
- Tailor data by creating a child workbook
- Create columns after grouping data
- Create quick charts
- Export data to other formats

Accessing BigSheets

BigSheets is available in the application tab of IBM® InfoSphere® BigInsights™ Enterprise Edition. Click the BigSheets tab and launch the application. BigSheets works on Ha...