Posts

Showing posts from June, 2017

Real-time Analytics: Implementing a lambda architecture on Hadoop

Implement a lambda architecture in fewer steps, using Spark, HBase, Solr, the Lily HBase Indexer, and Hive.

Welcome to a three-part tutorial on making your data available for consumption in near real time. The data domain has advanced to the point where decisions have to be made rapidly, so we are gradually moving away from batch-based data loads (ETL) and toward real-time analytics. With data at the center of strategy and decision making, getting data available sooner is pivotal for every organization. This three-part tech blog explains how to implement a lambda architecture (an architecture that supports batch and real-time analytics alike).

The overall architecture for such projects has to cater to three needs:

1. Quick data access for websites: random access to data by key, e.g. a particular profile ID, customer ID, or comment key.
2. Fast search over free text, including fuzzy search and search suggestions, e.g. customer name, product name, etc.
3. Analytical query support for BI tools like C...