Showing posts from 2019

Tale of stream and batch data ingestion

Data ingestion is an inevitable part of analytics, be it saving data entered by users on a website or moving a large volume of data from machines and sensors. This blog talks about the evolution of and trends in data ingestion, and the modern architecture principles for gaining insights from the ingested data.

With the preface set, let's have a look at how data has been ingested since the inception of modern data warehousing and analytics.

Good old file transfer: files of all kinds from transaction systems were transferred using some file transfer protocol
Database entry: enter the data into databases directly
Queue mechanism: use a more reliable queue mechanism to move the data between systems

All the above methods are efficient and solve the data ingestion problem to a great extent. But with the onset of new data sources, e.g. social media feeds, IoT systems, telemetry data, machine data, never-ending log files, etc., the conventional way of data ingestion became mo...