Big Data – Streaming
A big data project has many areas of the architecture to design. Do you need to account for a large amount of data streaming into your warehouse, or can you focus mostly on processing the incoming data and simply pick the right data store or warehouse? Here are the major elements we look at in an architecture, with a focus on streaming in this section.
Data now comes from more places than ever. With sensors generating readings, and computers and people generating even more information, selecting the right streaming tool can be critical. Below are some thoughts on the pros and cons of each. inSight360 has experience with many of the products below, with an emphasis on Amazon Kinesis Streaming. Let us help you make the decision.
Pros and Cons of Data Streaming Tools
Flume
- Pros: Reliable
- Cons: Does not manage multiple streams
Kafka
- Pros: Scalable and reliable; adopted in many cloud-based offerings
- Cons: Setup and support are time-consuming
Amazon Kinesis Streaming
- Pros: Setup and management tools provided by AWS
- Cons: Doesn’t scale quite as well as Kafka
Azure Event Hubs
- Pros: Setup and management tools
- Cons: Not as mature as others
Hortonworks DataFlow
- Pros: Powerful user interface and management capabilities
- Cons: New product, released by Hortonworks in Q4 2015