The types of Big Data tools IT Svit uses
Big Data is still largely an umbrella buzzword for many productive, easily scalable and cost-efficient tools and solutions, and its true meaning depends on who uses the term and for what reason.
That said, we decided to describe the types of Big Data tools IT Svit uses.
As we have already mentioned in one of our previous articles, Big Data is described by three V’s: volume, variety and velocity.
The projects we were involved in mostly dealt with huge amounts of textual information on the Internet and its optimization, so we had to solve a task for each of the three V’s. First, we had to process, store and manage big volumes of incoming data, which led us to learn to work with reliable databases. Second, we had to gather and process a big variety of data, which led to using decentralized web crawling and various other techniques of content analysis. Third, we had to provide high velocity of data processing, which resulted in using asynchronous Python multiprocessing and queueing with RabbitMQ or SQS to build easily scalable, high-performance networks.
Big Data databases: Redis, Cassandra, MongoDB
Redis worked well for us as an in-memory key-value database that backed a decentralized queue for analyzing textual content. We needed this queue so that our project could gather the data for future processing.
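The article does not say how this queue was built, so the following is only a minimal sketch of one common pattern: a Redis list used as a FIFO queue, with LPUSH on the producer side and BRPOP on the consumer side. The key name `crawl:pages`, the JSON payload layout and both function names are assumptions made for illustration. The `client` argument is expected to be a redis-py client such as `redis.Redis(host="localhost")`.

```python
import json
from typing import Any, Optional

# Hypothetical key name for the list that holds crawled pages.
QUEUE_KEY = "crawl:pages"


def enqueue_page(client: Any, url: str, text: str) -> None:
    """Producer side: push one crawled page onto the head of the Redis list."""
    client.lpush(QUEUE_KEY, json.dumps({"url": url, "text": text}))


def dequeue_page(client: Any, timeout: int = 5) -> Optional[dict]:
    """Consumer side: wait up to `timeout` seconds for the oldest page.

    BRPOP pops from the tail of the list, so together with LPUSH the
    pages come out in the order they were enqueued.
    """
    item = client.brpop(QUEUE_KEY, timeout=timeout)
    if item is None:  # nothing arrived before the timeout expired
        return None
    _key, payload = item  # BRPOP returns a (key, value) pair
    return json.loads(payload)
```

Any number of crawler processes can call `enqueue_page` while any number of analyzers call `dequeue_page` against the same Redis instance, which is what keeps a queue of this kind decentralized.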
Cassandra, a well-known and proven choice for storing and managing assorted data (such as historical data within some range, say telemetry), met our expectations and provided great fault tolerance, easily handling the exceptional I/O workloads produced by our project algorithm. This was possible thanks to its built-in sharding capabilities, and Cassandra delivered excellent results in storing and processing the flow of data, scaling easily as the need arose.
MongoDB, a document-oriented database, served our need to store various data for smaller projects; this data was later easily processed and exchanged with the rest of the databases. Its scalability and flexibility helped us deal with querying and indexing of the data sets.
One of the issues we face while processing the datasets is an ever-growing number of concurrent events our applications have to handle. As the language chosen is Python, both multithreading and multiprocessing are required to handle this situation. That is why we went for an asynchronous I/O architecture for our applications to ensure the stability and continuity of our microservices. The other part of the solution was configuring message brokers.
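As a rough illustration of combining asynchronous I/O with multiprocessing in Python, the sketch below keeps the event loop free while CPU-bound work runs in an executor. It uses only the standard library. The `count_words` function is a hypothetical stand-in for real content analysis, and `analyze_all` is an illustrative name; neither comes from our project code.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import List, Optional


def count_words(text: str) -> int:
    """Hypothetical CPU-bound step standing in for real content analysis."""
    return len(text.split())


async def analyze_all(texts: List[str],
                      executor: Optional[Executor] = None) -> List[int]:
    """Run count_words for every text without blocking the event loop.

    Passing a ProcessPoolExecutor spreads the work across CPU cores;
    passing None falls back to the loop's default thread pool.
    """
    loop = asyncio.get_running_loop()
    jobs = [loop.run_in_executor(executor, count_words, t) for t in texts]
    # gather() returns the results in the same order as the inputs.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    sample = ["big data tools", "redis cassandra mongodb rabbitmq"]
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(asyncio.run(analyze_all(sample, pool)))
```

While the worker processes are busy, the event loop stays available for network I/O such as reading from a message queue, so one service can handle many concurrent events.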
RabbitMQ and SQS queue brokers
We are skilled in using both RabbitMQ and SQS message brokers, as many of our operations happen within AWS. However, we prefer RabbitMQ when possible, as it has more features and allows self-addition of new web services to the list. SQS, on the other hand, scales horizontally with ease and works perfectly within AWS infrastructures.
In short, we are able to design, deploy, configure and maintain any infrastructure using trusted and reliable Big Data tools. We have solid experience working on both short- and long-term projects of any complexity and stand ready to deliver top-notch services to our customers.