Project description
Many IT Svit customers needed to find specific information on their corporate websites quickly. Platform-specific search engines fell short, so we decided to create a bespoke web scraper tool that can be added to a particular website and quickly build a custom search index for it.
Project requirements
IT Svit needed to overcome the following challenges:
- Web crawlers must be lightweight and simple, yet efficient
- The search index must be built and processed quickly
- The tools must have a convenient user interface
- The tools must have low hardware requirements
Project results
IT Svit developed the required web scrapers and other Big Data solutions so that our customer could assemble the data set for training their search engine. Toweya provided the basic specifications, and we helped them create an easy-to-use, performant search engine platform that enables incremental web search and provides precise results.
CEO of IT Svit
We have performed the transition to the cloud for our customers more than once, and in doing so we had to tackle huge corporate websites with tons of in-house information. These websites did not provide a convenient enough search engine, so to improve their usability we developed OwnSearch, an internal tool that lets a business build its own website search index quickly and use its knowledge base and content to the full extent.
Founder at Everdapt Ltd
Their cloud infrastructure management and deployment was flawless and no problems have been encountered. IT Svit communicated well and went out of their way to be responsive. They not only met the deadlines with their cloud integration services but were often ahead of schedule.
The main challenge was that built-in search tools were either absent or too rigid. We decided to build the web scraper solution from scratch and make sure it could easily interact with any type of CMS or website builder platform.
We wanted this tool to have the following characteristics:
- High performance
- Low system resource consumption
- Ease of configuration
- Simplicity of usage
Implementation and challenges resolved
The scraper was built in Python using the asyncio and aiohttp libraries, and it meets all the aforementioned requirements:
- The scraper comes with a built-in web server, so it is simple to launch
- The tool can be easily integrated into any website
- The search index results can be viewed through any browser
- The scraper has low hardware requirements
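To illustrate how a crawler built on asyncio and aiohttp can stay small, here is a minimal sketch. It is not IT Svit's actual code: the names extract_links, crawl, and crawl_site, the same-host rule, and the max_pages and concurrency defaults are all assumptions made for this example. The crawl loop accepts any fetch coroutine, so the aiohttp part is isolated in crawl_site.

```python
import asyncio
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(base_url, html):
    """Return absolute, fragment-free links that stay on base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).netloc
    found = set()
    for href in parser.links:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).netloc == host:
            found.add(absolute)
    return found


async def crawl(start_url, fetch, max_pages=100, concurrency=10):
    """Breadth-first crawl; `fetch` is any coroutine mapping a URL to HTML text.

    max_pages and concurrency are illustrative defaults, not values from the project.
    """
    seen = {start_url}
    pages = {}
    frontier = [start_url]
    limit = asyncio.Semaphore(concurrency)  # caps simultaneous requests

    async def fetch_one(url):
        async with limit:
            return url, await fetch(url)

    while frontier and len(pages) < max_pages:
        # Take only as many URLs as the page budget still allows.
        batch, frontier = frontier[: max_pages - len(pages)], []
        results = await asyncio.gather(*(fetch_one(u) for u in batch))
        for url, html in results:
            pages[url] = html
            for link in sorted(extract_links(url, html) - seen):
                seen.add(link)
                frontier.append(link)
    return pages


async def crawl_site(start_url):
    """Run the crawl over HTTP with aiohttp (third-party package)."""
    import aiohttp  # imported here so the helpers above need only the stdlib

    async with aiohttp.ClientSession() as session:

        async def fetch(url):
            async with session.get(url) as response:
                content_type = response.headers.get("Content-Type", "")
                if response.status != 200 or "html" not in content_type:
                    return ""  # skip errors and non-HTML resources
                return await response.text()

        return await crawl(start_url, fetch)
```

With aiohttp installed, `asyncio.run(crawl_site("https://example.com/"))` would return a dictionary mapping each visited URL to its HTML. A production crawler would also need error handling, timeouts, and robots.txt support, which this sketch leaves out.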
Because it is written in Python on top of asynchronous I/O (asyncio and aiohttp), the tool can fetch many pages concurrently and works quickly.
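The search index and the built-in web server mentioned above can be sketched in the same spirit. This is an illustration under stated assumptions rather than the project's real design: build_index, search, make_app, the /search route, and the q query parameter are hypothetical names, and the index is a plain in-memory inverted index with crude regex-based tag stripping.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, enough for a sketch
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


def make_app(index):
    """Wrap the index in an aiohttp web app exposing GET /search?q=... (assumed route)."""
    from aiohttp import web  # third-party; imported here so the helpers above need only the stdlib

    async def handle_search(request):
        query = request.query.get("q", "")
        return web.json_response({"query": query, "results": search(index, query)})

    app = web.Application()
    app.add_routes([web.get("/search", handle_search)])
    return app
```

Assuming aiohttp is installed, calling `web.run_app(make_app(index), port=8080)` after `from aiohttp import web` would start the server, and opening `http://localhost:8080/search?q=cloud` in any browser would show the matching URLs as JSON. Port 8080 is an arbitrary choice for the example.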