Enterprise-grade IT service level agreement for startups

Every business that outsources IT services to a third party understands the importance of signing a Service Level Agreement (SLA) with its IT service provider. IT Svit, one of the leaders of the IT outsourcing market in Ukraine and one of the top 10 Managed Service Providers worldwide, clearly understands the importance of a service level agreement for IT services. Over 10 years and more than 500 DevOps projects have given us significant experience in remote Amazon Web Services management and customer support on other cloud platforms.

AWS service level agreement for a fraction of the cost

As AWS is the most popular cloud platform worldwide, it is the platform we have the most experience with. IT Svit works with the full range of Amazon Web Services products and features, from IaaS deployment to PaaS support and SaaS management. We do it on behalf of the customer and guarantee stable uptime for any project or product deployed to AWS and managed by IT Svit under our SLA, and it costs much less than ordering these services directly from AWS.

No service credits, as there are no caps on our service levels

The key component of any SLA is the description of the metrics that ensure the efficiency and completeness of the services provided, including the guaranteed service levels, service caps, and applicable service credits. When working with IT Svit, you get no service caps, as we concentrate on proactively removing bottlenecks to minimize possible performance issues and on optimizing the infrastructure to reduce the number of incidents.

IT support service level agreement for our products

IT Svit provides ongoing support and development for all software products we help develop, if the customer needs it. We ensure that the product is released on time, runs cost-efficiently, scales well, and that all incidents are resolved in a timely manner. This way, we’ve got you covered from the beginning of your product development, through the growth phase of your startup, and up to long-term infrastructure support at scale.


An SLA, or Service Level Agreement, is the key component of any IT services contract. If you hire another company to perform some tasks, you want to make sure there is a precise description of the types of possible issues, a detailed list of incident response times, and an understandable, trackable issue resolution procedure in place.

IT Svit works under an SLA that differs a lot from common IT service level agreements. The main difference is that we have no service caps, meaning we do not limit the number of incidents resolved monthly, and for this reason we do not specify service credits either. Instead, we apply our ample DevOps expertise to audit the infrastructure and workflows in place, identify the bottlenecks that hinder growth and negatively impact the stability of operations, and proactively remove them. This way, we minimize the number of incidents, ensuring the stability of your operations and decreasing the workload for our DevOps engineers, who support multiple projects each.

The goal of our Service Level Agreement is to establish a reliable partnership between IT Svit and our customers and describe the provided services, supported infrastructure components, notification and alerting methods, as well as the KPIs for every project we perform.

Under the SLA, IT Svit has the following rights:

  • to require the customer to follow the instructions and procedures we enact;
  • to demand compensation for the corrections we had to make to fix issues resulting from third-party intervention in the operations;
  • to select the project team composition and plan the team’s work schedules ourselves.

In turn, IT Svit has the responsibility to:

  • guarantee timely processing and resolution of incidents and requests covered by this SLA
  • achieve the service KPIs depicted below and ensure stable operations of covered services
  • employ third parties if it is needed to provide the services covered by this SLA

The customers have the following right:

  • to monitor the process of service delivery in a way that does not impede the team’s productivity

In turn, the customers have the following responsibilities:

  • to provide all the documentation and login details required for service delivery
  • to issue tickets regarding any incidents they encounter or requests they want to make

System components covered by service level agreement from IT Svit

While every business DNA is different and almost every cloud infrastructure is unique, they are all built from similar building blocks: code repositories, file storage, databases, server instances, Docker containers, etc. Below is the list of common system components and tools covered by the IT Svit SLA, all of which can be used for both staging and production servers.

  • Cloud platform or bare-metal servers (AWS, GCP, DO, Azure, OpenStack, OpenShift, etc)
  • Container management services (EKS, GKE, ECS, etc)
  • Virtual machine instances (Amazon EC2, Google Compute Engine, etc)
  • File storage (Amazon S3, Google Storage Buckets, etc)
  • SSH key management tools
  • Virtual Private Clouds (Amazon VPC, Google VPC, etc)
  • VPN instances
  • NAT instances
  • API connectors
  • Web or mobile apps
  • Databases (Amazon RDS, PostgreSQL, MySQL, MongoDB, Redis, Cassandra, etc)
  • Docker Registry for images
  • Jenkins cron jobs for infrastructure management tasks
  • Monitoring tools (ELK stack, Prometheus & Grafana, Splunk, SumoLogic, etc)
  • Nginx ingress controllers
  • TLS/SSL certificate managers
  • HashiCorp Vault for secret storage
  • Code repositories and CI/CD tools (GitHub, GitLab, CircleCI, GitLab CI, etc)

These are the basic components present in nearly any infrastructure. The rest of the modules depend on the project, and we work with a wide variety of cloud-specific and open-source tools to build resilient, scalable and manageable infrastructures.

Workflows covered by IT Svit service level agreement

We ensure stable operations for multiple tasks and workflows, like:

  • vertical and horizontal scaling
  • VPN user management
  • various application operations
  • database management (backup & restore, sharding, replication, etc)
  • API operations (connect, deploy, check logs, etc)
  • web/mobile app operations (connect, deploy, check logs, etc)
  • various operations required based on your project specifics

That said, IT Svit also has ample experience with software development, Big Data analytics, blockchain development, Machine Learning model training, etc. In such cases, the IT service level agreement is adjusted to reflect the modules and workflows involved.

Services provided under our SLA

This is far from an exhaustive list of common incidents and our responses to them. Particular operations in this list will be replaced according to the needs of your project.

 

Environment | Object | State | Observation | SLA | Priority
----------- | ------ | ----- | ----------- | --- | --------
Production | Web Application | Login page is not accessible (unsuccessful probe, timeout, etc.) | Prometheus BlackBox exporter | > 1% during 1 minute | Critical
Staging | Web Application | Login page is not accessible (unsuccessful probe, timeout, etc.) | Prometheus BlackBox exporter | > 1% during 1 minute | High
Production | Web Application | Certificate expiring | Prometheus BlackBox exporter | < 3 days | Critical
Staging | Web Application | Certificate expiring | Prometheus BlackBox exporter | < 3 days | High
Production | Web Application | 5xx response code rate | Monitor Nginx ingress controller logs for response codes | > 1% during 1 minute | Critical
Staging | Web Application | 5xx response code rate | Monitor Nginx ingress controller logs for response codes | > 1% during 1 minute | High
Production | Web Application | 2xx response time | Monitor Nginx ingress controller logs for response codes | > 1000 ms during 1 minute | Critical
Staging | Web Application | 2xx response time | Monitor Nginx ingress controller logs for response codes | > 1000 ms during 1 minute | High
Production | Web Application | CPU Utilization | Prometheus | > 90% during 5 minutes | High
Staging | Web Application | CPU Utilization | Prometheus | > 90% during 5 minutes | Medium
Production | Web Application | Memory Utilization | Prometheus | > 80% during 5 minutes | High
Staging | Web Application | Memory Utilization | Prometheus | > 80% during 5 minutes | Medium
Production | Web Application | Disk IO | Prometheus | > 100 ms during 5 minutes | High
Staging | Web Application | Disk IO | Prometheus | > 100 ms during 5 minutes | Medium
Production | API Application | API is not accessible (unsuccessful probe, timeout, etc.) | Prometheus BlackBox exporter | > 1% during 1 minute | Critical
Staging | API Application | API is not accessible (unsuccessful probe, timeout, etc.) | Prometheus BlackBox exporter | > 1% during 1 minute | High
Production | API Application | Certificate expiring | Prometheus BlackBox exporter | < 3 days | Critical
Staging | API Application | Certificate expiring | Prometheus BlackBox exporter | < 3 days | High
Production | API Application | 5xx response code rate | Monitor Nginx ingress controller logs for response codes | > 1% during 1 minute | Critical
Staging | API Application | 5xx response code rate | Monitor Nginx ingress controller logs for response codes | > 1% during 1 minute | High
Production | API Application | 2xx response time | Monitor Nginx ingress controller logs for response codes | > 1000 ms during 1 minute | Critical
Staging | API Application | 2xx response time | Monitor Nginx ingress controller logs for response codes | > 1000 ms during 1 minute | High
Production | API Application | CPU Utilization | Prometheus | > 90% during 5 minutes | High
Staging | API Application | CPU Utilization | Prometheus | > 90% during 5 minutes | Medium
Production | API Application | Memory Utilization | Prometheus | > 80% during 5 minutes | High
Staging | API Application | Memory Utilization | Prometheus | > 80% during 5 minutes | Medium
Production | API Application | Disk IO | Prometheus | > 100 ms during 5 minutes | High
Staging | API Application | Disk IO | Prometheus | > 100 ms during 5 minutes | Medium
Production | Database | 5432 port is not accessible | | > 1% during 1 minute | Critical
Staging | Database | 5432 port is not accessible | | > 1% during 1 minute | High
Production | Database | CPU Utilization | Prometheus | > 90% during 5 minutes | High
Staging | Database | CPU Utilization | Prometheus | > 90% during 5 minutes | Medium
Production | Database | Memory Utilization | Prometheus | > 80% during 5 minutes | High
Staging | Database | Memory Utilization | Prometheus | > 80% during 5 minutes | Medium
Production | Database | Disk IO | Prometheus | > 100 ms during 5 minutes | Critical
Staging | Database | Disk IO | Prometheus | > 100 ms during 5 minutes | High
Production | Redis | 6379 port is not accessible | | > 1% during 1 minute | Critical
Staging | Redis | 6379 port is not accessible | | > 1% during 1 minute | High
Production | Redis | CPU Utilization | Prometheus | > 90% during 5 minutes | High
Staging | Redis | CPU Utilization | Prometheus | > 90% during 5 minutes | Medium
Production | Redis | Memory Utilization | Prometheus | > 80% during 5 minutes | High
Staging | Redis | Memory Utilization | Prometheus | > 80% during 5 minutes | Medium
Production | Redis | Disk IO | Prometheus | > 100 ms during 5 minutes | High
Staging | Redis | Disk IO | Prometheus | > 100 ms during 5 minutes | Medium
Production | Vault | Cluster degraded | Logs | > 1 min | Critical
Staging | Vault | Cluster degraded | Logs | > 1 min | High
Production | Vault | 8200 port is not accessible | | > 1 min | Critical
Staging | Vault | 8200 port is not accessible | | > 1 min | High
Production | VPN | Developers don’t have access to the VPN instance | Google Virtual Private Cloud monitoring | > 40% during 1 day | High
Staging | VPN | Developers don’t have access to the VPN instance | Google Virtual Private Cloud monitoring | > 40% during 1 day | Medium
Request for the maintenance | Infrastructure components | There is a need to use existing infrastructure functionality to make changes without changing current behavior. | | | Medium
Change request | Infrastructure components | There is a need for infrastructure changes/functionality with changing current behavior. | | | Low
Request for providing information | Infrastructure and application components | There is a need in additional information about the system or its behavior. | | | Low
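
To illustrate how the rows above can be read, here is a minimal Python sketch of three of the Web Application rules. It is not IT Svit's monitoring code. The names `AlertRule`, `RULES`, `priority_for` and `evaluate` are hypothetical, and the sketch assumes that "> X during N minutes" means every sample in the window stays above the threshold, which the table does not spell out. It also relies on the pattern visible in the table that each Staging row is one priority level below its Production counterpart.

```python
from dataclasses import dataclass

# Priority levels as they appear in the table, highest first.
PRIORITIES = ["Critical", "High", "Medium", "Low"]


@dataclass(frozen=True)
class AlertRule:
    """One monitored condition. Field names are illustrative only."""
    state: str                # e.g. "5xx response code rate"
    threshold: float          # value that must be exceeded
    window_seconds: int       # how long the condition must hold (assumed meaning)
    production_priority: str  # priority listed for the Production row


# Three rules copied from the Web Application rows of the table.
RULES = {
    "5xx_rate_percent": AlertRule("5xx response code rate", 1.0, 60, "Critical"),
    "response_time_ms": AlertRule("2xx response time", 1000.0, 60, "Critical"),
    "cpu_percent": AlertRule("CPU Utilization", 90.0, 300, "High"),
}


def priority_for(rule, environment):
    """Staging rows in the table sit one level below Production."""
    index = PRIORITIES.index(rule.production_priority)
    if environment == "Staging":
        index = min(index + 1, len(PRIORITIES) - 1)
    return PRIORITIES[index]


def evaluate(metric, samples, sample_interval_seconds, environment):
    """Return a priority if the whole window breaches the threshold, else None.

    `samples` is ordered oldest to newest; only the most recent window is checked.
    """
    rule = RULES[metric]
    needed = max(1, rule.window_seconds // sample_interval_seconds)
    if len(samples) < needed:
        return None  # not enough data to cover the window yet
    window = samples[-needed:]
    if all(value > rule.threshold for value in window):
        return priority_for(rule, environment)
    return None


# 5xx rate above 1% for four 15-second samples (1 minute) in Production.
print(evaluate("5xx_rate_percent", [2.5, 3.0, 1.8, 2.2], 15, "Production"))  # Critical
# CPU above 90% for twenty 15-second samples (5 minutes) in Staging.
print(evaluate("cpu_percent", [95.0] * 20, 15, "Staging"))  # Medium
```

In a real setup these thresholds would typically live in the monitoring tool's own alerting rules rather than in application code; the sketch only shows how environment, threshold, window and priority relate to each other.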

IT Svit service level agreement — guaranteed performance for your business!

IT Svit provides reliable support for all kinds of IT operations, from software development and database administration to Big Data analytics, blockchain development, and Artificial Intelligence algorithms. We back up our promises with an in-depth service level agreement, and we do not aim to do the least amount of work needed to get paid. Instead, we aim to go the extra mile to remove bottlenecks and minimize the number of incidents, which benefits all parties involved. Sounds too good to be true? Get in touch and see for yourself!



