CASE № 2

SYSSOFT

System Software LLC is one of the largest IT solutions providers in the Russian Federation. It has been operating in the services market since 2008. Its software catalog is the largest in both the Russian Federation and the CIS, with over 30,000 items. The company's activities include automation and informatization projects for private and public organizations, as well as cloud solutions and services aimed at solving specific customer problems. The company's website has heavy traffic, a huge catalog, and strict 24/7/365 availability requirements. This obliges us to adhere to the highest standards in keeping the company's website and CRM fast and resilient.

GOAL

To optimize the operation of the company's website and CRM, achieve fast page loading, and eliminate the risk of downtime while carrying out all tasks within a continuous integration process.

ISSUE

The company rented 12 fairly expensive servers from a hosting provider, which were initially intended to ensure high-quality operation. The servers worked properly from the very beginning until the company's capacity needs began to grow rapidly. Hosting quality fell dramatically as the number of website pages increased. A detailed analysis revealed a large number of bugs in the virtualization system, from incorrect accounting of RAM consumption in the guest OS to inode-counting problems that appeared with only 50% of disk space in use. The entire infrastructure tooling was limited to bash scripts on the machines themselves and barely informative, poorly tuned Cacti monitoring. Each issue resulted in 10 minutes to 2 hours of downtime.
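The inode problem mentioned above is easy to miss because disk space and inodes are separate limits: a filesystem can refuse new files while half of its blocks are still free. The following is a minimal illustrative sketch, not the tooling used on this project, of how both figures can be read on a Unix host with Python's standard `os.statvfs`. The function names `usage_percent` and `fs_usage` are hypothetical.

```python
import os


def usage_percent(total: int, free: int) -> float:
    """Percentage of a resource in use; 0.0 when the total is not reported."""
    if total <= 0:
        return 0.0
    return round(100.0 * (total - free) / total, 1)


def fs_usage(path: str = "/") -> dict:
    """Report block and inode usage for the filesystem containing `path` (Unix only)."""
    st = os.statvfs(path)
    return {
        # f_blocks / f_bfree: total and free data blocks
        "blocks_used_pct": usage_percent(st.f_blocks, st.f_bfree),
        # f_files / f_ffree: total and free inodes (some filesystems report 0)
        "inodes_used_pct": usage_percent(st.f_files, st.f_ffree),
    }


if __name__ == "__main__":
    print(fs_usage("/"))
```

A monitoring check built on this idea would alert on either percentage, so that inode exhaustion is caught even when free disk space looks healthy.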

WORK

A detailed audit of the highly loaded, unstable server structure, which had no fault tolerance systems.

Web app settings were optimized, and Apache was replaced with nginx.

Over 60 tasks from the specification were successfully completed for the project.

SOLUTION

First of all, it was decided to reduce downtime on the current hosting. To that end, the web apps, databases, website files, and the company's CRM were clustered, and the web app settings were optimized. We eliminated the unstable server structure and designed a fault tolerance solution. It is worth noting that we resolved the problem of individual CRM pages taking 5-10 minutes to load. Website performance had also been poor: test scripts recorded catalog page response times of over 10 seconds.

VICTORY

All websites were migrated to more modern, faster servers. A database cluster was created with optimized query distribution and a sequential availability check of each node in the cluster before a request is submitted. In addition, a custom Zabbix monitoring setup with SMS alerts was implemented specifically for the customer's needs. As a result, the websites' KPI indicator reached 100%.
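The sequential availability check described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production implementation: the node list, the host names, the port, and the names `tcp_probe` and `first_available` are all hypothetical, and a plain TCP connection attempt stands in for whatever health check the real cluster uses.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def tcp_probe(node: Node, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the node succeeds within `timeout`."""
    try:
        with socket.create_connection(node, timeout=timeout):
            return True
    except OSError:
        return False


def first_available(
    nodes: Iterable[Node],
    probe: Callable[[Node], bool] = tcp_probe,
) -> Optional[Node]:
    """Check nodes one by one, in order, and return the first that responds."""
    for node in nodes:
        if probe(node):
            return node
    return None  # no node is reachable; the caller decides how to fail


if __name__ == "__main__":
    # Hypothetical cluster members; replace with real hosts to try it out.
    cluster = [("db1.example.internal", 3306), ("db2.example.internal", 3306)]
    print(first_available(cluster))
```

Passing the probe as a parameter keeps the selection logic independent of the network, so the ordering behaviour can be exercised with a stub instead of live database nodes.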