How long a piece of software lives depends directly on how much hygiene is maintained while working on it. At an early stage of an application, we also know that delivering features quickly is just as important as maintaining quality. In this situation, quality can quietly deteriorate without anyone noticing.

But before quality degrades too badly, it is important to stop adding features, look back, and clean up the mess. At Godial, that's what we try to do.

This month our OKR is to improve the application's performance and quality, so that we can maintain it easily and scale it smoothly in terms of both user growth and feature development.

In the process of improving performance, the first step we took was to migrate the application to AWS with a better architecture design.

The next step is to look into the application and identify the pain points where we can optimise performance. Yesterday we found an issue with a Mongo query and improved it. In this blog post we want to share the story of how we improved the query's performance.

In the last 3 months, the number of users has increased massively, which surfaced problems in our application. We recently found that some of the APIs were taking much more time to execute and were also driving up resource utilisation.

One of the APIs was taking far too long to respond because of heavy queries against the database.

API execution time

This API was taking a huge amount of time, and during peak hours it caused a high load average on the database, which in turn made every query even slower.

In general it took 3-4 seconds to complete; however, under heavy load, these APIs were killers.

We identified and fixed the underlying issues, after which the response time dropped to as low as 10-20 ms, and to roughly 300 ms when testing under stress conditions.

API execution time after optimising query

To understand how we optimised the API, let me briefly explain what it does.

The API was responsible for identifying the last calls made by agents working in teams, summing up the minutes each agent talked, and a bunch of other analytical work.

We use MongoDB to store our data, and our query looked something like this:

db.collection.aggregate([
    {
        $sort: {
            created_on: 1
        }
    },
    {
        $match: {
            some_id: {
                $in: [ObjectId(""), ...]
            }
        }
    },
    {
        $group: {
            _id: "$field",
            some_field: {
                $sum: "$field"
            },
            some_last_field: {
                $last: "$last_field"
            }
            // ... some other stuff
        }
    }
])

What optimisations did we apply?

One issue we realised was with the sort: the index on created_on was missing, so the query was spending a massive amount of time in the COLLSCAN stage. We used

`explain("executionStats")`

on the query to inspect execution time, and found that the COLLSCAN and FETCH stages were consuming most of it. The first fix we applied was to create the index and change the sort to -1, which means descending order, matching how we actually read the data.
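As an illustration, here is a minimal sketch of how the plan tree in `explain("executionStats")` output can be inspected. The sample documents below are abridged and hand-written for the example; real explain output contains many more fields, and the collection name `calls` is an assumption.

```javascript
// Walk the winning plan tree of an explain() document, looking for a
// COLLSCAN stage (i.e. a full collection scan).
function usesCollScan(explainDoc) {
  const walk = (stage) =>
    stage.stage === "COLLSCAN" ||
    (stage.inputStage ? walk(stage.inputStage) : false);
  return walk(explainDoc.queryPlanner.winningPlan);
}

// Abridged, hypothetical plan before the index: an in-memory sort fed by a
// full collection scan.
const before = {
  queryPlanner: { winningPlan: { stage: "SORT", inputStage: { stage: "COLLSCAN" } } },
};

// After creating { created_on: -1 }: the sort is satisfied by an index scan.
// In mongosh this is what you would look for after running:
//   db.calls.createIndex({ created_on: -1 })
//   db.calls.find().sort({ created_on: -1 }).explain("executionStats")
const after = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
};

console.log(usesCollScan(before), usesCollScan(after)); // true false
```

Seeing IXSCAN replace COLLSCAN in the winning plan is the quickest confirmation that the new index is actually being used.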

The second optimisation was more of a product decision: since the data is growing massively, it is unwise not to have a lower bound on the query. So, in the $match stage, we added a condition to fetch only the records created in the last x days.
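A sketch of the revised pipeline with both fixes applied. The field names follow the placeholders above, and the 30-day window stands in for the "last x days" product decision; your actual window will differ.

```javascript
// Cutoff date for the lower bound (30 days is an assumed window).
const DAYS = 30;
const cutoff = new Date(Date.now() - DAYS * 24 * 60 * 60 * 1000);

const pipeline = [
  // Match first, and bound it by date so the query never has to walk
  // the whole collection.
  { $match: { some_id: { $in: [/* agent ids */] }, created_on: { $gte: cutoff } } },
  // Descending sort, backed by the { created_on: -1 } index.
  { $sort: { created_on: -1 } },
  // Note: with a descending sort, $first (not $last) now yields the
  // most recent value per group.
  { $group: {
      _id: "$field",
      some_field: { $sum: "$field" },
      some_last_field: { $first: "$last_field" },
  } },
];

// In mongosh: db.calls.aggregate(pipeline)
```

One subtlety worth flagging: flipping the sort direction swaps the meaning of $first and $last inside $group, so the accumulator has to change along with the index.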

Both these changes helped improve performance massively, and it was a good learning experience for the team.

Thanks