Building a SAST program at Razorpay’s scale

Published in Razorpay Engineering · 9 min read · Jul 1, 2022

This blog is authored by Sandesh Mysore Anand and Libin. Both of them are part of the Security team at Razorpay.

tl;dr

No single tool or technique can identify all security defects in an application. Part of building a mature Security program is to use a number of techniques to find security defects. Static Application Security Testing (SAST) analyzes an application’s source code to uncover security defects. This allows SAST tools to have excellent coverage (you can scan every line of code) and run early in the software development lifecycle (you don’t need a deployed application). While SAST has been around for decades, SAST tools are often hard to use. Setup is complex, scans take too long, results have too many false positives and writing custom rules is hard.

In a company like Razorpay, where we push code to production dozens of times every day, these downsides are unacceptable. We’ve spent the last 15 months overcoming some of these challenges to make SAST a key part of our product security strategy. This blog talks about that journey and what plans we have going forward.

A maze with 3 entry doors and a computer screen at the exit, denoting the “maze” of complexities of SAST

How SAST tools work

Before we talk about our program, here is a (very) high-level overview of how SAST tools work.

While every tool works slightly differently, in essence, they follow the same 2 key steps:

Model code: In this step, the SAST tool consumes source code and converts it into a format that is suitable for analysis. Some tools compile the code, others build a model from an abstract syntax tree (AST), and some convert the code to a custom format of their own. This step ensures that code from all programming languages is converted to a similar format, which then becomes the input to step 2.
Find defects: In this step, the SAST tool applies various rules (or “test cases”) to the modelled code. These rules can be defined by the tool vendor or custom-written by the tool user (side note: Razorpay relies heavily on custom rules). Every rule match (that is, every test case that fires) produces a defect, so the output of this step is a list of defects.
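To make the two steps concrete, here is a minimal sketch in Python. It is not how Semgrep or any particular SAST tool is implemented: it uses Python's standard `ast` module as the "model", and the `Defect` class, the `no-eval` rule and the sample snippet are hypothetical names made up for illustration.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Defect:
    rule_id: str
    line: int
    message: str


def model_code(source: str) -> ast.AST:
    """Step 1: convert raw source text into a model (here, Python's AST)."""
    return ast.parse(source)


def rule_no_eval(tree: ast.AST) -> List[Defect]:
    """A hypothetical rule: flag every direct call to eval()."""
    defects = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ):
            defects.append(
                Defect("no-eval", node.lineno, "avoid eval() on untrusted input")
            )
    return defects


def find_defects(
    tree: ast.AST, rules: List[Callable[[ast.AST], List[Defect]]]
) -> List[Defect]:
    """Step 2: apply every rule to the model and collect the matches."""
    defects: List[Defect] = []
    for rule in rules:
        defects.extend(rule(tree))
    return sorted(defects, key=lambda d: d.line)


# Illustrative input: the second line calls eval() on user input.
SAMPLE = "x = input()\nprint(eval(x))\ny = len(x)\n"
defects = find_defects(model_code(SAMPLE), [rule_no_eval])
```

Real tools differ mainly in how rich the model is (for example, whether it tracks data flow) and in how rules are expressed, but the overall shape of "model, then match" stays the same.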

Once the results are obtained, there are other steps (such as presenting the results in a readable format). We will skip those details to keep this blog at a reasonable length. You can read more about how SAST tools work here.

Choosing the right tool

Given that SAST is such a tool-heavy program, choosing the right tool is important. We started off with some important assumptions:

No single tool can find all security defects in an application. We prefer incremental improvements and reduced noise over a promise of finding “all” defects with a lot of noise.
Like most engineering teams, we also use internal libraries, coding guidelines, and unique one-off solutions. No SAST tool can provide rules specific to our practices out-of-the-box.
To ensure specific rules are applied, we will need to invest in fine-tuning and writing rules.
Traditional SAST tools aggregate the entire codebase and piece it together to analyze it, making static analysis a time-consuming task (hours or even days to complete a single scan). The solution we choose should dramatically reduce scan time and be able to scan incremental changes (i.e. PRs).

Security pros may have noticed that we ignored two other common requirements: coverage (e.g. OWASP benchmarking) and low false positive rates on the default ruleset. This is because we realised that a tool with a phenomenal custom rule engine would make it easy for us to build rules that meet our coverage requirements and to turn off rules that produce a high rate of false positives.

Outlining 4 key things Razorpay looks for in a SAST tool: quick scans, language support, integration and customization.

While we would love to say that we performed a rigorous evaluation process with all the top SAST tools and narrowed it down to the best option, the reality was a little different. While investigating a potential security defect, we wanted a tool to quickly scan a large repo (think millions of lines of code). Most tools we knew of were either commercial (there was no time to go through procurement) or did not work well on large code bases.

We then came across r2c’s Semgrep. It took us less than an hour to install the CLI version of the tool, run a scan and get desired results. This blew our minds and we started digging deeper into the tool. After a few weeks of evaluation and a couple of conversations with the r2c team, we decided to give Semgrep a go (note: Semgrep has an open source CLI version and a commercial app, which also provides you with a web interface to apply rules and review results).

In hindsight, here’s what clinched the deal for Semgrep:

A breeze to integrate into our CI/CD pipeline and scans run really fast
Best in the industry custom rule engine. We especially loved the fact that you could write custom rules in the same language as the code being scanned (you can take them for a spin here)
Support for Go. All of Razorpay’s coding is moving towards Go and Semgrep’s support here was a huge plus. Note: A lot of Razorpay legacy code is in PHP. Semgrep’s support here isn’t great, but we were able to augment that by writing custom rules of our own.
Basic integration with Github Actions and Slack (full-fledged API support came later)

As you can see, our non-negotiable criteria and Semgrep’s capabilities match closely. This made our decision to go with Semgrep fairly straightforward.

Using Semgrep effectively

Semgrep’s integration in Razorpay’s continuous integration workflow

Razorpay has hundreds of microservices, many of them deploying daily. We rely on GitHub for both source code storage and CI builds (GitHub Actions). While Semgrep’s out-of-the-box support for GitHub Actions helped, choosing what rules to run for each project by understanding its technology (language and frameworks) became a challenge.

After some trial and error, we figured out that to run a successful SAST program there are 3 aspects that we need to get right.

Run scans early

We placed Semgrep scans at each pull request to scan the changes, and ran periodic full scans on the whole code base to assess its overall state. Semgrep allowed us to scan only the portion of the code that was altered instead of the whole code base. This saved precious scanning time and kept the focus on the issues introduced by a particular change. Scans are triggered by Semgrep GitHub agents via CI, and findings are reported as PR comments, which lets developers read about security vulnerabilities without leaving their environment.
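The idea behind scanning only what a PR changed can be sketched as follows. Semgrep's CI integration handles this for us, so this is only an illustration of the concept and not our actual pipeline: the suffix list, the `rules/` path and the helper names are assumptions, and the sketch relies only on the standard `git diff --name-only` command and the basic `semgrep --config ... --json` invocation.

```python
import subprocess
from typing import List, Tuple

# Assumption: the file types we care about for this illustration.
SCANNABLE_SUFFIXES: Tuple[str, ...] = (".go", ".php", ".py")


def changed_files(base_ref: str) -> List[str]:
    """List files added, copied, modified or renamed between base_ref and HEAD.

    Needs to run inside a git checkout that has base_ref available.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base_ref}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def select_targets(
    paths: List[str], suffixes: Tuple[str, ...] = SCANNABLE_SUFFIXES
) -> List[str]:
    """Keep only the changed files that are worth scanning."""
    return [p for p in paths if p.endswith(suffixes)]


def build_scan_command(rules_path: str, targets: List[str]) -> List[str]:
    """Build a Semgrep CLI invocation limited to the given files."""
    return ["semgrep", "--config", rules_path, "--json", *targets]


# Example with a hard-coded file list, so it runs without a git checkout.
targets = select_targets(["api/handler.go", "README.md", "legacy/pay.php"])
command = build_scan_command("rules/", targets)
```

In a CI job, the list passed to `select_targets` would come from `changed_files("origin/master")` (or whatever the PR's base branch is), and the resulting command would be executed with `subprocess.run`.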

Customization and reducing complexity

As the number of projects grew, so did the complexity of maintaining this program and keeping it efficient. With many rules spanning different projects and multiple technologies, we started to see the effectiveness of findings take a hit. Upon further analysis, we traced this to a few root causes.

Too many findings: Low-confidence rules that were less critical to security generated too many findings, creating developer fatigue.
False positives: Some rules needed to be fine-tuned to our custom/business use-cases. Until they were, they generated a lot of false positives.
Inefficient policies: While Semgrep allowed us to create policies, too many people changing too many policies started to affect their accuracy. Teams were not able to choose the right policies for them.

To solve scaling issues with managing rules, we drafted policies which are easy to pick and choose based on the technologies being used and based on the nature of the rules. For instance:

Ruleset a: Security critical and technology agnostic (hardcoded secrets)

Ruleset b: Security critical and language-specific (Python security)

Creating such templates made life easier for developers. We started seeing more adoption of policies, and the total number of findings started to come down (with a consistent increase in fix percentage). These easy-to-understand templates also help define “what is good security?” and make sure a team enables the critical security controls when it onboards to a SAST scanner.
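A pick-and-choose template scheme like the one above can be sketched as a simple lookup. The template names, the rule names and the `policies_for` helper below are hypothetical and do not reflect our actual Semgrep policy configuration; the point is only that technology-agnostic templates always apply, while language-specific ones apply when a project uses that language.

```python
from typing import Dict, List, Optional, Set

# Hypothetical templates: "applies_to" is None for technology-agnostic rulesets.
POLICY_TEMPLATES: Dict[str, Dict[str, object]] = {
    "critical-agnostic": {"applies_to": None, "rules": ["hardcoded-secrets"]},
    "critical-python": {"applies_to": "python", "rules": ["python-security"]},
    "critical-go": {"applies_to": "go", "rules": ["go-security"]},
}


def policies_for(technologies: Set[str]) -> List[str]:
    """Return the template names a project should enable, sorted by name."""
    selected = []
    for name, template in POLICY_TEMPLATES.items():
        language: Optional[str] = template["applies_to"]  # type: ignore[assignment]
        if language is None or language in technologies:
            selected.append(name)
    return sorted(selected)


# A Python-only service gets the agnostic template plus the Python one.
python_service_policies = policies_for({"python"})
```

Keeping the mapping in one place means a team only has to declare its technologies, instead of choosing individual rules.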

Our design was focused on the following objectives.

Find all true positives critical to security. Every bug reported, however small or few in number, should be valid (high confidence, true positives).
Fine-tune rules by removing or modifying those that flag issues which are not critical to security or to the business (less impact, low confidence).
Periodically modify or update rules based on outcomes (reduce false positives).
Encourage security experts to write custom rules which are specific to their business functions (e.g. tracing, custom logging, access controls, routing).

Semgrep provided the features we needed to group these rules and create templates, which was crucial to the program’s success.
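One way to drive the periodic rule review described above is to compute a false positive rate per rule from triaged findings. This is a sketch under assumptions, not our actual tooling: the `(rule_id, is_true_positive)` input format, the 50% threshold and the minimum of 3 findings are made-up values for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Triage = Tuple[str, bool]  # (rule_id, is_true_positive), a hypothetical format


def false_positive_rates(triaged: Iterable[Triage]) -> Dict[str, float]:
    """Compute the share of findings marked as false positives, per rule."""
    totals: Counter = Counter()
    false_positives: Counter = Counter()
    for rule_id, is_true_positive in triaged:
        totals[rule_id] += 1
        if not is_true_positive:
            false_positives[rule_id] += 1
    return {rule: false_positives[rule] / totals[rule] for rule in totals}


def rules_to_review(
    triaged: Iterable[Triage], threshold: float = 0.5, min_findings: int = 3
) -> List[str]:
    """List rules that are noisy enough, on enough data, to need tuning."""
    triaged = list(triaged)
    totals = Counter(rule_id for rule_id, _ in triaged)
    rates = false_positive_rates(triaged)
    return sorted(
        rule
        for rule, rate in rates.items()
        if totals[rule] >= min_findings and rate >= threshold
    )


# Illustrative triage data, not real findings.
TRIAGED = [
    ("sqli", True), ("sqli", True), ("sqli", False),
    ("debug-log", False), ("debug-log", False), ("debug-log", True), ("debug-log", False),
    ("rare", False),
]
rates = false_positive_rates(TRIAGED)
noisy = rules_to_review(TRIAGED)
```

The `min_findings` guard avoids disabling a rule on the strength of a single triage decision.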

“Selling” SAST internally

While we worked through technical challenges, we also needed to ensure engineers warmed up to Semgrep. In the past, top-down efforts to introduce Security tools had failed. This time, we decided to use a bottom-up strategy to drive adoption. Key ideas included:

Find early adopters. We chose a business unit which was enthusiastic about trying new things (hat-tip to the Capital business unit). This allowed us to make mistakes and learn from them, all while dealing with friendly, enthusiastic developers. Once other BUs heard about Capital’s success, they were happy to try it themselves. Having a thriving Security Champions program helped drive that message.
Advertise success, fix failures. We spoke about how SAST can help in every possible forum: brown bag sessions, updates to leadership, hackathons and so on. On the other hand, every time we hit a glitch with the program, we put in a dedicated effort to fix it ASAP.
Track progress in public. We published metrics about Semgrep adoption. This helped BU leaders track progress in real-time. We called out teams with high adoption and nudged teams with lower adoption. We also leveraged OKRs to make Semgrep adoption a quarterly requirement.

All this meant that we grew from 0 to 200 applications onboarded in a little over 12 months.

What next?

While we are thrilled to get the SAST program going, we have a long way to go before we can hit the desired level of maturity. The four critical areas we need to improve to get there are listed below.

100% adoption: While all critical projects have been onboarded, we aim to onboard all projects to our SAST platform in blocking mode (i.e. a PR cannot be merged to master if there’s an open finding).
Build an effective feedback mechanism to continually improve SAST configuration. This includes enabling our security champions to continually provide feedback and improve Semgrep configuration, periodically reviewing existing rules, adding new rules after evaluation, and writing Semgrep rules for every defect discovered using other methods (e.g. external bug bounty submissions).
Enable developers to make more decisions. At its core, SAST is a way to enable developers to write secure code. This mission can only be achieved if most of the decision-making rests with them. Once we reach a certain level of awareness among engineers, we would like them to be able to write and choose what rules to apply. They should also be able to modify the cadence of scanning as needed.
Build an easy way to run periodic scans on our entire codebase. When events like log4shell happen, it’s useful to have the ability to quickly run a scan across all our repositories. Today, doing that is a bit of a herculean task. We’d like to build a tool where at a click of a button, a set of rules can be run across all (or a subset) of repositories.
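Two of the goals above, blocking mode and one-click scans across repositories, can be sketched in a few lines. This is a sketch of the idea and not something we have built: the blocking severity set, the repository paths and the helper names are assumptions. It relies on Semgrep's JSON output listing findings under `results`, with each finding's severity under `extra.severity`.

```python
import json
from typing import Dict, List

# Assumption: only findings with this severity should block a merge.
BLOCKING_SEVERITIES = {"ERROR"}


def blocking_findings(semgrep_json: str) -> List[dict]:
    """Return the findings in a Semgrep JSON report that should block a PR."""
    report = json.loads(semgrep_json)
    return [
        finding
        for finding in report.get("results", [])
        if finding.get("extra", {}).get("severity") in BLOCKING_SEVERITIES
    ]


def plan_org_scan(repos: Dict[str, str], rules_path: str) -> Dict[str, List[str]]:
    """Build one Semgrep command per repository.

    repos maps a repository name to the path of its local checkout.
    """
    return {
        name: ["semgrep", "--config", rules_path, "--json", checkout]
        for name, checkout in repos.items()
    }


# Hypothetical repositories and rule file, e.g. for a log4shell-style event.
plan = plan_org_scan(
    {"payments-api": "/src/payments-api", "ledger": "/src/ledger"},
    "rules/log4shell.yaml",
)
```

A blocking CI step would exit with a non-zero status when `blocking_findings` returns anything, while the org-wide tool would run each command in `plan` and aggregate the reports.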

Come Work With Us! 🚀

If the work we do excites you, we are actively hiring for our engineering team (including our Security team). We are always looking out for great folks. Reach out to us at tech-hiring@razorpay.com.

Read more about the work we do at RazorpayEngg

Go Consuming All Your Resources?

A debugging journey of how we fixed memory bloat on a Go service, giving us a 76% reduction on Infra cost

engineering.razorpay.com

How Razorpay’s Notification Service Handles Increasing Load

Read about the solutions implemented on Razorpay’s notification service that allowed it to overcome performance and…

engineering.razorpay.com

Distributed Tracing with Hypertrace

Solving the complex problem of distributed tracing and monitoring using Hypertrace.

engineering.razorpay.com

Written by Sandesh Mysore Anand