Qubole: Fast-tracking code reviews with DeepSource's automated analysis

"Thanks to DeepSource, all our code quality practices are now automated. The DeepSource checks are working well for us. Along with using it for Python, we are planning to integrate it with some of our upcoming Go projects."
— Joy Lal Chattaraj, Member of Technical Staff, Qubole
30 bug-risks resolved
273 total issues resolved
98,000 lines of code analyzed

About Qubole Data Service (QDS)

Qubole Data Service (QDS) is the #1 cloud-native big data platform that revolutionizes the way companies access insights from their data and make them actionable for their business use cases. It serves some of the largest data-driven companies such as Lyft, Expedia, Box, and Oracle.

Anything ‘data’ nowadays is incomplete without mention of the smart minds behind it: the data engineers, analysts, and data scientists who form Qubole’s major user base. They access the Qubole Data Service API in their day-to-day work through qds-sdk-py, a Python SDK. The SDK provides an easy-to-use command-line interface that lets users run Hive, Hadoop, Pig, Presto, and shell commands synchronously, or submit a command and check its status against QDS.

Background

Qubole has one or two developers from each team working on qds-sdk-py, which brings the total to 15-20 developers. According to Joy Lal Chattaraj, Member of Technical Staff at Qubole, to maintain the project’s health they relied on unit test coverage and de facto standards such as PEP8 and compact function definitions for any new code added. All of these conventions were enforced manually during code reviews.

Challenge

With multiple developers from different teams working on the project:

Solution

Joy realized that automating code reviews would be a win-win for everyone. In August 2019, he set up DeepSource to run static code analysis on the project. The straightforward configuration required zero technical support, and the analysis was up and running in a few minutes. The team’s onboarding followed shortly after.

Automating the code review process

Automation proved to be a savior for Qubole’s developers. DeepSource scans run with every pull request, and flag the issues directly in GitHub checks within seconds. What happens next?

Shorter feedback loop

The developers conduct the first round of code review themselves and fix the flaws detected before involving the reviewer, saving both the developer and reviewer a lot of time. Recalling one such instance, Joy says:

"One of the developers had added a new feature which had a few uninitialized variables, PEP8 standards violations, and a few more irregularities. DeepSource automatically highlighted those issues and the developer had fixed them even before someone reviewed the code manually, saving a lot of the reviewer's time."

Faster releases with minimal flaws

With DeepSource, Qubole’s review time has decreased threefold, the feature release cycle has picked up pace, and the scope for missing flaws, whether obvious or elusive, has reduced considerably. A few instances of the flaws detected:

FLK-F821: Undefined name

PYL-E1121: Too many positional arguments for method call
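To make these two issue types concrete, here is a minimal sketch of code that would typically trigger them. It is not taken from qds-sdk-py: the CommandClient class and all function names are hypothetical, chosen only for illustration. Both defects are valid Python syntax, so they surface only when the offending line runs, which is why a static check before review is useful.

```python
class CommandClient:
    """Hypothetical stand-in for an SDK client; not part of qds-sdk-py."""

    def submit(self, query):
        # Accepts exactly one argument besides self.
        return "submitted: " + query


def undefined_name_example():
    # FLK-F821 (undefined name): 'command_status' is never assigned anywhere,
    # so this line raises NameError when the function is called.
    return command_status  # noqa: F821


def too_many_args_example(client):
    # PYL-E1121 (too many positional arguments): submit() takes one argument
    # besides self, but two are passed, so the call raises TypeError.
    return client.submit("show tables", "extra")


def fixed_example(client):
    # Corrected version: the name is assigned before use and the call
    # matches the method signature.
    command_status = client.submit("show tables")
    return command_status
```

Calling fixed_example(CommandClient()) returns the submitted string, while the two faulty functions fail only at call time, a failure mode that is easy to miss in a manual review of a large diff.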

Mandatory checks on GitHub pull requests

Given the accuracy of the reported issues and the low rate of false positives, Joy and his team saw an evident improvement in the quality metrics of the code base. That’s when he decided to make the checks mandatory: until all the issues flagged by DeepSource are resolved, neither the developer nor the reviewer can merge the code. This has helped Qubole block pull requests that do not comply with the project’s coding standards.

Tracking test coverage, continuously

At Qubole, the team aims for 100% unit test coverage on all incoming code and above 80% for the project overall, and tracks both regularly. Already having a tool in place that tracks test coverage without any overhead was a bonus.

DeepSource reports and updates the test coverage status after every run. It helps the developers ensure that the defined threshold is maintained in every pull request that is merged to master. “Test coverage is the check I like the most,” Joy says, “and the capability to integrate it with pull requests is icing on the cake.”
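The threshold logic described above can be sketched in a few lines. This is only an illustration of the stated targets (100% for incoming code, above 80% overall), not DeepSource's implementation; the function name coverage_gate and its parameters are assumptions made for this example.

```python
def coverage_gate(new_code_pct, overall_pct,
                  new_code_target=100.0, overall_floor=80.0):
    """Return a list of failure messages; an empty list means the gate passes.

    Hypothetical helper: the default thresholds mirror the targets
    mentioned in the article, nothing more.
    """
    failures = []
    # Incoming code is expected to be fully covered.
    if new_code_pct < new_code_target:
        failures.append(
            "new code coverage %.1f%% is below %.1f%%"
            % (new_code_pct, new_code_target)
        )
    # Overall coverage must be strictly above the floor ("above 80%").
    if overall_pct <= overall_floor:
        failures.append(
            "overall coverage %.1f%% is not above %.1f%%"
            % (overall_pct, overall_floor)
        )
    return failures
```

A pull request with fully covered new code and 85% overall coverage would pass with an empty list, while one with 92.5% on new code and 79% overall would report two failures.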


Interested in giving DeepSource a try?

Sign up with your GitHub or GitLab account and set up analysis in under 5 minutes, or request a demo to take a closer look and understand how DeepSource can be useful for your use case.

Automate objective parts of code reviews
