Software supply-chain vulnerabilities: A close look on code

Vulnerabilities in the software supply chain are among the most daunting problems a company can face. They typically affect a wide range of parties who use the affected software, or parts of it, and who may distribute it further downstream. As a result, investigating and mitigating such vulnerabilities and their impact becomes an enormous task.

One key aspect of prevention and mitigation is becoming aware of vulnerabilities in your own code. That requires knowing your code base in depth, including the parts you did not write yourself.

A Software Composition Analysis (SCA) tool provides invaluable intelligence through source code analysis. Whether you use third-party code directly or through a fork, it can identify potentially vulnerable code sections by matching them against known vulnerabilities in the many existing source-code projects.
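The matching idea can be illustrated with a minimal sketch. It assumes a hypothetical in-memory index of fingerprints for known-vulnerable snippets; the snippet, the index, and the advisory label are placeholders invented for illustration, and real SCA tools rely on far richer data and matching techniques than an exact hash.

```python
import hashlib


def fingerprint(source: str) -> str:
    """Hash a snippet after dropping blank lines and surrounding whitespace."""
    normalized = "\n".join(
        line.strip() for line in source.splitlines() if line.strip()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical snippet standing in for a function with a known vulnerability.
KNOWN_VULNERABLE_SNIPPET = """
int copy_name(char *dst, const char *src) {
    strcpy(dst, src);
    return 0;
}
"""

# Hypothetical index: fingerprint -> advisory label (a placeholder, not a real ID).
KNOWN_FINGERPRINTS = {
    fingerprint(KNOWN_VULNERABLE_SNIPPET): "EXAMPLE-ADVISORY-0001",
}


def match_known_vulnerability(source: str):
    """Return the advisory label if the snippet matches the index, else None."""
    return KNOWN_FINGERPRINTS.get(fingerprint(source))


# A forked copy with different indentation still matches the index.
forked_copy = (
    "int copy_name(char *dst, const char *src) {\n"
    "  strcpy(dst, src);\n"
    "  return 0;\n"
    "}\n"
)
print(match_known_vulnerability(forked_copy))
```

Normalizing whitespace before hashing is a deliberate simplification: it survives reindentation but not renamed variables or partial copies, which is one reason dedicated tooling is worth considering.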

For example, you may have forked parts of the OpenSSL project without realizing that you also imported known vulnerabilities into your own source code.

Other processes, such as code reviews, are equally crucial to ensuring that your code is secure, especially when it incorporates third-party code, whether by using a source-code project directly or by accepting contributions such as third-party pull requests.

To give an example, a vulnerability (CVE-2021-42574, commonly referred to as “Trojan Source”) recently made headlines by hiding malicious code with bi-directional Unicode characters. Obfuscation is a long-used technique for concealing malicious code (think of Base64 encoding as a very basic means of doing so), but the use of bi-directional Unicode characters adds a new twist. Researchers at the University of Cambridge analyzed the issue, coordinated its disclosure, and made it public [1][2].
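To make the Base64 comparison concrete, here is a small sketch of why even basic encoding defeats a naive keyword search. The payload string and the `naive_review` helper are hypothetical stand-ins chosen for illustration; nothing is executed.

```python
import base64

# Harmless stand-in for malicious logic; it is only treated as text here.
payload = "os.system('echo pwned')"
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")


def naive_review(text: str) -> bool:
    """Return True if a simple keyword search would flag the text."""
    return "os.system" in text


print(naive_review(payload))   # the plain payload is flagged
print(naive_review(encoded))   # the encoded form is not
print(base64.b64decode(encoded).decode("utf-8") == payload)  # it decodes back
```

The encoded form can never contain the keyword because the Base64 alphabet has no `.` character. Unlike Base64, however, the bi-directional technique leaves code that still looks like ordinary, readable source to the reviewer.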

Bi-directional Unicode characters are needed to support languages written in different directions: left to right (e.g., English) and right to left (e.g., Arabic). To that end, the Unicode standard includes a set of bi-directional control characters, including ones that change the direction for a group of characters so that both directions can be mixed within a single text.
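As a rough illustration, the explicit directional control characters can be listed with Python's standard `unicodedata` module. The set of bidi class names below reflects the embedding, override, and isolate controls; the scan range and the helper name `list_bidi_controls` are choices made for this sketch.

```python
import unicodedata

# Bidi classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {
    "LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI",
}


def list_bidi_controls():
    """Return (code point, name, bidi class) for controls in U+2000..U+206F."""
    found = []
    for cp in range(0x2000, 0x2070):
        ch = chr(cp)
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in BIDI_CONTROL_CLASSES:
            found.append((f"U+{cp:04X}", unicodedata.name(ch), bidi_class))
    return found


controls = list_bidi_controls()
for code_point, name, bidi_class in controls:
    print(code_point, bidi_class, name)
```

The loop reports nine characters, U+202A through U+202E and U+2066 through U+2069. Implicit marks such as U+200E (LEFT-TO-RIGHT MARK) are deliberately left out of this sketch.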

Coming back to the attack: in essence, it relies on source code being displayed differently during code review than it is processed during compilation or interpretation. Bi-directional control characters placed in comments or string literals reorder how the code is displayed without changing the order in which the compiler or interpreter reads it. As a result, a person or process reviewing the code may see completely different logic from what is actually executed, and the reviewer remains unaware of the malicious logic.
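One way to see the gap between the two views is to make the control characters visible in the order the compiler reads them. The sketch below does this for a single line constructed for illustration in the style of the published examples; the helper `reveal_bidi` and the sample line are assumptions of this sketch, and how a given editor renders the line may vary.

```python
import unicodedata

# Bidi classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {
    "LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI",
}


def reveal_bidi(text: str) -> str:
    """Replace each bidi control character with a visible <CLASS> token."""
    parts = []
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        parts.append(f"<{bidi_class}>" if bidi_class in BIDI_CONTROL_CLASSES else ch)
    return "".join(parts)


# Illustrative line, written with escapes so no control character hides here.
# A bidi-aware viewer may show the check as if it sat after the comment.
line = "/*\u202e } \u2066if (is_admin)\u2069 \u2066 begin admins only */"

print(reveal_bidi(line))
# → /*<RLO> } <LRI>if (is_admin)<PDI> <LRI> begin admins only */
```

In logical order the whole line is one comment, so the `if (is_admin)` check a reviewer might believe they see is never compiled.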

Vendors and maintainers are still working out how to handle the situation. Most responses so far raise awareness through direct warnings or logging whenever bi-directional Unicode characters are present. After all, removing support for these characters entirely would also break completely valid use cases and does not appear feasible.

Overall, the main advice left for potentially affected users and companies is to use products and tools that handle bi-directional Unicode characters consistently or, at the very least, issue warnings that make the malicious code logic detectable. This may mean that code review processes have to be redesigned, though.
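Where existing tooling does not yet warn, a small check can be added to the review pipeline. The following is a minimal sketch, assuming UTF-8 source files; the function names and the warning format are choices made for this example and do not reflect any particular product.

```python
import unicodedata
from pathlib import Path

# Bidi classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {
    "LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI",
}


def find_bidi_controls(text: str):
    """Yield (line number, column, bidi class) for each control character."""
    for line_no, line in enumerate(text.split("\n"), start=1):
        for column, ch in enumerate(line, start=1):
            bidi_class = unicodedata.bidirectional(ch)
            if bidi_class in BIDI_CONTROL_CLASSES:
                yield line_no, column, bidi_class


def scan_paths(paths):
    """Print a warning per finding and return the total number of findings."""
    findings = 0
    for name in paths:
        text = Path(name).read_text(encoding="utf-8", errors="replace")
        for line_no, column, bidi_class in find_bidi_controls(text):
            print(f"{name}:{line_no}:{column}: warning: "
                  f"bi-directional control character {bidi_class}")
            findings += 1
    return findings
```

Calling `scan_paths(sys.argv[1:])` from a pre-commit hook or CI job and failing the build on a non-zero count is one option. Because legitimate right-to-left text may trip the check, treating findings as prompts for manual review rather than as proof of an attack seems the safer default.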

In the end, be mindful of such attack vectors, analyze them in your own context, consider an SCA solution, and stay secure!