Researchers with the University of Cambridge have discovered a bug that affects most computer code compilers and many software development environments. At issue is a component of the digital text encoding standard Unicode, which allows computers to exchange information regardless of the language used. Unicode currently defines more than 143,000 characters across 154 different language scripts (in addition to many non-script character sets, such as emojis). Specifically, the weakness involves Unicode’s bi-directional or ‘Bidi’ algorithm, which handles displaying text that includes mixed scripts with different display orders, such as Arabic — which is read right to left — and English (left to right). But computer systems need to have a deterministic way of resolving conflicting directionality in text. Enter the ‘Bidi override’, which can be used to make left-to-right text read right-to-left, and vice versa.
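One way to picture the kind of check a mitigation might perform is a simple scan of source text for Bidi control characters. The sketch below is illustrative only and is not taken from the researchers' paper or from any particular tool; the names `BIDI_CONTROLS` and `find_bidi_controls` are hypothetical. It assumes that flagging the nine Unicode embedding, override and isolate formatting characters is enough for a first pass, and it ignores the fact that such characters can be legitimate in some strings and comments.

```python
# Illustrative sketch: flag Unicode Bidi control characters in source text.
# The characters are written as escapes so this file contains none itself.
BIDI_CONTROLS = {
    "\u202a": "LEFT-TO-RIGHT EMBEDDING",
    "\u202b": "RIGHT-TO-LEFT EMBEDDING",
    "\u202c": "POP DIRECTIONAL FORMATTING",
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
    "\u2066": "LEFT-TO-RIGHT ISOLATE",
    "\u2067": "RIGHT-TO-LEFT ISOLATE",
    "\u2068": "FIRST STRONG ISOLATE",
    "\u2069": "POP DIRECTIONAL ISOLATE",
}


def find_bidi_controls(text):
    """Return (line, column, name) for each Bidi control character found.

    Line and column numbers are 1-based and count characters, not bytes.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, BIDI_CONTROLS[ch]))
    return hits


if __name__ == "__main__":
    # Hypothetical snippet with an override hidden inside a comment.
    sample = "access = 'user'\nif check:  # \u202e hidden \u2066\n    pass\n"
    for line_no, col, name in find_bidi_controls(sample):
        print(f"line {line_no}, column {col}: {name}")
```

A check along these lines could run in a code review step or a pre-commit hook, so that a reviewer is warned whenever the displayed order of a line may differ from the order the compiler reads.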
Jonathan Knudsen, Senior Security Strategist at Synopsys, commented on the announcement: “Boucher and Anderson’s paper Trojan Source: Invisible Vulnerabilities explores how Unicode control characters could allow malicious actors to insert vulnerabilities in source code. At the heart of the problem are differences between how source code is displayed to developers and how it is interpreted by the compiler. Using Unicode control characters, the researchers were able to construct source code that appears to behave one way, but in fact behaves differently.
“Trojan Source highlights the fact that nearly all development teams use open source components as a foundation for their applications. An attacker could contribute source code to an open source component that appears innocuous but has a nefarious purpose. This was always a possibility, but Trojan Source makes it easier to disguise the intent of malicious code.
“The entire ecosystem is reacting with warnings and mitigations about Unicode control characters found in source code, as detailed in the paper.
“Meanwhile, good cybersecurity during application development is a necessity, just as it always has been. Threat modelling helps flush out design vulnerabilities, while automated testing helps locate vulnerabilities during implementation. Software Composition Analysis (SCA), in particular, helps developers manage the open source components they’ve used and keep on top of evolving known vulnerabilities in those components.”