Nov 25, 2024
BitsLab's Sub-brand TonBit Discovers Critical Vulnerability in TON VM: Root Cause and Mitigation Explained
The TON network’s virtual machine (VM) recently underwent a significant security upgrade. The security team at TonBit, a subsidiary of BitsLab, successfully identified and helped resolve a critical vulnerability that could have led to resource exhaustion in the TON VM. The vulnerability lies in the recursive mechanism the VM uses when handling nested Continuations, which malicious smart contracts could abuse to cause system crashes and network instability.
If exploited, this vulnerability could have enabled attackers to disrupt all validator nodes for the cost of less than 1 TON token, posing a direct threat to the network’s availability. Leveraging its exceptional technical expertise, TonBit swiftly identified the issue and highlighted the risks associated with the recursive mechanism in TON’s virtual machine. This discovery prompted the TON team to implement an innovative solution, replacing the recursive control flow with an iterative approach. This adjustment effectively fixed the vulnerability and enhanced the security of the TON ecosystem.
In its latest update announcement, the TON core team extended special thanks to TonBit for its outstanding contributions to the ecosystem’s security.
This detailed security report will delve into the root cause, technical specifics, and resolution of this vulnerability. The report provides an in-depth analysis of how the vulnerability leveraged deep nesting in Continuations to construct a recursive chain that triggered resource exhaustion attacks. It also explains how malicious smart contracts could exploit this mechanism to increase the call stack depth and exhaust the host’s stack space.
Additionally, we will detail how the TonBit team addressed the issue by eliminating the design flaw in the recursive chain and implementing a collaborative, iterative mechanism as a replacement. This fix not only significantly enhanced the stability of the TON network but also offered a valuable reference for improving the foundational security of the blockchain industry.
Case Study: DoS Vulnerability in TON VM and Relevant Mitigation
Introduction
This report describes a DoS vulnerability in the TON VM and the mitigation that addresses it. The vulnerability is caused by the way the VM handles continuation nesting during contract execution. A malicious contract can construct a continuation that is deeply nested in a specific way, so that evaluating it triggers deep recursion, exhausts the host stack space, and halts the VM. The mitigation modifies the TVM’s handling of continuations and control flow. Instead of performing sequential tail calls through the continuation chain, the VM now actively iterates over the chain. This approach ensures that only constant host stack space is used, preventing stack exhaustion.
Synopsis
The TON VM, as outlined in the official documentation, is a stack-based virtual machine that employs a Continuation-Passing Style (CPS) as its control flow mechanism for both its internal processes and smart contracts. Control flow registers are accessible to contracts, providing them with flexibility.
The continuations used in TVM can be theoretically divided into three categories:
• OrdCont, or vmc_std, which holds a TON ASM slice to evaluate, is a first-class citizen in TVM; contracts can create such continuations explicitly at runtime and pass them around to implement arbitrary control flow;
• Extraordinary continuations, which usually hold OrdCont instances as components, are created by explicit iteration primitives and special implicit operations and implement the corresponding control flow mechanisms;
• The additional ArgContExt, which envelops another continuation to save control data.
During contract execution, the VM enters the main loop, where it decodes the contract slice one word at a time and dispatches each operation to the appropriate handler. Normal handlers execute their respective operations in place and return immediately.
In contrast, iteration words create an extraordinary continuation using the supplied continuations as components and jump to it with the proper context. The extraordinary continuation implements its logic in its own jump, which in turn jumps to one of its components depending on the condition. Figure 1 demonstrates this process using WHILE as an example (possible jumpout omitted).
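The WHILE flow described above can be sketched with a toy model. This is a minimal illustration, not TVM source code: `ToyVM`, `next_cont`, `run`, and `count_down` are hypothetical names, a Python callable stands in for a code slice, and jumpouts are omitted. The sketch only shows how a `WhileCont` alternates between its `cond` and `body` components by installing the opposite-phase copy of itself as the return continuation.

```python
class ToyVM:
    """Minimal state: a data stack, the c0 return continuation, and a trace."""
    def __init__(self, counter):
        self.stack = []
        self.c0 = None
        self.counter = counter
        self.trace = []


class OrdCont:
    """Ordinary continuation; a Python callable stands in for the code slice."""
    def __init__(self, name, code):
        self.name = name
        self.code = code


class WhileCont:
    """Toy extraordinary continuation for WHILE, built from cond, body, after."""
    def __init__(self, cond, body, after, chkcond):
        self.cond, self.body, self.after = cond, body, after
        self.chkcond = chkcond

    def next_cont(self, vm):
        if self.chkcond:
            # cond has just run and left a boolean flag on the stack.
            if not vm.stack.pop():
                return self.after  # loop terminated
            # Run body next, then come back in the "run cond" phase.
            vm.c0 = WhileCont(self.cond, self.body, self.after, False)
            return self.body
        # Run cond next, then come back in the "check cond" phase.
        vm.c0 = WhileCont(self.cond, self.body, self.after, True)
        return self.cond


def run(vm, cont):
    """Hypothetical driver: execute continuations until none is left."""
    while cont is not None:
        if isinstance(cont, OrdCont):
            vm.trace.append(cont.name)
            cont.code(vm)
            cont, vm.c0 = vm.c0, None  # implicit return to c0
        else:
            cont = cont.next_cont(vm)


def push_counter_positive(vm):
    vm.stack.append(vm.counter > 0)


def decrement(vm):
    vm.counter -= 1


def count_down(start):
    """Run `while counter > 0: counter -= 1` and return the execution trace."""
    vm = ToyVM(counter=start)
    cond = OrdCont("cond", push_counter_positive)
    body = OrdCont("body", decrement)
    after = OrdCont("after", lambda vm: None)
    run(vm, WhileCont(cond, body, after, chkcond=False))
    return vm.trace
```

Calling `count_down(2)` yields a trace that alternates between `cond` and `body` and ends with `after`, which mirrors the jump pattern the figure describes.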
Root Cause
In vulnerable versions of the VM, these jumps result in sequential dynamic tail calls, which require the host stack to maintain a frame for each jump (Figure 2).
Take WhileCont as an example; others omitted for brevity.
Figure 2: Tri-jump recursion to step down into nesting
Ideally, this would not pose a problem, because components are usually OrdCont instances, whose jump only saves the current context and then directs the VM to execute the slice it holds before the remaining contract slice, introducing no further recursion. However, by design, the extraordinary continuations are exposed to their components through the cc (c0 in TVM) register (the set_c0 branches above), so a contract can abuse this feature to trigger deep recursion (described later). Rather than changing the implementation of this legitimate feature, it is cleaner and easier to eliminate the recursion from the jump process of extraordinary continuations.
By repeatedly using an obtained extraordinary continuation as a component when constructing a higher-level extraordinary continuation, a contract can create a deeply nested continuation through iteration. When evaluated, such a continuation may exhaust the available host stack space, resulting in a SIGSEGV signal from the OS and termination of the VM process.
Figure 3 shows a PoC of the nesting procedure:
Figure 3: Nesting Procedure
We see that the body is extended with one WhileCont{chkcond=true} in each iteration. Executing the saved cc produced in the last iteration yields a call stack like this:
We see that the required stack space grows linearly with the nesting level, i.e., the number of iterations, which indicates possible stack exhaustion.
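The linear relationship between nesting level and host stack depth can be reproduced in a toy model. The sketch below is an assumption-laden illustration rather than the actual PoC: `Nested`, `Quit`, `build`, `max_host_depth`, and `overflows` are hypothetical names, and Python's `RecursionError` stands in for the SIGSEGV that a native process would receive.

```python
import sys


class Quit:
    """Innermost continuation: ends the chain."""
    def jump(self, depth, stats):
        stats["max_depth"] = max(stats["max_depth"], depth)
        return 0


class Nested:
    """Stand-in for an extraordinary continuation whose component is
    itself an extraordinary continuation."""
    def __init__(self, inner):
        self.inner = inner

    def jump(self, depth, stats):
        stats["max_depth"] = max(stats["max_depth"], depth)
        # Vulnerable pattern: resolve the component by calling its jump
        # directly, which adds one host stack frame per nesting level.
        return self.inner.jump(depth + 1, stats)


def build(levels):
    """Wrap the chain once per iteration, like the nesting procedure."""
    cont = Quit()
    for _ in range(levels):
        cont = Nested(cont)
    return cont


def max_host_depth(levels):
    """Return the deepest host call depth reached while evaluating."""
    stats = {"max_depth": 0}
    build(levels).jump(0, stats)
    return stats["max_depth"]


def overflows(levels):
    """True if evaluating the chain exhausts the interpreter's call stack."""
    try:
        max_host_depth(levels)
    except RecursionError:
        return True
    return False
```

In this model, `max_host_depth(n)` equals `n`, and a chain deeper than `sys.getrecursionlimit()` makes `overflows` return `True`.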
Note on Real-World Exploitation
The gas limit on real-world chains makes constructing a malicious contract rather difficult. Because the nesting process has linear complexity (the TVM design effectively prevents cheaper construction through self-reference), developing a working malicious contract is non-trivial. Quantitatively, one level of nesting results in a call sequence that consumes three host stack frames (320 bytes) in a debug binary and two (256 bytes, the latter two calls being inlined into one) in a release binary. For a validator node on a modern POSIX OS, the default stack size is 8 MiB, which is sufficient for over 30k levels of nesting in a release binary. Constructing a contract that exhausts the stack space is still possible, but it is much more difficult than the example in the previous section.
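The "over 30k levels" figure follows from dividing the default stack size by the per-level cost quoted above. The short calculation below is an upper-bound sketch: it assumes the entire 8 MiB is available to the nested jumps and ignores frames already on the stack. `max_levels` is a hypothetical helper name.

```python
STACK_BYTES = 8 * 1024 * 1024   # 8 MiB default stack size
RELEASE_BYTES_PER_LEVEL = 256   # two frames per nesting level (release binary)
DEBUG_BYTES_PER_LEVEL = 320     # three frames per nesting level (debug binary)


def max_levels(bytes_per_level, stack_bytes=STACK_BYTES):
    """Upper bound on nesting levels that fit in the given stack size."""
    return stack_bytes // bytes_per_level


release_levels = max_levels(RELEASE_BYTES_PER_LEVEL)  # 32768
debug_levels = max_levels(DEBUG_BYTES_PER_LEVEL)      # 26214
```

An attacker therefore needs on the order of 32,768 nesting levels against a release binary, all of which must be built within the gas limit.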
Mitigation
The patch modifies the behavior of jumps in the case of continuation nesting. The change is visible in the signature of the continuation jump.
Take UntilCont as an example; others omitted for brevity.
Rather than calling VmState::jump on the next continuation, which means recursively performing a tri-jump at each continuation and waiting for a return value to propagate backward, the continuation jump now resolves only the next level of the continuation and then yields control back to the VM.
Cooperatively, the VM iteratively resolves each continuation level until it encounters a NullRef, which signals the completion of the chain, as implemented in OrdCont or ExcQuitCont. During this iterative process, only one continuation jump occupies the host stack at any given time, ensuring constant stack usage.
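This cooperative scheme is essentially a trampoline. The sketch below illustrates the idea under simplifying assumptions and is not the patched TVM code: `Nested`, `Quit`, `build`, and `run_chain` are hypothetical names, and Python's `None` plays the role of the NullRef that ends the chain.

```python
class Quit:
    """Innermost continuation: returns None (the NullRef analog)."""
    def jump(self):
        return None


class Nested:
    """Toy extraordinary continuation wrapping another continuation."""
    def __init__(self, inner):
        self.inner = inner

    def jump(self):
        # Resolve only one level and hand the next continuation back to
        # the VM instead of calling its jump directly.
        return self.inner


def build(levels):
    """Create a chain nested `levels` deep."""
    cont = Quit()
    for _ in range(levels):
        cont = Nested(cont)
    return cont


def run_chain(cont):
    """VM-side loop: at most one jump is on the host stack at any time."""
    steps = 0
    while cont is not None:
        cont = cont.jump()
        steps += 1
    return steps
```

With this structure, `run_chain(build(50_000))` completes in 50,001 steps at constant call depth, whereas a chain of that depth resolved through direct recursive calls would exceed Python's default recursion limit.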
Conclusion
For services that require availability, the use of recursion can be a potential attack vector. When user-defined logic is involved, enforcing recursion termination can be challenging. This DoS vulnerability is an extreme case in which a normal feature is abused in an unexpected way despite hard limits on other resources. Similar issues can occur whenever recursion depth depends on user input, which is common among the control flow primitives of VMs.
This report analyzes the technical details, root cause, and potential exploitation methods of a critical DoS vulnerability identified in the TON Virtual Machine (VM). It also highlights the efficient solution proposed by the TonBit team, which replaced the VM’s recursive jump mechanism with an iterative processing approach. This innovative fix mitigated a vulnerability that could have led to network paralysis, offering robust security assurances for the TON ecosystem.
The incident underscores TonBit’s deep expertise in blockchain infrastructure security and its pivotal role as the TON network’s official Security Assurance Provider (SAP). As an indispensable security partner within the TON ecosystem, TonBit remains at the forefront of safeguarding blockchain network stability and user asset security. From vulnerability detection to solution design, TonBit has leveraged its technical prowess and profound understanding of blockchain technology to establish a solid foundation for the long-term growth of the TON network.
Moreover, TonBit continues to focus on enhancing network security architecture, protecting user data, and improving the security of blockchain application scenarios. Moving forward, TonBit is committed to driving security innovation and advancing the development of secure technologies, ensuring ongoing support for the health and growth of the TON ecosystem and the broader blockchain industry. This successful vulnerability remediation effort has received high praise from the TON core team, further solidifying TonBit’s position as a leader in blockchain security and demonstrating its steadfast commitment to advancing decentralized ecosystems.
TonBit Official Website: https://www.tonbit.xyz/
TonBit Official Twitter: https://x.com/tonbit_
TonBit Official Telegram: https://t.me/BitsLabHQ
LinkedIn: https://www.linkedin.com/company/tonbit-team/
Blog: https://www.tonbit.xyz/#blogs
For Audit Inquiries, Contact Us via Telegram: @starchou