Distilling Lightweight Specialized Models for Smart Contract Auditing!

February 17, 2026

🎯 Core Idea: Teacher-Student Distillation

🕌 HKT-SmartAudit Ecosystem

Our ecosystem provides a comprehensive solution for smart contract auditing with:

🛠️ A suite of specialized models for different audit needs
💻 User-friendly interfaces for seamless interaction
⚡ Powerful backend infrastructure for efficient processing

With these specialized models, the ecosystem helps users quickly and accurately identify potential vulnerabilities in their smart contracts and provides recommendations for fixing them.

🕌 Fast Glance Notes

🚀 Run Specialized Models

All notebooks are beginner-friendly! Simply add your smart contract dataset, click "Run All", and receive a detailed audit report. Available for both Colab and local execution.

🌟 Featured Models

Model List	Base Model	Notebooks	Model Source
HKT-DeepSeek-R1 (8b)	DeepSeek-R1-Distill-Llama-8B	-	🤗 Download
HKT-Llama-3.1 (8b)	Meta-Llama-3.1-8B-Instruct-bnb-4bit	-	🤗 Download
HKT-Gemma-2 (9b)	gemma-2-9b-bnb-4bit	-	🤗 Download
HKT-Mistral-Nemo (12b)	Mistral-Nemo-Instruct-2407-bnb-4bit	-	🤗 Download
HKT-Qwen-2.5-coder (7b)	Qwen/Qwen2.5-Coder-7B-Instruct	-	🤗 Download
HKT-Llama-3.2 (1b)	Llama-3.2-1B-Instruct-bnb-4bit	-	🤗 Download
FTAudit-Llama3 (8B)	Meta-Llama-3-8B-Instruct-bnb-4bit	▶️ Colab	🤗 Download
FTAudit-Mistral (7B)	mistral-7b-instruct-v0.3-bnb-4bit	▶️ Colab	🤗 Download
HKT_qwen3 (14b)	Qwen3-14B-Instruct	-	🤗 Download
HKT_GLM_4_7 (flash)	GLM-4.7-flash	-	🤗 Download
HKT_gpt-oss (20b)	gpt-oss-20B	-	🤗 Download

🔍 Earlier Research Models

Model List	Model Source
FTAudit-Codellama-v0.2 (13B)	⬇Download
FTAudit-Codellama (7B)	⬇Download
FTAudit-Llama2 (7B)	⬇Download

🦙 FTAudit.ai News

🆕 New Release! Adapting our method to smaller models: HKT-vul-llama-3_2-1b
🆕 New Experiment! Testing with reasoning models: HKT-vul-DeepSeek-R1-8b

Teacher-Student Distillation

We have developed an automatic approach to distill knowledge: Distillation

📊 Experimental Datasets & Results

We rigorously evaluated our models across three distinct datasets. All evaluation metrics, reports, and detailed analyses are available in their respective repositories:

1. Standard Vulnerability Set

Dataset: SmartBugs-curated [1] (143 contracts, 182 DASP-classified vulnerabilities)
Scope: 10 common vulnerability types
Evaluation Results: evaluation
→ Precision/Recall metrics, confusion matrices, per-vulnerability breakdown
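
For readers unfamiliar with the per-vulnerability breakdown, the following is a minimal sketch of how precision and recall per vulnerability type could be computed from labeled findings. It is not the project's evaluation script: the function name, the `(contract, type)` pair representation, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def per_type_precision_recall(truth, predicted):
    """Compute precision and recall for each vulnerability type.

    Both arguments are iterables of (contract_id, vulnerability_type) pairs.
    This pair representation is an assumption, not the project's data format.
    """
    truth, predicted = set(truth), set(predicted)
    tp = Counter(v for _, v in truth & predicted)   # correctly reported
    fp = Counter(v for _, v in predicted - truth)   # reported but not labeled
    fn = Counter(v for _, v in truth - predicted)   # labeled but missed
    report = {}
    for vuln in sorted(set(tp) | set(fp) | set(fn)):
        p_den = tp[vuln] + fp[vuln]
        r_den = tp[vuln] + fn[vuln]
        report[vuln] = {
            "precision": tp[vuln] / p_den if p_den else 0.0,
            "recall": tp[vuln] / r_den if r_den else 0.0,
        }
    return report


# Hypothetical toy data, not taken from SmartBugs-curated.
truth = [
    ("a.sol", "reentrancy"),
    ("b.sol", "reentrancy"),
    ("c.sol", "access_control"),
]
predicted = [
    ("a.sol", "reentrancy"),
    ("c.sol", "reentrancy"),
    ("c.sol", "access_control"),
]
report = per_type_precision_recall(truth, predicted)
```

In this toy case the model finds one of two labeled reentrancy issues and raises one false alarm, so reentrancy precision and recall are both 0.5, while access control scores 1.0 on both.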

2. Real-World Projects Set

Dataset: Code4rena-audited [13] (6,454 contracts from 72 projects)
Key Stats: 243 issue contracts, 784 high/medium severity findings
Evaluation Results: evaluation
→ Rank-optimization analysis, complexity-scaling tests, gas optimization impact

3. CVE Benchmark Set

Coverage: 13 critical CVEs (Jan 2025 snapshot)

Categories:

["Integer Overflow", "Access Control", "Logic Bugs", "Other Critical"]

🔗 Links and Resources

Type	Links
📚 Documentation & Wiki	Read Our Wiki
🥇 Benchmarking	Details
🌐 Evaluation	Reports

🐞 Vulnerability Categories

Smart contract vulnerabilities can be divided into two categories: machine-auditable and machine-unauditable. Machine-auditable vulnerabilities can be identified by conventional tools, whereas machine-unauditable vulnerabilities require human auditors for detection. In our study, we label these machine-auditable vulnerabilities as “detectable vulnerabilities” and the machine-unauditable ones as “undetectable vulnerabilities.” Our fine-tuning dataset incorporates both types, comprising a total of 112 distinct vulnerability labels.

🔍 Detectable Vulnerabilities

These vulnerabilities can be identified by conventional automated tools.

Vulnerability Type	Description
Reentrancy	Allows a function to be interrupted and re-entered before completion
Predictable Randomness	Use of deterministic values for random number generation
Integer Overflow	Arithmetic operations exceeding the max value for the data type
Unchecked Low-Level Calls	Failure to check return values of low-level function calls
Front-running	Exploitation of transaction ordering in the blockchain
Denial-of-Service	Preventing legitimate users from accessing the contract
Access Control	Improper restrictions on who can execute certain functions
Time Manipulation	Exploitation of block timestamp dependencies
Uninitialized Struct	Using struct variables without proper initialization
Short Address	Exploiting EVM padding behavior in function parameters

🕵️ Undetectable Vulnerabilities

These vulnerabilities require human auditors for detection due to their complexity or context-dependent nature.

Price Manipulation
Lack of Input Validation
Hidden Ownership Change
Unrestricted Initialization
State Manipulation
Misuse of msg.value within a loop
Incorrect Allowance Update Logic
Data Corruption
Lack of Withdrawal Function
Missing Authorization Checks
Storage Error
Hash Collision
Misuse of shr without Context
Uninitialized Return Variable
Constructor Misdeclaration
Potential Unintended Behavior
Missing onlyOwner Modifier
Shadowing State Variable
Uninitialized State Variable
Improper Struct Initialization
Improper Initialization Check
Withdrawal Functions without Proper Access Control
Direct Modification of Array Length
Misuse of Assembly return
Unchecked Transfer
Unsafe Enum Conversion
Impracticality of Exact Match
Lack of Functionality
Mapping Deletion
Shadow State Variable
Misinterpretation of Unsigned Integer Comparison
Redundant Condition
Integer Division Resulting in Loss of Precision
Atomicity and Ordering
Constructor Inheritance
Constructor Execution Order
Unused Return
Overshadowing Built-in XX
Incorrect Data Type
Uninitialized Function Pointer
Use of Variable before Declaration
Improper Scope of Variable
Incorrect Scope of Variable max
Incorrect Constructor Call
Unbounded Loop with External Calls
Missing State Variable Declarations
Incorrect Increment Operation
Incorrect Operation
Unbounded Gas for External Calls
Incorrect Use of assert
Unnecessary Comparison
Deprecated Features
Unpredictable Initialization
Redundant Statements and Syntax Errors
Incomplete Implementation
Inefficient State Modifications in a Loop
Misuse of Mapping Getter
Cancellation Authority
Discrepancy in Balance Calculation
Dependence on External Data
Complexity
Funding Rate Calculation Precision
Minting Permissions
Flash Loan Fee Manipulation
Centralized Risk
Single Points of Failure
Potential Ownership Hijacking
Insecure Ownership Deletion
Constructor Syntax
Uninitialized creator Variable
Incorrect Require Condition in withdraw Function
Uninitialized Constructor
Missing Zero-Address Check
Lack of Access Control for Sensitive Functions
Signature Malleability
Incorrect Signature Verification
Potential Inconsistency
Oracle Manipulation
Oracle Downtime
Missing Nonce Increment
Off-chain Signature
Replay Attack
Owner's Absolute Control Over Critical Functions
Misuse of msg.sig for Authorization
Unchecked Return Values for ERC20 Transfers
Potential for Inaccurate Data
No Validation of the Price Source
Lack of Proper Error Handling in SafeMath Library
Missing SafeMath Functions for Division
Unchecked Division by Zero
Overflow and Underflow in transfer and transferFrom Functions
Single Point of Failure
Deprecated throw
Hardcoded Timestamp
Use of Deprecated Functions
Logic errors in state machine implementations
Potential Token Lockup
Unlimited Token Approval
Improper Event Emission
Incorrect Handling of ETH Transfers
Lack of Pull Payment Implementation
Incorrect use of fallback functions

Newly Discovered Vulnerabilities:

Our models have successfully identified 13 vulnerabilities across 4 different types that were not detected in the audit reports from Code4rena.
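
One way to picture the comparison behind this count is a set difference between model findings and the findings already present in the audit reports, grouped by vulnerability type. The sketch below assumes findings are represented as `(contract, type)` pairs; `group_unreported` and the `Token.sol` entry are hypothetical, and any real comparison would still need manual confirmation of each candidate.

```python
from collections import defaultdict


def group_unreported(model_findings, audit_findings):
    """Group model findings that are absent from the audit reports by type.

    Findings are (contract, vulnerability_type) pairs; this representation
    is an assumption made for illustration.
    """
    reported = set(audit_findings)
    by_type = defaultdict(list)
    for contract, vuln_type in sorted(set(model_findings) - reported):
        by_type[vuln_type].append(contract)
    return dict(by_type)


# Toy input: Token.sol and the audit list are hypothetical.
model_findings = [
    ("Vault.sol", "Unprotected Function"),
    ("USDV.sol", "Unprotected Function"),
    ("yVault.sol", "Potential Token Lockup"),
    ("Token.sol", "Reentrancy"),
]
audit_findings = [("Token.sol", "Reentrancy")]
unreported = group_unreported(model_findings, audit_findings)
```

Here the reentrancy finding is dropped because the audit already reported it, leaving three candidates across two types.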

📜 Full List of Undetected Vulnerabilities

Potential Token Lockup:

-- In yVault.sol, the earn function transfers the available tokens to the controller without any checks or restrictions. If the controller contract is not properly implemented or has a vulnerability, it could potentially lock up the tokens and make them irretrievable.

function earn() external {
    uint256 _bal = available();
    token.safeTransfer(address(controller), _bal);
    controller.earn(address(token), _bal);
}

-- The mint function of sYETIToken.sol sends the YETI tokens to the sYETI contract using the sendToSYETI function of the yetiToken contract. If the yetiToken contract has a vulnerability or is not properly implemented, it could lock up the tokens and make them irretrievable.

function mint(uint256 amount) public returns (bool) {
    User memory user = users[msg.sender];
    uint256 shares = totalSupply == 0 ? amount : (amount * totalSupply) / effectiveYetiTokenBalance;
    user.balance += shares.to128();
    user.lockedUntil = (block.timestamp + LOCK_TIME).to128();
    users[msg.sender] = user;
    totalSupply += shares;
    yetiToken.sendToSYETI(msg.sender, amount);
    effectiveYetiTokenBalance = effectiveYetiTokenBalance.add(amount);
    emit Transfer(address(0), msg.sender, shares);
    return true;
}

Insufficient Input Validation:

In sYETIToken.sol, the setTransferRatio function checks whether the new ratio is zero but applies no upper bound or overflow check. A malicious owner could set a very high transferRatio, potentially manipulating the rebasing mechanism.

Unlimited Token Approval:

In the initialize function of IdleYieldSource.sol, the contract approves the idleToken to spend an unlimited amount of underlyingAsset tokens using safeApprove with type(uint256).max. This poses a risk if the idleToken contract is compromised or has a vulnerability that allows it to drain the approved tokens.

    function initialize(address _idleToken) public initializer {
        __Ownable_init();
        idleToken = _idleToken;
        underlyingAsset = IIdleToken(idleToken).token();
        IERC20Upgradeable(underlyingAsset).safeApprove(idleToken, type(uint256).max);
        emit IdleYieldSourceInitialized(idleToken);
    }
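
As a rough illustration of the pattern described above, the sketch below scans Solidity source text for approval calls that pass the maximum uint256 value. It is a naive line-based regex heuristic, not the method used by the HKT or FTAudit models, and it will miss approvals split across lines or hidden behind helper functions. The function name and the sample string are assumptions for illustration.

```python
import re

# Naive pattern: an approve-style call whose arguments mention the maximum
# uint256 value, written as type(uint256).max or the older uint256(-1).
UNLIMITED_APPROVAL = re.compile(
    r"\b(?:safeApprove|approve|safeIncreaseAllowance)\s*\("
    r"[^;]*?(?:type\(uint256\)\.max|uint256\(-1\))"
)


def find_unlimited_approvals(source):
    """Return (line_number, stripped_line) for each suspicious approval."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if UNLIMITED_APPROVAL.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Sample assembled from calls quoted in this section, for illustration only.
sample = """function initialize(address _idleToken) public initializer {
    idleToken = _idleToken;
    IERC20Upgradeable(underlyingAsset).safeApprove(idleToken, type(uint256).max);
    IERC20Upgradeable(_vault).approve(_receiver.receiver, amountToSend);
}"""
hits = find_unlimited_approvals(sample)
```

Only the safeApprove call with type(uint256).max is flagged; the approve call with a bounded amountToSend argument is not.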

In the _sendForReceiver function of NFTXSimpleFeeDistributor.sol, the contract approves the receiver contract to spend an unlimited amount of tokens using IERC20Upgradeable(_vault).approve(_receiver.receiver, amountToSend). This practice is generally discouraged as it poses a security risk if the receiver contract is compromised or behaves maliciously.

  function _sendForReceiver(FeeReceiver memory _receiver, uint256 _vaultId, address _vault, uint256 amountToSend) internal virtual returns (bool) {
    if (_receiver.isContract) {
      IERC20Upgradeable(_vault).approve(_receiver.receiver, amountToSend);
      
      bytes memory payload = abi.encodeWithSelector(INFTXLPStaking.receiveRewards.selector, _vaultId, amountToSend);
      (bool success, ) = address(_receiver.receiver).call(payload);
      return success && IERC20Upgradeable(_vault).allowance(address(this), _receiver.receiver) == 0;
    } else {
      IERC20Upgradeable(_vault).safeTransfer(_receiver.receiver, amountToSend);
    }
  }

The _maxApprove function of SingleTokenJoinV2.sol grants unlimited token approval to the specified spender if the current allowance is less than the contract's token balance. This could allow the approved spender to transfer more tokens than intended. The contract also calls _maxApprove for each token in the basket before calling joinPool, so if any token in the basket is malicious or has a vulnerability, the approved tokens could be drained from the contract.

The Vault.sol contract does not handle the return values of some external function calls. For example, in the earn function, the return value of _controller.earn is not checked. If the external function fails or returns an unexpected value, it could lead to inconsistencies or unexpected behavior.

The lockWithPermit function of the XDEFIDistribution.sol contract uses the permit function of the IEIP2612 interface to approve the token transfer. However, there is no check to ensure that the permit call was successful before proceeding with the token transfer.

Unprotected Function:

Unprotected setMinter Function: The setMinter function of the Position.sol contract allows the contract owner to set any address as a minter. However, there is no mechanism to prevent the owner from accidentally or maliciously setting an unintended address as a minter.

Unprotected setParams() Function: The setParams() function of the USDV.sol contract allows the DAO to set the blockDelay parameter. However, there are no restrictions on the value that can be set. An attacker controlling the DAO can set an arbitrary blockDelay, potentially disrupting the contract's behavior or enabling attacks. It is important to validate and restrict the range of values that can be set for critical parameters.

Unprotected setMin Function: The setMin function of Vault.sol allows the strategist to set the min value, which determines the minimum amount of tokens available for earning. However, there is no upper limit or validation on the _min value. A malicious strategist could set a very high min value, effectively locking up a significant portion of the vault's tokens.

Unprotected migrate Function: The migrate function of MapleLoan.sol allows the factory contract to migrate the contract to a new implementation. However, there are no restrictions on the migrator_ address or the arguments_ passed to the function. This could lead to unauthorized migration or loss of funds if not properly implemented.

🏋🏿 Training Details

😘 Thank You to

Unsloth AI for providing the Faster Interface Plan, accelerating our evaluations.