Smart Contract Vulnerability Detection Methods based on Machine Learning

 DETECTION TECHNIQUES BASED ON MACHINE LEARNING


Machine learning

     Machine learning involves two algorithms: training and inference. The training algorithm uses data to learn features and optimizes the model's parameters against an objective function. In contrast, the inference algorithm takes previously unseen data as input and infers features similar to those learned from the training data. When the data points are unlabeled, the learning algorithm is called unsupervised learning. Deep learning, which is based on neural networks and can extract features in a black-box manner, has become the most popular approach to machine learning in recent years.
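The split between training and inference can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `train` and `infer`, the one-variable linear model, the mean squared error objective, and the toy data. It is not taken from any of the tools discussed in this article.

```python
def train(xs, ys, lr=0.05, epochs=2000):
    """Training: fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the objective function with respect to each parameter.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Update the parameters in the direction that lowers the objective.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(model, x):
    """Inference: apply the learned parameters to an input not seen in training."""
    w, b = model
    return w * x + b


# Toy data generated from y = 2x + 1 (illustrative only).
model = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
prediction = infer(model, 10.0)
```

Because the output is a continuous number, this toy model is also an instance of a regression task.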

     The aim of this study is to provide a method that learns from vulnerable smart contracts in order to find vulnerabilities in previously unseen smart contracts. A neural network is a machine learning model that represents on a computer the connections between neurons, the mechanism at work in the human brain. The model can approximate complex non-linear functions. Two well-known task settings for neural networks are regression and classification. A regression problem derives continuous numbers from input data, whereas a classification problem derives a class from input data.

     Eth2Vec's feature extraction, which uses a neural network for natural language processing, is categorized as a regression problem. Each input signal of a neural network has its own weight, and the weight controls how much of the signal's importance is propagated toward the output layer. In particular, the larger the weight of an input, the greater its impact on the network's output. An activation function determines how the sum of the input signals is activated. Support Vector Machine (SVM) is a linear binary classifier that separates input into two categories and often achieves high classification performance. Furthermore, when combined with the kernel method, SVM can also perform nonlinear classification.
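The role of weights and the activation function can be sketched for a single neuron. This is a minimal illustration, not Eth2Vec's actual network: the names `sigmoid` and `neuron_output`, the choice of a sigmoid activation, and the sample values are all assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """A common activation function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the input signals, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# The same input signal has more influence on the output when its weight is larger.
low_weight = neuron_output([1.0], [0.5])
high_weight = neuron_output([1.0], [2.0])
```

With a positive input, `high_weight` is larger than `low_weight`, which mirrors the statement that an input's impact grows with its weight.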

ContractWard

     ContractWard detects smart contract vulnerabilities at the opcode level by extracting bigram features from simplified opcodes and training an individual binary ML classifier for each vulnerability class. The method targets six vulnerabilities and experiments with Random Forest, K-Nearest Neighbors, SVM, AdaBoost, and XGBoost classifiers.

   ContractWard is highly efficient at detecting vulnerabilities in smart contracts, for two reasons. First, the opcodes are simplified, which reduces the number of features extracted with n-grams (n = 2). In other words, ContractWard's input data is shortened. Second, the essence of a supervised machine learning algorithm is to train an objective function that describes the mapping between the feature space and the samples' labels.
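Opcode simplification followed by bigram extraction can be sketched as follows. The simplification rule used here (collapsing the numbered PUSH, DUP, SWAP, and LOG families into one token each) is an assumption for illustration and may differ from ContractWard's actual rules. The names `simplify` and `bigram_counts` are likewise hypothetical.

```python
import re
from collections import Counter

# Assumed rule: PUSH1..PUSH32 -> PUSH, DUP1..DUP16 -> DUP, and so on.
_NUMBERED_FAMILY = re.compile(r"^(PUSH|DUP|SWAP|LOG)\d+$")


def simplify(opcode: str) -> str:
    """Collapse numbered opcode variants into a single token."""
    match = _NUMBERED_FAMILY.match(opcode)
    return match.group(1) if match else opcode


def bigram_counts(opcodes):
    """Count adjacent pairs (n = 2) in the simplified opcode sequence."""
    simplified = [simplify(op) for op in opcodes]
    return Counter(zip(simplified, simplified[1:]))


features = bigram_counts(["PUSH1", "PUSH1", "MSTORE", "PUSH2", "DUP1"])
```

Without simplification, `("PUSH1", "MSTORE")` and `("PUSH2", "MSTORE")` would be separate features. After simplification they share one bigram, which is why the feature space shrinks.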

   During training, ContractWard learns the parameters of the objective function through continuous iteration and updates. Using the parameters learned during training, ContractWard can therefore directly detect whether a new sample is vulnerable and which kinds of vulnerabilities it contains.

   There are two common tools for detecting flaws in smart contracts. Oyente is a tool that uses symbolic execution to detect contract vulnerabilities. During detection, it must explore all executable paths in a contract and iterate over loop bodies, which makes it time-consuming. Securify is another tool for detecting vulnerabilities. It extracts precise semantic facts from contract dependency graphs by symbolically analyzing them, and then matches these facts against compliance and violation patterns. Constructing dependency graphs and matching patterns takes time as well. The accuracy of ContractWard's vulnerability predictions depends on the reliability of the labels generated by Oyente.

LSTM-based

   The study proposes a sequence learning method to detect vulnerabilities in the opcode of smart contracts. In particular, it uses one-hot encoding and an embedding network to represent the contract's opcode. The resulting code vectors are used as input to train an LSTM model that decides whether a given smart contract is safe or vulnerable.
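One-hot encoding and the embedding step can be sketched in a few lines. This is an illustration only: the helper names, the alphabetical vocabulary ordering, and the embedding values are assumptions, and the LSTM itself is omitted.

```python
def build_vocab(opcodes):
    """Assign each distinct opcode an index (alphabetical order is an assumption)."""
    return {op: i for i, op in enumerate(sorted(set(opcodes)))}


def one_hot(index, size):
    """Vector of zeros with a single 1 at the opcode's index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def embed(one_hot_vec, embedding_matrix):
    """Multiply a one-hot vector by the embedding matrix (selects one row)."""
    dims = len(embedding_matrix[0])
    return [
        sum(one_hot_vec[i] * embedding_matrix[i][j] for i in range(len(one_hot_vec)))
        for j in range(dims)
    ]


sequence = ["PUSH", "PUSH", "MSTORE", "CALLVALUE"]
vocab = build_vocab(sequence)
encoded = [one_hot(vocab[op], len(vocab)) for op in sequence]

# Hypothetical 3 x 2 embedding matrix; in practice these values are learned.
matrix = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
code_vectors = [embed(vec, matrix) for vec in encoded]
```

The list `code_vectors` has one dense vector per opcode, which is the kind of sequence a recurrent model such as an LSTM would consume.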

AWD-LSTM-based

   A sequence-based multi-class classification scheme is introduced. The paper adapts the 'Average Stochastic Gradient Descent Weight-Dropped LSTM' (AWD-LSTM) for vulnerability detection. The proposed model consists of two parts: a pre-trained encoder for language tasks, and an LSTM-based classifier for vulnerability classification. The technique works at the opcode level and can recognize three vulnerability types.

Neural network

    To address these smart contract issues in Ethereum, novel methods that go beyond rule-based frameworks are being developed. One such approach represents a smart contract's source code as a contract graph built from the data and control dependencies between program statements. The graph's nodes represent critical function invocations or variables, while the edges represent their temporal execution traces. Because most GNNs are inherently flat during information propagation, an elimination phase is designed to normalize the graph. To handle the normalized graphs, GCN is extended to a degree-free GCN (DR-GCN).

    In addition, the approach considers the distinct roles and temporal relationships of the various program elements and proposes a novel temporal message propagation network (TMP). The authors report extensive experiments on over 300,000 real-world smart contract functions, with results showing that these approaches significantly outperform state-of-the-art methods in detecting several types of vulnerabilities, such as reentrancy, timestamp dependence, and infinite loop vulnerabilities. The implementations have been made public in order to facilitate future research.

CNN-based

    The method converts the contract bytecode into fixed-size RGB color images and trains a convolutional neural network for vulnerability detection. Like ESCORT, the CNN-based classifier uses multi-label classification, but it has a low confidence score when determining the exact vulnerability types. Compared with ESCORT, the CNN-based detection scheme has the following limitations: (i) The multi-label classification performance is unsatisfactory because of its low confidence level. One hypothesis is that the image representation of the bytecode and the CNN architecture disregard the sequential information present in the contract. (ii) The extensibility and generalization ability of the CNN-based detection method is neither discussed nor evaluated.
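A bytecode-to-image conversion can be sketched as below. The mapping is an assumption for illustration: three consecutive bytes become one RGB pixel, and the byte stream is truncated or zero-padded to fit the fixed size. The original method's exact layout and image dimensions are not given here, and `bytecode_to_rgb` is a hypothetical name.

```python
def bytecode_to_rgb(bytecode_hex, width=4, height=4):
    """Map contract bytecode to a fixed-size grid of (R, G, B) tuples."""
    hex_str = bytecode_hex[2:] if bytecode_hex.startswith("0x") else bytecode_hex
    data = bytes.fromhex(hex_str)
    needed = width * height * 3
    # Truncate long bytecode and zero-pad short bytecode (assumed policy).
    data = data[:needed].ljust(needed, b"\x00")
    pixels = [tuple(data[i:i + 3]) for i in range(0, needed, 3)]
    # Split the flat pixel list into rows of the requested width.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


image = bytecode_to_rgb("0x6080604052", width=2, height=2)
```

Reshaping the byte stream into rows places bytes that are far apart in the sequence next to each other vertically, which fits the concern above that an image representation can lose sequential information.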

GNN-based

   The study proposes a graph neural network (GNN)-based approach. Specifically, this work constructs a 'contract graph' from the contract's source code, where nodes and edges represent critical function calls/variables and temporal execution traces, respectively. The graph is normalized to highlight the important nodes and then passed to a temporal message propagation (TMP) network for vulnerability prediction.
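The idea of passing messages along edges in temporal order can be sketched in a heavily simplified form. This is not the TMP network itself: the additive update rule, the `Edge` class, the `propagate` function, and the node names and feature values are all assumptions made for illustration. A real model would use learned transformations instead of plain addition.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    src: str    # node the message comes from
    dst: str    # node the message is delivered to
    order: int  # position of this edge in the execution trace


def propagate(features, edges):
    """Visit edges in temporal order; each edge adds the source state to the target."""
    state = {node: list(vec) for node, vec in features.items()}
    for edge in sorted(edges, key=lambda e: e.order):
        state[edge.dst] = [d + s for d, s in zip(state[edge.dst], state[edge.src])]
    return state


# Hypothetical contract graph: a function, a critical call, and a variable.
features = {
    "withdraw": [1.0, 0.0],
    "call.value": [0.0, 1.0],
    "balance": [0.0, 0.0],
}
edges = [
    Edge("call.value", "balance", order=2),
    Edge("withdraw", "call.value", order=1),
]
final_state = propagate(features, edges)
```

Because the edges are processed by `order` and not by list position, the state of `balance` ends up reflecting both earlier nodes. That is the sense in which temporal ordering matters for propagation.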
