IDENTIFICATION OF CRACKS IN PIPELINES BASED ON MACHINE  
LEARNING AND DEEP LEARNING 
 
 
 
BY  
JINCHEN HE 
 
 
 
 
 
A THESIS  
PRESENTED TO LAKEHEAD UNIVERSITY 
 IN FULFILLMENT OF THE  
THEIS REQUIREMENT FOR THE DEGREE OF  
MASTER OF MECHANICAL ENGINEERING 
 
 
 
 
 
 
 
 
 
Lakehead University, Faculty of Mechanical Engineering 
Thunder Bay, Ontario, CANADA, January 2022 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Jinchen He, January 2022. 
All rights reserved. 
 
 
 
 
 
 
 
 
 
 
 
 
This thesis by Jinchen He is accepted in its present form by the Mechanical Engineering 
Department of Lakehead University as satisfying the thesis requirements for the degree of 
Bachelor of Engineering 
 
APPROVED BY 
 
SUPERVISOR   
Dr. Hao Bai     
Name Signature Date 
   
EXAMINER   
Dr. Wilson Wang   
Name Signature Date 
   
EXAMINER   
Dr. Jian Deng   
Name Signature Date 
   
 
 
 
 
 
 
 
 
Declaration 
I certify that I am the author of this project and that any assistance I received in its 
preparation is fully acknowledged and disclosed in the project. I have also cited any source from 
which I used data, ideas, or words, either quoted or directly paraphrased. I also certify that this 
current study was prepared by me specifically for this course. 
No portion of the work referred to in this study has been submitted in support of an application 
for another degree or qualification to this or any other university or institution of learning  
Jinchen He January, 2022 
Student Name Signature Date 
Abstract 
Pipelines are important long-distance transportation structures in modern industry, and because 
many are buried deep underground, pipeline health monitoring is critical to industry; however, 
inspecting underground pipelines can be quite challenging due to the large financial and human 
resources required. For decades, different methods have been used to assess pipeline cracks. 
Ultrasonic quantitative nondestructive testing (QNDT) is one of the frequently used methods in 
pipeline health monitoring. In the current study, the coefficients of the reflected and transmitted 
waves due to different incident waves were first generated by using a semi-analytical finite 
element method based on classical elasticity theory. In that study, different types of pipes, 
including different geometries and materials, were considered. Then four different regression 
machine learning algorithms and three deep learning algorithms were used to identify crack 
features. In this study, the prediction accuracy was compared between the different algorithms 
and different datasets. The objective was to find the algorithm with the highest prediction rate 
and to select a suitable dataset for prediction. It was found that the extremely randomized tree 
(ERT) algorithm was the best in identifying cracks in the pipeline. The prediction accuracy will 
be improved by selecting different data sets. In addition, all algorithms performed better in 
predicting the radial crack depth (CDRD) than predicting the circumferential crack width 
(CWCD). 
Keywords: Non-Destructive Testing, Ultrasonic, Wave Response Coefficient, Machine Learning, 
Deep Learning    
Acknowledgments 
My utmost gratitude goes to my thesis supervisor Dr. H. Bai, whose unlimited guidance and 
dedication have provided a lot of support. It is his continuous encouragement enabled me to 
complete my thesis research during the epidemic period. He taught me how to carry on the 
research as clearly as possible, and it was a great privilege to work and learn with him. 
Meanwhile, I would like to express my deep gratitude to Dr. Wilson Wang and Dr.Jian Deng for 
their valuable suggestions on my research topic. 
Finally, I would like to thank all the people who gave me support during the epidemic, especially 
my parents, whose unconditional love and care become my strength to facilitate my thesis 
completion. 
Jinchen He 
Lakehead University 
January 2022 
TABLE OF CONTENTS 
List of Table ............................................................................................................................................ 9 
List of Figure......................................................................................................................................... 10 
List of Abbreviations ............................................................................................................................. 11 
Chapter I Introduction ........................................................................................................................... 12 
1.1 Literature Review ........................................................................................................................ 13 
1.1.1 Applications in Mechanical Engineering ............................................................................... 13 
1.1.2 Applications in Bioengineering ............................................................................................. 15 
1.1.3 Applications in Medical Science............................................................................................ 15 
1.1.4 Applications in Physics ......................................................................................................... 16 
1.1.5 Applications in Other Domain ............................................................................................... 16 
1.2 Outline of the Thesis .................................................................................................................... 17 
1.3 Contributions ............................................................................................................................... 18 
Chapter II Methodology ........................................................................................................................ 19 
2.1 The Fundamentals of Waves in a Cylinder ................................................................................... 19 
2.1.1 Equations of Motion .............................................................................................................. 19 
2.1.2 Semi Analytical Finite Element ............................................................................................. 21 
2.2 Selection of Data Characteristics .................................................................................................. 24 
2.3 Combination of Machine Learning ............................................................................................... 25 
2.3.1 Determination of Machine Learning Algorithm ..................................................................... 25 
2.3.2 Principles of Machine Learning Algorithms .......................................................................... 25 
2.3.2.1 Support Vector Machine (SVM) Algorithm ......................................................... 25 
2.3.2.2 Random Forest Algorithm ................................................................................... 26 
2.3.2.3 Extremely Randomized Tree Algorithm .............................................................. 27 
2.3.2.4 K-Nearest Neighbors Algorithm .......................................................................... 27 
2.3.3 Selection of Hyperparameters in Machine Learning Code ...................................................... 28 
2.3.3.1 Support Vector Machines (SVM) ........................................................................ 28 
2.3.3.2 K-Nearest Neighbors (KNN) ............................................................................... 29 
2.3.3.3 Random Forest & Extremely Randomized Tree Algorithm .................................. 29 
2.4 Combination of Deep Learning .................................................................................................... 30 
2.4.1 Determination of Deep Learning Algorithm .......................................................................... 30 
2.4.2 Principles of Deep Learning Algorithms ................................................................................ 31 
2.4.2.1 Recurrent Neural Network (RNN) Algorithm Principles ...................................... 31 
2.4.2.2 Principle of Long Short-Term Memory (LSTM) .................................................. 32 
2.4.2.3 Principle of Gated Recurrent Unit (GRU) ............................................................ 35 
2.5 Selection of Hyperparameters for Deep Learning Code ................................................................ 36 
Chapter III Process and Results ............................................................................................................. 38 
3.1 Pre-Setting of Hyperparameters and Pipeline Parameters.............................................................. 38 
3.1.1 Settings of Pipeline Parameters ............................................................................................. 38 
3.1.2 Settings of the Machine Learning Hyperparameters ............................................................... 39 
3.1.3 Settings of Deep Learning Hyperparameters .......................................................................... 41 
3.2 Assessment Criteria ..................................................................................................................... 42 
3.2.1 Mean Absolute Error ............................................................................................................. 42 
3.2.2 Coefficient of Determination ................................................................................................. 42 
3.2.3 Root Mean Square Error ........................................................................................................ 43 
3.3 Prediction of Pipeline Crack Information ..................................................................................... 43 
3.3.1 Prediction Results of Crack Depth in Radial Direction (CDRD)............................................. 45 
3.3.2 Prediction Results of Crack Width in Circumferential Direction (CWCD) ............................. 47 
3.3.3 Results Comparison of CDRD and CWCD ............................................................................ 49 
3.3.4 Comparison of Machine Learning and Deep Learning ........................................................... 52 
3.4 Results & Discussion ................................................................................................................... 55 
Chapter IV Conclusion .......................................................................................................................... 56 
Chapter V Future Works ........................................................................................................................ 58 
Reference .............................................................................................................................................. 59 
Appendices............................................................................................................................................ 62 
Appendix A: Control group file about CDRD .................................................................................... 62 
Appendix B: Control group file about CWCD .................................................................................... 64 
Appendix C: CDRD prediction codes in Extremely Randomized Tree ............................................... 66 
Appendix D: Part of the CDRD prediction codes in Gated Recurrent Unit .......................................... 67 
 
 
List of Table 
Fig. 1 Cylindrical Coordinates ....................................................................................................... 19 
Fig. 2 Semi Analytical Finite Element in Cylinder.......................................................................... 21 
Fig. 3 Fortran Process .................................................................................................................... 24 
Fig. 4 Support Vector Classifier (left) and Support Vector Regression (right) ................................. 26 
Fig. 5 Random Forest Structure Diagram ....................................................................................... 27 
Fig. 6 K-Nearest Neighbors ........................................................................................................... 28 
Fig. 7 Example of Recurrent Neural Network (RNN) ..................................................................... 31 
Fig. 8 Basic structure of a standard recurrent neural network (RNN) .............................................. 32 
Fig. 9 Principle of Long-Short Term Memory (LSTM)................................................................... 33 
Fig. 10 Inner structure of a Long Short-Term Memory (LSTM) ..................................................... 34 
Fig. 11 Neural Network Figure of Long-Short Term Memory (LSTM) ........................................... 35 
Fig. 12 GRU cell............................................................................................................................ 36 
Fig. 13 CDRD Prediction Histogram .............................................................................................. 47 
Fig. 14 CWCD Prediction Histogram ............................................................................................. 48 
Fig. 15 Performance comparison histogram of CDRD and CWCD in Control Group ...................... 49 
Fig. 16 Performance comparison histogram of CDRD and CWCD in CFP Group ........................... 50 
Fig. 17 Performance comparison histogram of CDRD and CWCD in TOMR Group ...................... 50 
Fig. 18 Performance comparison histogram of CDRD and CWCD in ICWN Group ....................... 51 
Fig. 19 Performance comparison histogram of CDRD and CWCD in PM Group ............................ 51 
Fig. 20 Performance comparison histogram of CDRD and CWCD in Total Group ......................... 52 
Fig. 21 R2 comparison histogram ................................................................................................... 53 
Fig. 22 R2 comparison histogram ................................................................................................... 54 
Fig. 23 GRU loss curve in CDRD prediction .................................................................................. 54 
Fig. 24 GRU loss curve in CWCD prediction ................................................................................. 55 
 
 
 
 
9 
 
List of Figure 
Table 1 Setting Information about Fortran Code ............................................................................. 39 
Table 2 Hyperparameters of Machine Learning Algorithms............................................................ 40 
Table 3 Hyperparameters of Deep Learning Algorithms ................................................................. 41 
Table 4 Training Data Case ............................................................................................................ 44 
Table 5 Using different data types and different algorithms to predict CDRD ................................. 45 
Table 6 Standard Deviation of Different CDRD Data Types ........................................................... 46 
Table 7 Using different data types and different algorithms to predict CWCD ................................ 47 
Table 8 Standard Deviation of different CWCD data types ............................................................. 48 
Table 9 Control group file about CDRD ......................................................................................... 62 
Table 10 Control group file about CWCD ...................................................................................... 64 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
10 
 
List of Abbreviations 
Abbreviations Meaning 
CDRD Crack Depth in the Radial Direction 
CWCD Crack Width in the Circumferential Direction 
TOMR Thickness Over Mean Radius 
ICWN Input Circumferential Wave Number 
CFP Circular Frequency in Pipeline 
PM Pipeline Materials 
SVM Support Vector Machine 
KNN K-Nearest Neighbors 
RF Random Forest 
ERT Extremely Randomized Tree 
SRNN Simple Recurrent Neural Network 
LSTM Long Short-Term Memory 
GRU Gated Recurrent Unit 
MAE Mean Absolute Error 
R2 Coefficient of Determination 
CT Computed Tomography 
UT Ultrasonic Testing 
CCTV Closed-Circuit Television Testing  
HMM Hidden Markov Model 
NDT Non-Destructive Testing 
NN Neural Network 
NIR Near-Infrared Spectroscopy 
PCA Principal Component Analysis 
DCNN Deep Convolution Neural Network 
ANN Artificial Neural Network 
CNN Convolutional Neural Networks 
11 
 
Chapter I Introduction 
The rapid development of cities and industries relies heavily on the network of pipelines 
including multiple applications such as oil pipelines, natural gas transportation, and urban 
sewage pipeline system. A long-distance natural gas or crude oil pipeline can spread out 
typically more than 2,000 kilometers, causing a high probability of deterioration/damage 
occurring to the structure due to either natural disasters or human intervention. In particular, the 
lifetime of oil pipeline is a crucial part of a few countries’ economies, and a ruptured pipeline not 
only causes significant damage to the economy but also can destroy the ecosystem and 
environment. For instance, the largest marine oil spill in human history occurred in the Gulf of 
Mexico on April 20, 2010, resulted in the loss of 11 human lives and $1 billion to the British 
Petroleum Company. It negatively impacted the fishing economy of Louisiana in the United 
States. In addition, crude oil pollution caused serious damage to the ecological systems, led by 
some vulnerable species in extinction. 
Therefore, the operation process of such long-distance and large-diameter-based pipelines 
requires to conduct regular health monitoring in order to safeguard pipelines from incurred 
damages, namely corrosion and rust, stress deformation, and welding defects. This article 
primarily outlines the optimization of crack/defect detection methods for pipelines. Among them, 
methods such as radio-graphic flaw detection, Computed Tomography (CT), Ultrasonic Testing 
(UT) technology, and Closed-Circuit Television Testing (CCTV) methods are commonly used. 
However, deep buried pipelines consisting of oil or liquid water are hard to be inspected, for 
instance, Closed-Circuit Television Testing (CCTV) is particularly inconvenient in such cases 
due to the limitation of a robotic car not to enter inside the pipeline and take pictures. Conversely, 
the use of ultrasonic testing technology can meet most testing needs in practice. The principle of 
ultrasonic detection technology mainly uses the characteristics of ultrasonic waves propagating 
along the pipeline, which are reflected back from the edge of the crack interface.  This 
information is used to detect and inspect pipeline defects, but this technology relies on 
specialized equipment such as transducers and proficient experts to classify and judge the images 
generated by the instrument. Relying on humans to qualitatively evaluate images is undoubtedly 
very inefficient compared to using neural networks. Many neural networks primarily utilize the 
whole image or values generated by the ultrasound machine as the basis for neural network 
12 
 
training. However, there is not sufficient data for training. The error in the classification of 
defects or prediction results is relatively large in this instance. Therefore, in this work, the wave 
response coefficients generated by Fortran software as a dataset are used and combined with a 
variety of machine learning and deep learning algorithms for comparison and discussion. This 
strategy can generate a large amount of data while reducing labor costs. 
 
1.1 Literature Review 
1.1.1 Applications in Mechanical Engineering  
In 2008, M. Wolff [1] conducted a health examination using the acoustic structure of the 
components of an aircraft. The aluminum plates and B-CFRP plates with cracks were arranged 
into two groups i.e., A and B for experimental comparison. The transducer on the aluminum 
plate and B-CFRP plate were arranged in a circular form and matrix form, respectively. A 
Hidden Markov Model (HMM) and Support Vector Machine (SVM) model were used as 
statistical classifiers to classify the plates with or without cracks. They observed that a statistical 
classifier cannot accurately locate the crack position, while a high degree of accuracy in the 
classification of crack is achieved. In terms of classification accuracy, the support vector 
machine (SVM) model was higher than the Hidden Markov Model (HMM), and its classification 
accuracy of isotropic materials was higher than that of composites. 
In 2008, Chengjun Jiang [2] reported detection technologies of pipelines including radiographic 
inspection, Ultrasonic Testing (UT), metal magnetic test, etc., and explained that Ultrasonic 
Testing (UT) technology was better than other technologies in detecting plane defects in any 
direction of the material.  which was this study’s basis for choosing Ultrasonic Testing in 
combination with machine learning.  
In 2008, Carvalho [3] used radiography, manual detection, and automatic acoustic technology to 
classify three types of industrial piping defects, namely lack of penetration, lack of fusion, and 
undercut. Among them, the radiographic inspection technology adopted γ-ray and X-ray, while 
automatic scanning was performed by an inspection vehicle with magnetic wheels. By using 
scanned data, the defect size was estimated by detecting the discontinuity of the weld combining 
MATLAB. The results showed that Ultrasonic Testing (UT) technology had obvious advantages 
13 
 
over other technologies. The Carvalho group attempted to utilize Artificial Neural Network 
(ANN) method to classify defects. After the ultrasonic signal was preprocessed and smoothed, it 
was used as a featured input. Results showed that Artificial Neural Network (ANN) cannot 
classify different types of defects but could examine the existence of defects. 
In 2009, Caiping Zhao [4] developed a diagnostic system for detecting defects using pipeline 
ultrasonic guided wave test data. This system selected the eigenvalues such as the amplitude of 
the reflected signal to demonstrate the features and defects of the pipeline structure in the 
detection diagram. Fifteen classes of defects were classified by the Back Propagation (BP) 
algorithm, and the recognition rate of elbows and welding joints were the highest and reached 
more than 90%, which proved that the recognition rate of the neural network was high with 
stable results. 
In 2019, Roberto [5] used Ultrasonic Testing (UT) flaw tracker to collect multiple sets of 
ultrasonic data including the length, depth, and location of cracks. After professionals classified 
and trained the data using machine learning, resulted in pinpoint accuracy of Support Vector 
Machine (SVM) to classify various classes of defects. 
In 2019, Tripathi [6] classified microdamage using a piezoelectric ceramics sample. In general, it 
is difficult for experts to classify the damaged value in material with smaller than 100 µm cracks 
in depth. Through machine learning and deep learning, the K-Nearest Neighbor (KNN) of 
machine learning was very suitable to classify micro-damaged ceramic plates in the counterpart 
of Convolutional Neural Networks (CNN) deep learning. Therefore, the deep learning techniques 
cannot offer specific advantages over simple machine learning algorithms. In terms of data 
feature selection, the features of frequency domain were found better than that in time domain. 
In 2020, A. Mardanshahi [7] proposed a new model to automatically detect and classify the crack 
density of composite materials using guided wave propagation and artificial intelligence. They 
used Non-Destructive Testing (NDT) program via the antisymmetric Lamb wave to test samples 
with different crack densities, and then extracted information such as the amplitude and wave 
speed of the Lamb wave from the collected signals. After training the set, the classification 
accuracy of the Support Vector Machine (SVM) showed the highest performance of 91.7%, 
while the classification accuracy of the Neural Network (NN) reached a maximum of 88%. 
14 
 
 
1.1.2 Applications in Bioengineering 
Wort is the liquid extracted from the mashing process during the brewing of liquor. In 2019, Fan 
Zhang [8] determined the wort production quality of beer by combining Non-Destructive Testing 
(NDT) technology with machine learning. Since the production data was high-dimensional, a 
dimensionality reduction method to obtain a concise latent space was adapted, which was further 
used for data analysis to control the wort production quality. In this experiment, Near-Infrared 
Spectroscopy (NIR) technology was used primarily to collect production data, and the obtained 
data was combined with machine learning methods such as Principal Component Analysis (PCA) 
to analyses wort production quality. This group successfully demonstrated the use of low-
dimensional data to represent high-dimensional data. 
In 2020, Te Ma [9] used near-infrared hyperspectral imaging combined with deep learning to 
predict seed viability. The experimental data were related to the internal molecular vibration 
information (chemical composition difference) and spatial distribution of seeds. The Principal 
Component Analysis (PCA) method and Support Vector Machine (SVM) method were used for 
training. The results showed that even the naturally aged seed test set could produce about 90% 
accuracy in classification as compared to the normal seed test set with a classification accuracy 
of nearly 95%, which proved the reliability of these two methods for predicting seed viability. 
 
1.1.3 Applications in Medical Science 
In 2017, Burlina [10] used ultrasound imaging combined with machine learning and deep learning 
for the diagnosis of muscle inflammation. Eighty subjects in this experiment were divided into 
three groups of patients with different muscle inflammation and one group of healthy individuals 
was considered for reference. In terms of machine learning methods, the echo intensity of muscle 
and fat was used as the characteristics for training by using the random forest method. 
Meanwhile, Deep Convolution Neural Network (DCNN) used to train ultrasound images. The 
results indicated that predictions performance made by Deep Convolution Neural Network 
(DCNN) method was higher than machine learning method random forests. 
15 
 
In 2020, Zhengsi Xiong [11] performed an Non-Destructive Testing (NDT) of liver cancer with a 
combination of machine learning methods. The experiment was designed to collect breath data 
from healthy subjects and liver cancer patients, and dimensionality reduction processing on the 
generated data was performed. Data such as sensor temperature and relative humidity were 
selected as features, and then machine learning, the Support Vector Machine (SVM) algorithm, 
was applied to classify the data. The Support Vector Machine (SVM) algorithm with a linear 
kernel function showed the best classification effect. 
 
1.1.4 Applications in Physics 
In 2018, William Sorteberg [12] research group created a data set to simulate wave motion using 
the Long Short-Term Memory (LSTM) method in order to build a predictive deep neural 
network with three main modules. After testing the test set, the structural similarity index 
decreased during longer-term predictions, and the neural network can predict the future 
information up to 80-time steps only. 
In 2019, Rautela M [13] used high-frequency tone-burst signals as the excitation waveguide in the 
experiment and the time domain as the spatial feature input in the deep learning framework. 
They used Convolution Neuron Network (CNN), Recurrent Neural Network (RNN), and Long 
Short-Term Memory (LSTM) to detect the signal to simulate cracks in the waveguide. This 
article supports the evidence that a deep learning framework can provide perfect binary 
classification. The significance of this study highlights the fact that deep learning algorithms are 
promising tool for learning guided wave data sets. 
In 2019, Yohei [14] combined Convolution Neuron Network (CNN) with sensors to create a new 
wave-front sensor. The principle was that the image receiver used the deep learning method to 
estimate directly the Zernike coefficient by preprocessing the measured value of the intensity of 
a single light source, indicated that data preprocessing can improve the accuracy of the 
wavefront prediction through deep learning. Furthermore, Zernike polynomials were used to 
describe the properties of the wavefront. 
 
1.1.5 Applications in Other Domain 
16 
 
In 2017, Pingping Zhu [15] used deep learning to recognize and classify targets in underwater 
sonar images and utilized Convolution Neuron Network (CNN) to extract image features, and 
then a Support Vector Machine (SVM) was used for classification after extraction. They 
confirmed that the combination of Support Vector Machine (SVM) with Convolution Neuron 
Network (CNN) showed the best effect on the classification of underwater sonar images as 
compared to the combination of Support Vector Machine (SVM) with local binary pattern and 
Support Vector Machine (SVM) with the histogram of oriented gradients. 
In 2018, Tomasz [16] developed a hybrid Computed Tomography (CT) scanner with special 
sensors for designing humidity analysis. The scanner was used to generate data for training, 
which combined with the neural network method to create the wall humidity images, leading to 
reflect the humidity inside the wall. They demonstrated that the estimation of wall humidity in 
combination with neural networks is better than traditional least angle regression or ElasticNet 
methods. 
 
1.2 Outline of the Thesis  
(1) Generate datasets using Fortran program based on the theory of elasticity  
(2) Use Machine Learning and Deep Learning to predict crack information. 
• Machine Learning Algorithms:  
o Support Vector Machine (SVM)  
o K-Nearest Neighbors (KNN) 
o Random Forest (RF) 
o Extremely Randomized Tree (ERF) 
• Deep Learning Algorithms:  
o Simple Recurrent Neural Network (SRNN) 
o Long-Short Term Memory (LSTM) 
o Gated Recurrent Unit (GRU) 
17 
 
(3) Introduce different data types and compare the training results 
(4) Create a table of results to reflect the predictive performance between different method. 
(5) Compare the prediction performance of Machine Learning and Deep Learning. 
(6) Analysis and discuss the output results.  
 
1.3 Contributions  
Firstly, the current study identified a suitable algorithm for predicting crack defects in pipelines, 
that is because many conclusions on the ideal prediction method is not uniform, to discover an 
algorithm that is optimal for pipeline crack prediction was the goal of this study. 
Secondly, the wave response coefficients are used as features to predict pipelines’ cracks. The 
coefficients be got are differ from the features about wave velocity or wave form amplitude, that 
is because the wave response coefficients are frequency domain signals which transferred form 
time domain signals by Fourier transform. 
Thirdly, the using of the suggested data type and features of prediction cracks is reliable and 
accurate. Because varies data sets can be departed into different data types, and each data set has 
different prediction performance, that is also the same as features. Thus, to improve the 
prediction accuracy, finding an optimal data type and feature are needed. 
 
 
 
 
 
 
 
 
18 
 
Chapter II Methodology 
2.1 The Fundamentals of Waves in a Cylinder 
This chapter will briefly cover wave propagation in a cylinder using the formula and theory 
described by Datta and Shah [17]. First, the fundamental equations of wave propagation in a 
cylinder are reviewed. After obtaining the wave solution via the semi-finite element approach, 
the wave solution is used to solve the wave reflection and transmission problems. 
 
2.1.1 Equations of Motion  
 
 
Fig. 1 Cylindrical Coordinates 
 
Cylindrical coordinates are used to solve the wave propagation problem with each point of three 
displacement components:  
① ur displacement component in the radial direction. 
② uθ displacement component in the circumferential direction.  
③ uz displacement component along the vertical axis.  
The equations of motion expressed in terms of stresses are:  
 ∂σrr 1 ∂σrθ ∂σrz 1
 (2.1) 
+ + + (σ − σ ) = ρü
∂r r ∂θ ∂z r rr θθ r
19 
 
 ∂σrθ 1 ∂σθθ ∂σθz 2 (2.2) 
+ + + σ
∂r r ∂θ ∂z r rθ = ρüθ 
 ∂σrz 1 ∂σθz ∂σzz 1
 (2.3) 
+ + + σ
∂r r ∂θ ∂z r rz = ρüz
Here σij is the stress components, i, j = r, θ, z. ρ is mass density. Double dots represent the partial 
derivative with respect to time. The relationship between strain and displacement is given by: 
 e = (Lr + Lθ + Lz)u (2.4) 
Here, u = (ur, uθ, uz)T is displacement vector and Lr, Lθ, and Lz represent the partial derivatives 
of r, θ, and z, respectively, and are given by: 
 ∂ (2.5) 
 0 0  0 0 0 0 0 0
 ∂r   1 ∂   0 0 0  
 1   0 0    
0 0 r ∂θ ∂
 r     0 0  
 0 0 0 ∂z
0 0 0
L =      
r    L =  1 ∂
0 0 0 θ 0 0  ,   Lz =  ∂  
  0 0  
∂  r ∂θ  ∂z  
 0 0   0 0 0   ∂  
 ∂r  1 ∂   0 0  
 ∂ 1  0 0 ∂z
0 − 0 [r ∂θ ] [
[ ] 0 0 0 ]
∂r r
According to Hooke's law, the relation expression is listed as follow: 
 σ c c c c c c
rr 11 12 13 14 15 16 err (2.6) 
   c c c c  
σθθ 22 23 24 25 c26  e  
  θθ
 σ    
zz c
 33 c34 c35 c36 ezz
 σ
=
θz  c  
44 c45 c46  γθz 
 σzr  sym c55 c56  γzr 
{σrθ} [ c {γ }
66] rθ
Here, eii  is the normal strain components, i = r, θ, z.  and  γij  is the engineering shear strain 
component.  i, j = r, θ, z, i ≠ j. The vector on the left side of the equation represents six distinct 
stress components, whereas six distinct strain components are represented in the vector on the 
right. 
The boundary conditions are that the inner and outer surfaces are traction free. 
 
20 
 
2.1.2 Semi Analytical Finite Element  
As shown in Figure 2, the section of pipeline is divided into N parts. The black circles are one of 
the finite elements, Rk is the inner diameter of the ring, Rk+1 refers to the outer diameter of the 
ring, hk stands for the thickness of the ring, H is the thickness of the entire pipeline, kth sublayer 
refers to the kth ring. 
 
 
Fig. 2 Semi Analytical Finite Element in Cylinder 
 
Divide the composite cylinder into several coaxial cylinders and used a quadratic polynomial 
interpolation function to represent the displacement distribution on the thickness of the sublayer 
in the radial direction. In the kth sublayer, the displacement component of a certain point is as 
follows: 
 {U} = [N(r)]{q} (2.7) 
where,  
 {U} = 〈ũ ṽ w̃〉T (2.8) 
~ ~ ~ ~ ~ ~ ~ ~ ~
{q} = <ubk  v
b
k  w
b
k  u
m m
k  vk  w
m
k  u
f
k v
f f T
k wk>  
{q} represents the displacement of three nodes in a unit, each with three component vectors, for 
a total of 9 vectors in {q}. Here b, m, and f describe the first, middle, and last points, respectively.  
21 
 
~ ~ ~
u, v , w indicate the three displacement components, r indicates radius. [N(r)] represents the 
function of the interpolation matrix. The interpolation polynomials ni (i=1,2,3) are quadratic 
functions of the radial variable defined as:  
 n1 = 1 − 3η + 2η
2 (2.9) 
n = 4η − 4η22  
n3 = −η + 2η
2 
Where (r−r
η = k) , hk being the thickness of the sublayer, and rk being the radial coordinate of the 
hk
inner surface of the kth sublayer. 
The final finite element equation has the form:  
 (−k2[K1] − ik[K2] − [K3] + ω
2[M]){Q0} = 0 (2.10) 
Here, K1, K2, and K3 are all stiffness matrices; M is the mass matrix; the node displacement 
vector Q has the wave form solutio0n as:  
 {Q} = {Q }ei(mθ+kz−ωt)0  (2.11) 
Here, {Q0}is the amplitude of the wave solution; m is circumferential wave number, m= 0, ±1, 
±2 …; k is axial wave number.  
The matrices [M], [K1], [K2], and [K3] are defined below.  
 H (2.12) 
[M] = ∫ ρ [N]T[N]rdr 
0
H
[K1] = ∫ [b]T [C][b]rdr 
0
H
[K2] = ∫ [b]T [C][a] − [a][C][b]rdr 
0
H
[K ] = ∫ [a]T2 [C][a]rdr 
0
The nonzero elements of the 6 × 9 matrix [a] are as follows: 
22 
 
 dn1 dn2 dn3 (2.13) 
a(1,1) = , a(1,4) = , a(1,7) =  
dr dr dr
n n n
a(
1
2,1) = , a(2,2)
1 2
= im , a(2,4) =  
r r r
n2 n3 n
( 3
a 2,5) = im , a(2,7) = , a(2,8) = im  
r r r
n1 n2 n3
a(4,3) = im , a(4,6) = im , a(4,9) = im  
r r r
dn1 dn2 dn
a(5,3) ( ) ( 3
= , a 5,6 = , a 5,9) =  
dr dr dr
n1 dn1 n1 n2
a(6,1) = im , a(6,2) = − , a(6,4) = im  
r dr r r
dn n n dn n
a(
2 2
6,5) = − , a(
3
6,7) = im , a(6,8)
3 3
= −  
dr r r dr r
The nonzero elements of [b] are: 
 b(3,3) = b(4,2) = b(5,1) = n1 (2.14) 
b(3,6) = b(4,5) = b(5,4) = n2 
b(3,9) = b(4,8) = b(5,7) = n3 
It is noted that [K1] and [M] are real and symmetric, [K2] is skew-Hermitian, and [K3] is 
Hermitian. 
The solutions of equation (2.10) give the wave number and the corresponding wave modes. For a 
given incident wave, the wave coefficients of the response will be determined via superposition 
of wave modes. The method is depicted in Figure 3 below: 
 
23 
 
 
Fig. 3 Fortran Process 
 
2.2 Selection of Data Characteristics 
The reported studies used structural response data (for example, velocity and acceleration) 
collected from experiments as the training data to train the neural networks in pipeline 
nondestructive testing; however, to the best knowledge of the author, there are no published 
literatures for directly using wave coefficients in the study. Additionally, the amount of 
experimental data is limited, which significantly impacts the classification results' correctness. 
The distinction between this study and others is that this study use wave response coefficients as 
the primary features of the data, which are used for training in machine learning and deep 
learning, and making a prediction on the studied pipeline’s crack parameters. After training the 
model, the fracture information may be predicted using the input measurement's wave response 
coefficient. 
Here is a summarization of  the features will be  used in the study:  
• Transmission Coefficient 
• Reflection Coefficient 
• Circular Frequency 
• Circumferential Wave Number 
• Thickness Over Mean Radius 
• Materials 
24 
 
 
2.3 Combination of Machine Learning 
Machine learning is a subfield of artificial intelligence and computer science that focuses on 
replicating how humans learn and steadily improving their accuracy via data and algorithms. It is 
not a computer-specific algorithm but a collective name for various algorithms, of which Deep 
Learning is one. Machine Learning's fundamental strategy is to solve real-world problems by 
abstracting them into mathematical models and utilizing machines to solve these mathematical 
problems, ultimately resolving the real-world problem. In this study’s example, by utilizing a 
Machine Learning method, this study extracted information about pipeline breaks from a variety 
of datasets. 
 
2.3.1 Determination of Machine Learning Algorithm 
Since the current study involves anticipating pipeline fracture information, a machine learning 
technique with a regression model was required. According to Nerseen's analysis of machine 
learning models for time series prediction [18], the machine learning methods for regression 
models are Support Vector Machine (SVM), Random Forest (RF), Extremely Randomized Tree 
(ERT), and K-Nearest Neighbors (KNN). 
 
2.3.2 Principles of Machine Learning Algorithms 
2.3.2.1 Support Vector Machine (SVM) Algorithm 
Support Vector Machine (SVM) can be classified into two types: the Support Vector 
Classification (SVC) algorithm, which is appropriate for classification issues and datasets, and 
the Support Vector Regression (SVR) algorithm, which is appropriate for regression problems 
and datasets (shown in Figure 4). SVC denotes the capacity to maximize the distance between 
the nearest sample points in the hyperplane; SVR, on the other hand, denotes the ability to 
minimize the distance between the farthest sample points in the hyperplane. In this experiment, 
the Support Vector Regression is employed, the red and blue points represent the data set used.  
25 
 
 
 
Fig. 4 Support Vector Classifier (left) and Support Vector Regression (right) 
 
Here, w · x + b = ±1 and w · x + b = ±ε are boundary lines on both sides of SVC and SVR, 
respectively. w · x + b = 0 represents the hyperplane. 2  is the distance between the two dashed 
|w|
lines. w is the normal vector which decide the direction of hyperplane, and b is the intercept 
decide the distance between hyperplane and the origin. This study are using (w, b) to represent 
this hyperplane. ε is the element of relaxation. The greater the value of ε, the more closely the 
sample point approaches the hyperplane. 
The prediction function of SVR is to make the loss of all data points within the margin boundary 
equals to 0, and the points outside the margin boundary are the support vectors of SVR. All this 
study needs to do is to ignore the points within the boundary lines and regress the remaining 
points. Due to the high dimension of this study’s data sets, utilizing the kernel function to 
transfer these samples to a higher dimension before doing regression is needed. 
 
2.3.2.2 Random Forest Algorithm 
Random Forest (RF) is built of several decision trees. Besides, there is no connection between 
the various decision trees. Random Forest (RF) is a kind of random sampling or random 
26 
 
selection of features that may help avoid overfitting. When this study feed data sets into Random 
Forest (RF), each decision tree is predicting independently. Each decision tree will give a 
prediction value. The final prediction result is calculated as the mean of all those decision tree’s 
predictions. 
 
 
Fig. 5 Random Forest Structure Diagram 
 
2.3.2.3 Extremely Randomized Tree Algorithm 
In contrast to the conventional Random Forest (RF), which selects the best segmentation, an 
Extremely Randomized Tree (ERF) is a version of the Random Forest (RF) algorithm that 
randomly selects the segmentation. This results in a higher degree of generalization and a shorter 
calculation time when compared to Random Forest (RF). As a result, Extremely Randomized 
Tree (ERF) frequently outperforms Random Forest (RF) in prediction accuracy. 
 
2.3.2.4 K-Nearest Neighbors Algorithm 
The value of the anticipated point is determined in Figure 6 by averaging the values of the K 
points nearest to it. In this case, the "closest distance" could be the Euclidean or another distance. 
For instance, suppose K equals three, and the point value in this study wish to predict is 
dependent on the three nearest red points. Each red point in the graphic corresponds to a 
particular set of data sets this study used, while the green dots correspond to the anticipated 
values. When the KNN solves a regression model problem, the average algorithm is typically 
employed, which means that the regression prediction value is calculated using the average value 
of the sample output from the nearest K samples. 
27 
 
 
 
Fig. 6 K-Nearest Neighbors 
 
2.3.3 Selection of Hyperparameters in Machine Learning Code 
Numerous parameters must be modified in machine learning programs. These parameters affect 
the algorithm's prediction performance. The optimal parameter combination for the relevant 
algorithm can be determined through repeated training datasets. 
 
2.3.3.1 Support Vector Machines (SVM) 
The Support Vector Machine (SVM) code mainly uses parameters C and gamma in this 
experiment: 
• C: the penalty coefficient; the higher the value of C is, the easier it is to overfit the data. 
Conversely, underfitting can occur with smaller values of C. 
• Gamma: a default parameter when the RBF function is chosen as the kernel. After 
mapping to the new feature space, it implicitly defines the data distribution. The greater 
the gamma value, the fewer support vectors there are; the smaller the gamma value, the 
more support vectors there are. The number of support vectors has an effect on the 
training and prediction speeds. 
28 
 
 
2.3.3.2 K-Nearest Neighbors (KNN) 
There are three commonly used hyperparameters for KNN, which are as follows:  
• K: indicates the number of 'neighbors,' the default value is three, which means sending 
the three closest samples. In brief, a low K number implies that the entire model is 
complex and prone to overfitting; on the other side, a high K value shows that the error is 
large and the prediction accuracy is low. 
• Weight: Mainly used to return results. The default setting is uniform. 
• p: indicates the variable in Minkowski distance formula. Minkowski Distance is a 
generalization of Euclidean distance, which is a general expression of multiple distance 
measurement formulas. When p = 2, this study obtain the Euclidean distance. 
 
2.3.3.3 Random Forest & Extremely Randomized Tree Algorithm 
Random Forest and Extremely Randomized Tree parameters need to be adjusted to ensure 
consistency. The parameters that must be altered are divided into two sections. The first section 
contains parameters for the Bagging framework, while the second section contains the CART 
Decision Tree parameters. 
Partial parameters of Bagging framework: 
• n_estimators: indicates the maximum number of iterations of the weak learner. If the 
value of N_ESTIMators is small, underfitting is likely to occur. Conversely, if the 
n_ESTIMators value is high and easy to overfit, the default value is 100. 
• oob_score: indicates whether to choose to use out-of-bag samples to evaluate the quality 
of the model. Its default value is False. 
Partial parameters of the CART Decision Tree:  
• max_features: represents the maximum number of features in the random forest after 
partitioning. The default option is "None," which implies that all feature numbers are 
29 
 
evaluated when dividing. The total number of features to consider is indicated by 
specifying this feature as an integer. The value of max features must be increased because 
the wave response coefficient generates a significant number of reflected and transmitted 
waves (more than 50). 
• min_samples_split: specifies the minimal number of samples required to subdivide 
internal nodes. This parameter limits the conditions under which the subtree may 
continue to be divided, and the default value is 2. 
 
2.4 Combination of Deep Learning 
2.4.1 Determination of Deep Learning Algorithm 
Deep learning emerged as a technique for optimizing artificial neural networks using 
backpropagation. It was first primarily represented by a Multilayer Perceptron. However, due to 
the issue of vanishing or exploding gradient during the training phase, no significant advances in 
neural network research have been made. Deep learning has advanced significantly in recent 
years, owing to the availability and utilization of large amounts of data and the rapid increase in 
the computational power of computers. In comparison to Convolutional Neural Networks (CNN), 
Artificial Neural Networks (ANN), and other methods, Recurrent Neural Networks (RNN) are 
better at dealing with time series problems and emphasizing sequence order, whereas CNN is 
better at processing spatial windows. ANNs are a collection of multi-layer neurons, usually 
referred to as feedforward neural networks, that are primarily used to solve tabular, text, and 
picture data problems. A cyclic neural network is generated to manage sequential processing jobs 
such as time-series data, literal expressions, etc. An RNN's structure includes a memory function, 
which enables it to analyze sequential input with dependencies and has demonstrated exceptional 
performance in various natural language processing applications. In 2018, I. Jahan and S. Z. 
Sajal [19] forecasted the stock price using RNN. Long Short-Term Memory (LSTM) is an RNN 
version that outperforms the normal RNN in many applications. F. Altché [20] employed LSTM 
in 2017 to forecast the trajectory of highway traffic. The LSTM algorithm avoids the vanishing 
gradient and has a more significant memory, but the recirculating network is effectively. The 
30 
 
Gated Recurrent Unit (GRU) is a newer generation of RNN, and it is also pretty similar to LSTM, 
because of GRU has less operations, so that it is little speedier than LSTM.  
Determining Simple Recurrent Neuron Network (RNN), Long-Short Term Memory (LSTM), 
and Gaited Recurrent Unit (GRU) as deep learning algorithms.  
 
2.4.2 Principles of Deep Learning Algorithms 
2.4.2.1 Recurrent Neural Network (RNN) Algorithm Principles 
Human reading habits are comparable to recurrent neural networks (RNN). The RNN 
accumulates information after reading using the state vector ht, just like the brain does each time 
a human reads a word. It is distinct from the typical neural network model in that it has recurrent 
connection between previous output to current input. Figure 7 illustrates the structure of an RNN. 
 
 
Fig. 7 Example of Recurrent Neural Network (RNN) 
 
Here, the figures' circles indicate neurons, while the various colors reflect various time. As can 
be seen, the RNN's hidden layer at time T contains information from the preceding time T-1.  
31 
 
The primary distinction between RNN and other algorithms such as Artificial Neural Network 
(ANN) or Convolutional Neural Network (CNN) is that the weight connections between the 
neurons in the layers are established in such a way that the output at each moment is related to 
the current input and the previous output. The following diagram illustrates structure of  RNN: 
 
 
Fig. 8 Basic structure of a standard recurrent neural network (RNN) 
 
Data will be preprocessed into vector xt first, then input into the hidden layer S to update the 
state vector through the parameter matrix W, and finally output ht to store the state information. 
In this process, the parameter matrix W of the whole RNN chain remains constant. The hidden 
layer state St of the current time node is related to the current time t input xt and the hidden layer 
state St-1 of the previous time. 
 st = φ(Wxt + Ust − 1) (2.11) 
Here U and V are weight matrices, φ is a logical S-shaped function or hyperbolic tangent 
function, and W is the circulant weight matrix of state transition. 
It is worth noting that Figure 8 does not mean that the RNN has only three neural networks, but it 
means that the same neural network is used three times at three different time points. 
 
2.4.2.2 Principle of Long Short-Term Memory (LSTM) 
32 
 
The long short-term memory (LSTM) model was developed by Hochreiter and Schmidhuber [21]. 
It is a variant of the recurrent neural network. Proposed in 1997, the LSTM may be used to 
analyze data with distant nodes in a time series and efficiently capture information about 
significant time nodes in previous time series to make more precise inferences about the current 
moment's content. Compared to the more straightforward cyclic design, which is more reliable 
for long-term learning, the LSTM has demonstrated superior performance in Automatic Speech 
Recognition (ASR), machine translation, image description creation, and other applications. 
LSTM solves problems with the disappearing gradient and exploding gradient of standard RNNs 
in various tasks. LSTM networks converge more easily than RNN networks, which has allowed 
them to gradually supplant RNN as the favored model for sequential task processing. Figure 9 
illustrates its interior structure. 
 
 
Fig. 9 Principle of Long-Short Term Memory (LSTM) 
 
The fundamental concept of LSTM is to interact with the Cell State via "three gates" and modify 
the information held by the Cell, which are the Input Gate, Forget Gate, and Output Gate. A Cell 
State is analogous to a conveyor belt that runs parallel to the chain, with very tiny interactions 
33 
 
that allow information to flow freely. Additionally, LSTM can add or delete the Cell States, 
which is regulated by a gate structure and is a mechanism to allow information to pass through 
selectively. They are composed of a layer of Sigmoid neural networks and a multiplication 
operation at the element level. The output values of the sigmoid layer determine whether the 
corresponding portion of the information is passed. Three gates protect and govern the Cell State 
in an LSTM. Thus, the LSTM has four inputs that ultimately result in a single output. Its internal 
structural diagram, depicted in Figure 10, can be simplified. 
 
 
Fig. 10 Inner structure of a Long Short-Term Memory (LSTM) 
 
The black dot denotes the elementwise multiplication of two vectors, while f denotes the sigmoid 
activation function, which regulates the value between 0 and 1. Its value indicates the state of the 
gate's opening. By definition, when the value is 0, the gate is closed. Notably, the Hyperbolic 
Tangent Function is used as the activation function in g and h. Z is an external input, and the 
34 
 
inputs of the three gates are represented by Zi, Zf, and Zo. The c in the middle is short for the 
memory cell and the output is represented by a. The whole process is input Z and Zi, which are 
transformed into f(zi) and g(z) through the activation function, and g(z)f(zi) is obtained after 
elementwise multiplication. Meanwhile, f(zf) is obtained through the activation function, and 
cf(zf) is obtained by multiplying the c value previously stored by f(zf). By addition, the updated 
memory cell value c′ can be expressed as follows: 
 c′ = g(z)f(zi) + cf(zf) (2.15) 
h(c′) is then obtained through activation function h. It is then multiplied by f(zo), generated by 
the output gate to obtain the output a. The formula of a is as follows: 
 a = h(c′)f(zo) (2.16) 
In neural networks, a neuron is represented by an LSTM. In general, multiple LSTMS are used. 
Only two LSTMS are shown here in Figure 11. 
 
 
Fig. 11 Neural Network Figure of Long-Short Term Memory (LSTM) 
 
2.4.2.3 Principle of Gated Recurrent Unit (GRU) 
35 
 
In fact, the Gated Recurrent Unit (GRU) model is a variant of the LSTM model. Figure 12 shows 
the GRU hidden layer cell.  
 
 
Fig. 12 GRU cell 
 
According to Jeffrey [22], here, ht-1 is previous output, xt and ht is current input and output. ‘×’, ‘+’ 
and ‘-1’ are logical operator, means multiply, plus and minus one. 
Compared to the LSTM, the structure of GRU doesn’t has cell state, which means GRU has less 
operations. GRU has 2 gates here, a reset gate and an update gate. The update gate acts similar to 
forget gate and input gate of LSTM, it decides what information to throw away and what new 
information to add. The reset gate is a gate used to decide how much pass information to forget.  
 
2.5 Selection of Hyperparameters for Deep Learning Code 
Although LSTM is a version of RNN and GRU is a derivative of LSTM, they all use the same 
hyperparameters. The following table summarizes the information required for this experiment's 
hyper-parameters: 
36 
 
• num_epochs: The number of epochs. An epoch is the number of times that all data is 
trained once, and the number represented by epoch is the total number of training rounds. 
The length of the epoch is proportional to the diversity of the dataset. The more 
diversification there is, the longer the epoch should be. 
• Batch_size: indicates how frequently a subset of the data is supplied to the network for 
training. The optimal batch size range is mostly determined by the convergence rate and 
stochastic gradient noise. 
• INIT_LR: represents the Learning Rate, a hyperparameter used to update the weight 
throughout the gradient descent process. The learning rate's magnitude dictates whether 
and when the objective function can converge to the local minimum. 
• num_ Layer: specifies the size of the RNN's hidden layer. By selecting an appropriate 
hidden layer, problems such as gradient explosion can be avoided. By selecting a suitable 
hidden layer, gradient explosion and other issues can be avoided. 
• hidden_ Size: indicates the number of neurons in each hidden layer of the RNN. 
 
 
 
 
 
 
 
 
 
 
 
37 
 
 
Chapter III Process and Results 
3.1 Pre-Setting of Hyperparameters and Pipeline Parameters 
Hyperparameters control the solving rate, the solution's reliability, and the optimization problem 
is the learning effect. A fine hyperparameter can assist in rapidly identifying the optimal solution 
to the minimization problem and force the model to match the data better through effective 
generalization. 
 
3.1.1 Settings of Pipeline Parameters 
When the pipeline is divided into 40 layers, the results converge. The pipeline is separated into 
40 levels to calculate the number of flaws. As a result, there are 81 points on the section line 
along the pipeline's radius. According to formula 2.7, this study must first provide the 
appropriate point's interpolation [N(r)], after which the point's displacement component value 
can be determined. 
The majority of Pipeline Material (PM) quantities are set to one. Among them are five widely 
used isotropic materials: iron, copper, magnesium, titanium, and aluminum, in addition to an 
anisotropic composite. 
Thickness Over Mean Radius (TOMD) is typically between 0 and 2. On the other hand, a low 
TOMR suggests that the pipeline is extremely thin; a solid cylinder has an TOMR of 2. In this 
experiment, TOMR values were set between 0.1 and 0.4 to avoid exceeding 0.4, which would 
have resulted in a loss of application value in real-world engineering projects. 
The circular Frequency of the Pipeline (CFP) is the incident wave's frequency. In general, the 
greater the frequency, the more waves propagate, requiring more calculation. On the other hand, 
the lower the frequency, the fewer waves may be received. Additionally, when the frequency 
exceeds a particular threshold, it results in a cliff decline in phase velocity, detrimental to data 
analysis. As a result, frequency selection significantly impacts data creation and training. 
38 
 
The Number of Crack Defects is proportional to the number of planned layers. Since the wave 
response coefficient was first computed using the semi-finite element approach, the crack's 
thickness is proportional to the thickness of each pipeline layer divided by the finite element 
method. 
The Input Circumferential Wave Number (ICWN) value affects the wave propagation curve 
selection, and different curves are acceptable for different frequency sizes. 
The Crack Length in the Radial Direction (CDRD) is proportional to the layer count, and the 
depth of each layer is equal to the reciprocal of the overall layer count. This characteristic 
primarily indicates the depth of the cylinder's crack. 
Crack Width in the Circumferential Direction (CWCD) has a value range of [0.025,1], classified 
into 41 groups. The value of CWCD denotes the radian, which primarily indicates the 
circumferential length of the cylindrical fracture. 
 
Table 1 Setting Information about Fortran Code 
Setting Options Setting Values Unit 
Total Number of Element 40 1 
Number of the Materials [1,6] 1 
Thickness Over Mean Radius [0.1,0.4] 0.1 
Circular Frequency [1,4] 1 
Crack Number 41 1 
Circumferential Wave Number [-11,11] 1 
Crack Depth in Radial Direction [0.025,1] 0.025 
Crack Width in Circumferential Direction [0.025,1] 0.025 
 
3.1.2 Settings of the Machine Learning Hyperparameters 
39 
 
The three most often used approaches for hyperparameter optimization are manual, machine-
assisted, and algorithm-based. Practicality is the primary factor guiding hyperparameter 
optimization. The hyperparameter with the best performance is chosen by comparing it to the 
model's anticipated performance. The hyperparameter settings were discovered after multiple 
tests and are listed in Table 2. 
 
Table 2 Hyperparameters of Machine Learning Algorithms 
Algorithms Hyperparameters Value 
Support Vector Penalty Coefficient (C) 1 
Machine gamma 0.1 
K 5 
K-Nearest 
Feature Weight (w) Distance 
Neighbors 
Distance Metric (p) 1 
The number of Largest Weak Learners 100 
(n_estimators) 
Out of Bag false 
(oob_score) 
Random Forest 
Maximum Number of Features 200 
and 
(max_features) 
Extremely Randomized 
Minimum Number of Samples Required for 10 
Tree Algorithm 
Subdividing Internal Nodes 
(min_samples_split) 
Minimum Number of Samples for Leaf Nodes 3 
(min_samples_leaf) 
 
40 
 
3.1.3 Settings of Deep Learning Hyperparameters 
The learning rate is a critical hyperparameter in deep learning. A high or low learning rate may 
result in an extremely slow or even non-existent model learning speed. The learning rate must be 
chosen while keeping an eye on the loss function's score. The most likely outcome is that the 
score fluctuates, but the overall trend is downward. Although the score fluctuated, the overall 
trend of deterioration was the most logical conclusion. The hyper-parameter parameters provided 
in Table 3 are determined after multiple tests. 
 
Table 3 Hyperparameters of Deep Learning Algorithms 
Algorithms Hyperparameters Value 
 Number of Eochs (num_epochs) 200 
 Batch size (batch_size) 6 
Recurrent Neural Network Learning Rate (INIT_LR) 0.001 
Number of Layer (num_layer) 3 
Hidden Size (hidden_size) 20 
 Number of Eochs (num_epochs) 100 
 Batch size (batch_size) 4 
Long-Short Term Memory Learning Rate (INIT_LR) 0.001 
Number of Layer (num_layer) 3 
Hidden Size (hidden_size) 40 
 Number of Eochs (num_epochs) 200 
 Batch size (batch_size) 20 
Learning Rate (INIT_LR) 0.0001 
41 
 
Gated Recurrent Unit Number of Layer (num_layer) 3 
Hidden Size (hidden_size) 40 
 
3.2 Assessment Criteria 
Two assessment metrics, Mean Absolute Error (MAE) and Coefficient of Determination (R2), 
were used to compare the performance of machine learning and deep learning. The Root Mean 
Square Error (RMSE) was chosen as the loss function's y value. 
 
3.2.1 Mean Absolute Error 
The Mean Absolute Error (MAE) is a frequently used loss function in regression models. It 
quantifies the average modulus length of the projected value error without considering direction. 
It has a data range of 0 to infinity. The following is the calculating formula: 
 n
1 ^ (3.1) 
MAE = ∑|yi − yi| n
i=1
^
Here, yi represents the true observed value and yi represents the predicted value. In the same 
prediction target, the smaller MAE value indicates a better prediction of the model. Conversely, 
a larger MAE value indicates a worse prediction.  
 
3.2.2 Coefficient of Determination 
The Coefficient of Determination (R2) is a statistical indicator used to represent the regression 
model and explain the change in the dependent variable's dependability. R2 is a numerical feature 
used to define the relationship between two random variables. This assessment criteria index is 
the most accurate representation of the linear regression approach, and it is calculated as follows: 
 (y 2
∑ i − ŷi) (3.2) 
i
R2 = 1 − n  
(y − y̅)2
∑ i
i n
42 
 
Here, y2  indicates the average value of the true observations, the denominator refers to the 
variance, and the numerator refers to the Root Mean Squares Error (RMSE). Generally, the 
higher the R2, the better the prediction result. When R2 is 1, the predicted value and the true 
value in the sample are completely equal without any errors. If R2 is 0, each predicted value of 
the sample is equal to the average value. 
 
3.2.3 Root Mean Square Error 
The Root Mean Square Error (RMSE) is frequently used to quantify the average size of an error; 
its value is equal to the square root of the average squared difference between the predicted and 
observed values. The following is the formula:                                                                                                                                                                                        
 n (3.3) 
∑ 2
√ i=1(ŷi − yi)RMSE =  
n
In comparison to MAE, RMSE is a more accurate representation of the sample's outliers. On the 
contrary, MAE is robust and considers outliers to be damaged data. Generally, the MAE has a 
smaller expected value than the RMSE. 
 
3.3 Prediction of Pipeline Crack Information 
Numerous training sets are used, and the various types of data created by change input variables 
have varying implications on the prediction outcomes. The data type Control Group serves as the 
fundamental reference. Case A through Case D all adjust the data type of a particular data 
variable in the case of control variables. Table 4 contains examples of these data types. 
  
43 
 
 
Table 4 Training Data Case 
 CFP ICWN TOMR PM 
Control Group 1 0 0.1 Composite 
Case A [1, 4] 0 0.1 Composite 
Case B 1 [-10,0) ∪ (0,10] 0.1 Composite 
Case C 1 0 [0.1, 0.4] Composite 
Case D 1 0 0.1 Isotropic 
 
The table's abbreviations relate to the List of Abbreviations, and to eliminate variations in the 
amount of data that could affect prediction accuracy, all cases have the same number of data sets, 
1640 group sets. 
CDRD values are fixed between [0, 1] in the Control Group. The CDRDs are evenly distributed 
into 41 groups, with a variation of 0.025 between each set. Each CDRD corresponds to one of 
the 41 CWCD groups. Similarly, CWCD is divided into 41 equal groups and falls inside the 
interval [0, 1]. The purpose of establishing the Control Group is to enable a more direct 
comparison of the effect of data modifications on the accuracy of the pipeline crack prediction. 
After establishing the Control Group, the goal of Case A was to investigate the effect of CFP 
(Circular Frequency in Pipeline) modifications on prediction accuracy. The CFP-based data type 
is separated into two files, one for CDRD prediction and one for CWCD prediction. The first file 
sets the CWCD to a constant value of 0.5. Similarly, the CDRD value is set to 0.5 in the second 
file. 
Case B is based on data sets containing ICWN (Input Circumferential Wave Number) values. 
Creating data Case B aims to investigate the effect of changing the ICWN on prediction accuracy. 
Additionally, Case B is broken into two files for training purposes. 
44 
 
In Case C, a change in TOMR (Thickness Over Mean Radius) indicates a change in the pipe's 
thickness. Since an overly large TOMR value has no engineering significance, the value is 
limited to [0.1, 0.4]. 
Case D considers a material change to the pipeline, substituting composite materials with 
isotropic materials, intending to examine the effect of different materials on prediction accuracy. 
It is composed of five different materials, including steel, copper, aluminum, magnesium, and 
titanium. 
 
3.3.1 Prediction Results of Crack Depth in Radial Direction (CDRD)  
Crack Depth in Radial Direction (CDRD) is predicted using four machine learning methods and 
three deep learning algorithms. Table 5 shows the following outcomes when MAE and R2 are 
used as assessment criteria: 
 
Table 5 Using different data types and different algorithms to predict CDRD 
 Control Group CFP HR ICWN PM 
 MAE R2 MAE R2 MAE R2 MAE R2 MAE R2 
SVM 0.08963 0.91167 0.12323 0.82835 0.08436 0.92784 0.11651 0.85249 0.10009 0.88513 
KNN 0.06043 0.95926 0.06533 0.95127 0.03542 0.98689 0.06601 0.94776 0.06215 0.95394 
RF 0.07341 0.93986 0.06210 0.95444 0.02656 0.99171 0.03507 0.98584 0.05301 0.96886 
ERT 0.06710 0.94938 0.04199 0.98072 0.01492 0.99741 0.03361 0.98605 0.03992 0.98159 
SRNN 0.09248 0.90741 0.11976 0.89705 0.08856 0.95226 0.04125 0.97623 0.08452 0.92206 
LSTM 0.10784 0.88266 0.07862 0.94328 0.03464 0.98109 0.04124 0.97629 0.06754 0.94254 
GRU 0.07884 0.92901 0.05962 0.96312 0.04462 0.97483 0.03754 0.98282 0.06141 0.95585 
 
45 
 
The table clearly indicates that the Extremely Randomized Tree (ERT) method has the highest 
prediction accuracy in machine learning, while the Gated Recurrent Unit algorithm has the 
highest prediction accuracy in deep learning (GRU). The algorithm has the highest assessment 
score across all data categories for prediction results. 
The CDRD Prediction Histogram depicted in Figure 13 visually represents the prediction 
performance. While both MAE and R2 might reflect prediction performance, a lower MAE 
number indicates better performance, whereas a higher R2 value indicates better prediction 
performance. As a result, Figure 13 simply compares the R2 approach. After calculations, Table 
6 is generated, indicating that the dataset based on Thickness Over Mean Radius (TOMD) has 
the lowest standard deviation. This suggests that data sets derived from human resources are the 
most stable when used to predict CDRD. 
 
Table 6 Standard Deviation of Different CDRD Data Types 
 Control Group CFP HR ICWN PM 
Standard 
0.02473 0.04826 0.02291 0.04491 0.0228 
Deviation 
 
As illustrated in Figure 13, after altering the data type, the accuracy of all forecasts is greater 
than the Control Group's score. This demonstrates that altering the data format has an effect on 
the accuracy of the CDRD prediction. 
 
46 
 
 
Fig. 13 CDRD Prediction Histogram 
 
3.3.2 Prediction Results of Crack Width in Circumferential Direction (CWCD)  
To predict the Crack Width in Circumferential Direction (CWCD), the MAE and R2 obtained by 
all algorithms are recorded in Table 7 below.  
Table 7 Using different data types and different algorithms to predict CWCD 
 Control Group CFP HR ICWN PM 
 MAE R2 MAE R2 MAE R2 MAE R2 MAE R2 
SVM 0.14369 0.74237 0.18478 0.61271 0.18089 0.63239 0.14428 0.7608 0.13541 0.76878 
KNN 0.05255 0.96818 0.09881 0.87956 0.10823 0.8613 0.05964 0.95549 0.0623 0.95641 
RF 0.06952 0.94467 0.05342 0.96713 0.04513 0.97804 0.07264 0.94013 0.05063 0.97251 
ERT 0.05736 0.96506 0.02071 0.99506 0.02905 0.99079 0.05013 0.97182 0.02713 0.9913 
SRNN 0.11124 0.81812 0.09114 0.88614 0.15421 0.75134 0.18421 0.60874 0.09451 0.89012 
LSTM 0.10545 0.87654 0.06457 0.93751 0.09974 0.88054 0.11246 0.82275 0.03144 0.98325 
GRU 0.07451 0.93961 0.05974 0.9493 0.09424 0.88835 0.08475 0.87642 0.02147 0.99524 
 
47 
 
As shown in Table 7, the ERT algorithm has the most robust prediction performance in most 
data situations. The best machine learning algorithm is still ERT, while GRU is the best deep 
learning approach. 
The standard deviation of the scores associated with each data type's prediction outcomes is 
calculated, as shown in Table 8. 
 
Table 8 Standard Deviation of different CWCD data types 
 Control Group CFP HR ICWN PM 
Standard 
0.07951 0.11939 0.11691 0.12052 0.07617 
Deviation 
 
The distinction from forecasting CDRD is that when Pipeline Materials (PM) data is used to 
forecast CWCD, it has a better level of stability and predictive performance. The comparison 
histogram of the CWCD R2 is shown in Figure 14. 
 
 
Fig. 14 CWCD Prediction Histogram 
 
48 
 
The CWCD score graph demonstrates that some algorithms perform worse than the CDRD 
prediction histogram when predicting CWCD. 
Although the Control Group outperformed most of the data types in terms of score stability, 
Figure 14 shows that when the RF and ERT algorithms are utilized, the diverse data types still 
have higher scores than the Control Group's prediction results. The majority of methods 
outperform the Control Group, particularly when PM-based data is used, demonstrating that 
changing the data type enhanced the CWCD's prediction ability. 
 
3.3.3 Results Comparison of CDRD and CWCD 
The R2 of the CDRD and the CWCD are compared in Figure 15. It is discovered that, in most 
circumstances, the predictive performance of the CDRD and CWCD are comparable, but the 
CDRD forecast is more consistent than the CWCD forecast. 
 
 
Fig. 15 Performance comparison histogram of CDRD and CWCD in Control Group 
 
In most cases, the CDRD outperforms the CWCD in Figure 16. However, in some algorithms, 
the result score for CWCD prediction is greater than for CDRD prediction. This could be 
because the algorithm randomly selects a subset of the data as the test set. For those results 
49 
 
which CWCD has a higher score, the score of CDRD prediction may exceed CWCD prediction 
after some repeated calculations. 
 
 
Fig. 16 Performance comparison histogram of CDRD and CWCD in CFP Group 
 
Figure 17 compares the R2 under the condition of TOMR based data set. Under this data type, the 
prediction performance of various algorithms for CDRD is better than that for the CWCD. 
 
 
Fig. 17 Performance comparison histogram of CDRD and CWCD in TOMR Group 
50 
 
 
In Fig.18, the R2 score for all algorithms that predict CDRD is greater than the R2 score for 
algorithms that predict CWCD. Notably, the scores of deep learning algorithms are poor when 
predicting CWCD, indicating that deep learning methods are not suitable for predicting CWCD 
utilizing ICWN-based data types. 
 
 
Fig. 18 Performance comparison histogram of CDRD and CWCD in ICWN Group 
 
 
Fig. 19 Performance comparison histogram of CDRD and CWCD in PM Group 
51 
 
 
As illustrated in Figure 19, the CWCD prediction result score for most algorithms is more than 
the CDRD prediction result score, and the majority of approaches have an R2 greater than 0.9. 
One could argue that the PM-based data type is more suited to forecasting the CWCD. 
 
3.3.4 Comparison of Machine Learning and Deep Learning 
"TOTAL" indicates that pooled all the data sets, thereby increasing the sample size. This study 
collected all data types and utilized seven algorithms to predict CDRD and CWCD. Figure 20 
depicts R2 score histogram comparisons. All the data types in Figure 20 used to share the same 
data volume, 8200. Compared to the Control Group data type, the decrease of CWCD prediction 
can be seen clearly, Since the "TOTAL" data sets which provided had different feature 
distributions, which increases the challenge of training, and increasing the quantity of data did 
not affect the prediction performance of any method for CWCD. Additionally, all algorithms' 
prediction performance for the CDRD is superior to CWCD. 
 
 
Fig. 20 Performance comparison histogram of CDRD and CWCD in Total Group 
 
52 
 
To compare the prediction performance of deep learning and machine learning algorithms for the 
depth and width of pipeline cracks, this study chose the GRU with the highest prediction 
performance in the deep learning algorithm and the ERT with the highest prediction performance 
in the machine learning algorithm. 
Figures 21 and 22 illustrate different comparisons of the prediction performance of the GRU and 
ERT algorithms. As seen in the figures, the GRU performs similarly to the ERT algorithm when 
predicting the CDRD but outperforms the ERT method only when predicting the CWCD. 
 
 
 
Fig. 21 R2 comparison histogram 
 
53 
 
 
Fig. 22 R2 comparison histogram 
Due to the fact that several hyperparameters were changed during prediction to ensure that no 
overfitting or underfitting occurred, a loss curve was constructed, as illustrated in Figures 23 and 
24. The RMSE was used as the assessment criterion to compare data with aberrant values. 
 
 
Fig. 23 GRU loss curve in CDRD prediction 
 
54 
 
 
Fig. 24 GRU loss curve in CWCD prediction 
 
As illustrated in Figures 23 and 24, the drop curve exhibits the features of a good fit, indicating 
no over-or under-fitting in this experiment. 
By combining the comparison scores in Figures 21 and 22, it is possible to deduce that when the 
amount of data is sufficient, the GRU method can approach the prediction performance of the 
ERT algorithm. Because the predictive performance of a deep learning algorithm depends on the 
number of datasets and the settings of hyperparameters, one may argue that the GRU can 
outperform ERT methods in terms of predictive performance. 
 
3.4 Results & Discussion 
Throughout Section 3.3, it was discovered that altering the type of datasets affected the 
algorithms' prediction performance. For example, TOMR data is better suited to forecasting the 
CDRD, while PM data is better suited to predicting the CWCD. Additionally, increasing the 
number of datasets improves deep learning algorithms' prediction performance to a certain level. 
However, the pipeline's TOMR and PM values are constant in more realistic situations, making it 
extremely difficult to gather sufficient datasets by varying the TOMR and PM. Additionally, it is 
55 
 
not straightforward to expand the number of datasets used to train deep learning systems to 
improve their prediction performance. As a result, the outcome score achieved by modifying the 
CFP and ICWN data types is critical. 
By combining Figures 15, 16, and 18, it is found that when the data type is changed, the 
algorithms' prediction performance improves, and the algorithms’ prediction performance about 
CDRD is superior to that of CWCD. 
From a technical standpoint, when the data types are varied and the number of datasets is not 
fixed, the ERT retains an excellent predictive performance. As a result, the machine learning 
ERT technique is unquestionably the most reliable. 
However, from the perspective of developing a pipeline trainer, collecting sufficient datasets 
based on TOMR or PM would aid in predicting the CDRD and CWCD. However, it is difficult 
to collect large field data sets, and it is also vital to ensure that these data sets contain just 
changes in pipeline thickness or material. 
Chapter IV Conclusion 
This chapter employs seven neural network methods suitable for regression issues, four of which 
are Machine Learning methods and three of which are Deep Learning algorithms. The 
combination of machine learning and deep learning with 4 types of data which generated by 
Fortran software could predict two types of pipeline cracks, including Crack Depth in the Radial 
Direction (CDRD) and Crack Width in the Circumferential Direction (CWCD). Through 
histogram comparison, it was shown that the machine learning algorithm Extreme Randomized 
Tree (ERT) has the best prediction performance. Changing the type of training dataset enhances 
some algorithms' prediction performance. Choosing data types based on the Circular Frequency 
in Pipeline (CFP) and the Input Circumferential Wave Number (ICWN) can help enhance 
prediction performance under realistic settings. In theory, if sufficient data types pertaining to 
Pipeline Materials (PM) can be gathered, it would be advantageous to forecast the outcome of 
the CWCD. Gated Recurrent Units (GRU) are the optimal algorithm for deep learning. 
Increasing the amount of training data and adjusting the hyperparameter settings can assist 
enhance the prediction performance of a deep learning algorithm. However, the outcomes score 
56 
 
corroborates Tripathi et al. [6].'s assertion that deep learning is not always more favorable than 
machine learning. 
Nonetheless, after adjusting the hyperparameters and increasing the amount of data, the 
prediction accuracy of the deep learning algorithm for pipeline cracks dramatically improved 
compared to other data sets with fewer data points. Theoretically, it demonstrates that when 
sufficient data and a better-matched hyperparameter configuration are available, the prediction 
accuracy of the deep learning algorithm can exceed that of the machine learning Extremely 
Randomized Tree (ERT). Collecting massive amounts of data is time-consuming, but it is 
necessary for developing a more reliable and accurate pipeline crack prediction trainer. 
 
 
 
 
 
  
57 
 
Chapter V Future Works 
The Crack Depth in Radial Direction (CDRD) and the Crack Width in Circumferential Direction 
(CWCD) has been predicted in this study. Thus, it is possible that the Crack Thickness in Axial 
Direction (CTAD) of the pipeline cracks may still be anticipated, allowing for the establishment 
of three-dimensional space through the technology. The prediction of 3D crack may be proposed 
in the future, which could more intuitively describe the characteristics of the cracks. 
On the other hand, field data is important, so in the follow-up work, it is worth to collect field 
data and test them. Because in practical cases, response signals may be polluted or overwhelmed 
by signal noise, it will decrease the accuracy of prediction. So, filtering techniques may need to 
be used in subsequent work. 
What’s more, to obtain a better accuracy, deep Learning usually requires huge data sets to train. 
The data sets collected, however, are not enough for us to reveal the high-performance of deep 
learning, so, collecting more data sets might be considered in the future works. 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
58 
 
Reference 
[1] C. Tschope, E. Schulze, H. Neunubel, M. Wolff, R. Schubert and R. Hoffmann, 
"Experiments in acoustic structural health monitoring of airplane parts," 2008 IEEE International 
Conference on Acoustics, Speech and Signal Processing, 2008, pp. 2037-2040, doi: 
10.1109/ICASSP.2008.4518040. 
[2] Jiang Chengjun, Ju Ximin. The Development and Actuality of Oil& Gas Pipeline Testing [J]. 
Inner Mongolia Petrochemical Industry, 2008(3): 83-86. 
[3] A.A. Carvalho, J.M.A. Rebello, M.P.V. Souza, L.V.S. Sagrilo, S.D. Soares, Reliability of 
non-destructive test techniques in the inspection of pipelines used in the oil industry, 
International Journal of Pressure Vessels and Piping, Volume 85, Issue 11, 2008, Pages 745-751, 
ISSN 0308-0161, https://doi.org/10.1016/j.ijpvp.2008.05.001. 
[4] Zhao Caiping. Pipeline ultrasonic guided wave inspection data analysis and defect diagnosis 
system development. (Doctoral dissertation, Beijing University of Technology). 
[5] Herrera, Roberto & Christensen, 
 Paul & Elvers, Adrianus. (2019). Machine Learning in Pipeline Inspection: Applications of 
supervised learning in non-destructive evaluation. 
[6] Tripathi, G.; Anowarul, H.; Agarwal, K.; Prasad, D.K. Classification of Micro-Damage in 
Piezoelectric Ceramics Using Machine Learning of Ultrasound Signals. Sensors 2019, 19, 4216. 
https://doi.org/10.3390/s19194216 
[7] A. Mardanshahi, V. Nasir, S. Kazemirad, M.M. Shokrieh, Detection and classification of 
matrix cracking in laminated composites using guided wave propagation and artificial neural 
networks, Composite Structures, Volume 246, 2020, 112403, ISSN 0263-8223, 
[8] F. Zhang, K. Pinkal, P. Wefing, F. Conradi, J. Schneider and O. Niggemann, "Quality 
Control of Continuous Wort Production through Production Data Analysis in Latent Space," 
2019 IEEE International Conference on Industrial Technology (ICIT), 2019, pp. 1323-1328, doi: 
10.1109/ICIT.2019.8755111. 
59 
 
[9] Te Ma, Satoru Tsuchikawa, Tetsuya Inagaki, Rapid and non-destructive seed viability 
prediction using near-infrared hyperspectral imaging coupled with a deep learning approach, 
computers and Electronics in Agriculture, Volume 177, 2020, 105683, ISSN 0168-1699, 
https://doi.org/10.1016/j.compag.2020.105683. 
[10] Burlina P, Billings S, Joshi N, Albayda J (2017) Automated diagnosis of myositis from 
muscle ultrasound: Exploring the use of machine learning and deep learning methods. PLoS 
ONE 12(8): e0184059. https://doi.org/10.1371/journal.pone.0184059 
[11] Xiong Zhengsi, Huang Gang, Hao Lijun, etc. Research on non-invasive detection of liver 
cancer based on machine learning [J]. Beijing Biomedical Engineering, 2020, 39(1): 74-79. 
 
[12] William Sorteberg, Stef Garasto, Alison Pouplin, Chris Cantwell, and Anil A. Bharath. 
Approximating the Solution to Wave Propagation using Deep Neural Networks. In NeurIPS 
Workshop on Modeling the Physical World: Perception, Learning, and Control, December 2018. 
[13] Rautela M. and Gopalakrishnan S. Deep Learning frameworks for wave propagation-based 
damage detection in 1D-waveguides, 11th International Symposium on NDT in Aerospace, Paris. 
Nov.2019. 
[14] Yohei Nishizaki, Matias Valdivia, Ryoichi Horisaki, Katsuhisa Kitaguchi, Mamoru Saito, 
Jun Tanida, and Esteban Vera, "Deep learning wavefront sensing," Opt. Express 27, 240-251 
(2019) 
[15] P. Zhu, J. Isaacs, B. Fu and S. Ferrari, "Deep learning feature extraction for target 
recognition and classification in underwater sonar images," 2017 IEEE 56th Annual Conference 
on Decision and Control (CDC), 2017, pp. 2724-2731, doi: 10.1109/CDC.2017.8264055. 
[16] Rymarczyk T, Kłosowski G, Kozłowski E. A Non-Destructive System Based on Electrical 
Tomography and Machine Learning to Analyze the Moisture of Buildings. Sensors (Basel). 
2018;18(7):2285. Published 2018 Jul 14. doi:10.3390/s18072285 
[17] Datta, S.K., & Shah, A.H. (2009). Elastic Waves in Composite Media and Structures: With 
Applications to Ultrasonic Nondestructive Evaluation (1st ed.). CRC Press. 
https://doi.org/10.1201/9780429136696 
60 
 
[18] Nesreen K. Ahmed, Amir F. Atiya, Neamat El Gayar & Hisham El-Shishiny (2010) An 
Empirical Comparison of Machine Learning Models for Time Series Forecasting, Econometric 
Reviews, 29:5-6, 594-621, DOI: 10.1080/07474938.2010.481556 
[19] I. Jahan and S. Z. Sajal, Stock Price Prediction using Recurrent Neural Network Algorithm 
on Time-Series Data, the Midwest Instruction and Computing Symposium 2018, April 6-7, 2018 
Duluth MN, USA. 
[20] F. Altché and A. de La Fortelle, "An LSTM network for highway trajectory prediction," 
2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), 2017, 
pp. 353-359, doi: 10.1109/ITSC.2017.8317913. 
[21] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural computation, 1997, 9 (8): 
1735-1780. 
[22] J.L.Elman, Finding structure in time, Cogn. Sci. 14(1990)179-211.
61 
 
Appendices 
Appendix A: Control group file about CDRD 
 
Table 9 Control group file about CDRD 
CDRD CWCD trscoe1 trscoe2 trscoe3 trscoe4 … trscoe46 refcoe1 refcoe2 refcoe3 refcoe4 … refcoe46 
0 0.05 1 0 0 0 … 0 0 0 0 0 … 0 
0.025 0.05 1.000273 0.000153 0.000272 0.00003 … 0.000251 0.000274 0.000155 0.000271 0.00003 … 0.000249 
0.05 0.05 1.006377 0.005495 0.010225 0.001167 … 0.009747 0.00954 0.005116 0.009429 0.001094 … 0.009062 
0.075 0.05 1.000742 0.000482 0.000966 0.000093 … 0.000854 0.00078 0.000455 0.000771 0.000081 … 0.000684 
0.1 0.05 1.001104 0.000719 0.001529 0.000139 … 0.001336 0.001216 0.000675 0.001203 0.000119 … 0.001048 
0.125 0.05 1.001149 0.000806 0.001872 0.00016 … 0.001622 0.001286 0.000689 0.001272 0.000119 … 0.001086 
0.15 0.05 1.001236 0.000985 0.002336 0.000195 … 0.002022 0.001342 0.000755 0.001328 0.000124 … 0.001116 
0.175 0.05 1.001567 0.001415 0.003252 0.000271 … 0.002817 0.00169 0.001065 0.001673 0.000163 … 0.001407 
0.2 0.05 1.003369 0.002398 0.004569 0.000428 … 0.00408 0.001811 0.000226 0.0018 0.000046 … 0.001314 
0.225 0.05 1.002482 0.002206 0.004917 0.000408 … 0.004285 0.002077 0.001192 0.002063 0.000165 … 0.001616 
0.25 0.05 1.002465 0.002438 0.005407 0.000446 … 0.004722 0.001981 0.001292 0.00197 0.000172 … 0.001513 
0.275 0.05 1.002624 0.002752 0.006083 0.000496 … 0.005318 0.002128 0.001447 0.002118 0.000185 … 0.001594 
0.3 0.05 1.002871 0.00312 0.006905 0.000553 … 0.006037 0.002432 0.00164 0.002421 0.000202 … 0.001795 
0.325 0.05 1.002996 0.003411 0.00755 0.000599 … 0.006612 0.002572 0.001796 0.00256 0.000218 … 0.001869 
0.35 0.05 1.002989 0.003639 0.008026 0.000635 … 0.007051 0.002524 0.001977 0.002511 0.000242 … 0.001808 
0.375 0.05 1.002663 0.004117 0.008725 0.000701 … 0.007703 0.002576 0.002832 0.002559 0.000342 … 0.001981 
0.4 0.05 1.003597 0.003931 0.008879 0.00069 … 0.007827 0.002544 0.001434 0.002533 0.000197 … 0.001654 
0.425 0.05 1.003535 0.004182 0.009505 0.000734 … 0.008391 0.002609 0.001866 0.002601 0.000252 … 0.001706 
0.45 0.05 1.003537 0.0043 0.00993 0.00076 … 0.008777 0.002646 0.00203 0.002642 0.000281 … 0.001729 
0.475 0.05 1.003581 0.00425 0.010127 0.000777 … 0.008966 0.002458 0.002034 0.002462 0.0003 … 0.001558 
62 
 
0.5 0.05 1.004379 0.005747 0.011275 0.000932 … 0.010269 0.001215 0.001967 0.001233 0.000377 … 0.000176 
0.525 0.05 1.002148 0.005103 0.010174 0.000833 … 0.008993 0.002795 0.004555 0.002786 0.000492 … 0.002406 
0.55 0.05 1.002077 0.005229 0.011923 0.000974 … 0.010628 0.003123 0.004374 0.003108 0.00052 … 0.002562 
0.575 0.05 1.001651 0.005928 0.013533 0.001127 … 0.012089 0.00377 0.005597 0.003751 0.000603 … 0.003197 
0.6 0.05 1.003924 0.008697 0.015491 0.001289 … 0.014297 0.002971 0.002185 0.002989 0.00035 … 0.001605 
0.625 0.05 1.001686 0.003603 0.011701 0.000534 … 0.009617 0.008683 0.009844 0.008604 0.000671 … 0.007919 
0.65 0.05 1.001536 0.007309 0.018405 0.001554 … 0.016768 0.003927 0.004383 0.003928 0.000622 … 0.002904 
0.675 0.05 1.002722 0.009873 0.020605 0.001811 … 0.019096 0.00281 0.002405 0.002836 0.000646 … 0.001458 
0.7 0.05 1.004137 0.017238 0.024002 0.00231 … 0.022931 0.001882 0.006533 0.001877 0.001004 … 0.001944 
0.725 0.05 1.007573 0.021213 0.024564 0.002335 … 0.023889 0.005957 0.010871 0.005887 0.00108 … 0.0061 
0.75 0.05 1.008552 0.025715 0.014066 0.001041 … 0.013717 0.016837 0.029928 0.016641 0.001835 … 0.016784 
0.775 0.05 0.999071 0.008017 0.024199 0.002214 … 0.022411 0.003267 0.007543 0.003265 0.001206 … 0.002566 
0.8 0.05 1.000905 0.012962 0.02716 0.002625 … 0.02577 0.000262 0.002757 0.000286 0.001376 … 0.001292 
0.825 0.05 1.003825 0.023203 0.03204 0.003364 … 0.031205 0.005789 0.007398 0.005635 0.001968 … 0.007113 
0.85 0.05 1.009373 0.026781 0.032686 0.003426 … 0.032414 0.010014 0.013802 0.009804 0.00196 … 0.011212 
0.875 0.05 1.004986 0.02444 0.012433 0.000603 … 0.011827 0.015836 0.036766 0.015711 0.002028 … 0.015791 
0.9 0.05 0.997405 0.011603 0.03207 0.003445 … 0.030645 0.007483 0.007577 0.00727 0.002735 … 0.007828 
0.925 0.05 0.999319 0.014209 0.035421 0.003997 … 0.034306 0.011179 0.004375 0.010891 0.003196 … 0.011681 
0.95 0.05 1.014179 0.031886 0.038319 0.004417 … 0.038834 0.016213 0.021792 0.016002 0.002677 … 0.01834 
0.975 0.05 1.004876 0.015964 0.029404 0.002465 … 0.02906 0.00732 0.021236 0.007185 0.00151 … 0.00879 
1 0.05 0.99927 0.006975 0.038837 0.004801 … 0 0 0 0 0 … 0 
… … … … … … … … … … … … … … 
 
 
 
 
63 
 
Appendix B: Control group file about CWCD 
 
Table 10 Control group file about CWCD 
CWCD CDRD trscoe1 trscoe2 trscoe3 trscoe4 … trscoe46 refcoe1 refcoe2 refcoe3 refcoe4 … refcoe46 
0 0.05 1 0 0 0 … 0 0 0 0 0 … 0 
0.025 0.05 1.004379 0.005747 0.011275 0.000932 … 0.010269 0.001215 0.001967 0.001233 0.000377 … 0.000176 
0.05 0.05 1.001169 0.003077 0.016093 0.000736 … 0.009721 0.011757 0.002905 0.011673 0.000114 … 0.00625 
0.075 0.05 0.999528 0.003942 0.022684 0.000697 … 0.008741 0.018545 0.004928 0.018288 0.000585 … 0.00459 
0.1 0.05 0.993939 0.005214 0.029408 0.00092 … 0.007889 0.030004 0.008709 0.029247 0.001232 … 0.015079 
0.125 0.05 0.994274 0.007145 0.040975 0.001278 … 0.010192 0.033827 0.006532 0.032866 0.001013 … 0.004718 
0.15 0.05 0.991237 0.007157 0.047383 0.00152 … 0.011431 0.040072 0.007909 0.038609 0.000682 … 0.006023 
0.175 0.05 0.990471 0.007806 0.052923 0.001105 … 0.010454 0.046543 0.01008 0.044184 0.001537 … 0.00501 
0.2 0.05 0.989817 0.011325 0.064116 0.000307 … 0.010658 0.053125 0.010546 0.049636 0.002729 … 0.002275 
0.225 0.05 0.985638 0.01136 0.06996 0.002118 … 0.012126 0.062135 0.01225 0.057029 0.001769 … 0.005941 
0.25 0.05 0.983165 0.011584 0.074986 0.002228 … 0.013526 0.068303 0.013405 0.061684 0.001261 … 0.005738 
0.275 0.05 0.979251 0.015881 0.084481 0.003783 … 0.009975 0.068797 0.012554 0.061584 0.007209 … 0.023109 
0.3 0.05 0.981644 0.015442 0.087945 0.00096 … 0.011154 0.082242 0.016263 0.070612 0.003841 … 0.003696 
0.325 0.05 0.978132 0.015771 0.092306 0.003063 … 0.014368 0.090277 0.017724 0.075508 0.001981 … 0.006962 
0.35 0.05 0.97589 0.0161 0.095606 0.002559 … 0.016331 0.096985 0.019076 0.078866 0.002067 … 0.006447 
0.375 0.05 0.977282 0.019142 0.100817 0.003827 … 0.011416 0.100617 0.018801 0.080327 0.0059 … 0.008018 
0.4 0.05 0.974737 0.019887 0.104007 0.000794 … 0.01352 0.11053 0.021652 0.083914 0.004874 … 0.00522 
0.425 0.05 0.971958 0.020374 0.105944 0.003205 … 0.017103 0.118605 0.023031 0.086596 0.002385 … 0.007456 
0.45 0.05 0.962419 0.025171 0.108637 0.00201 … 0.02234 0.119983 0.02212 0.086857 0.005986 … 0.038071 
0.475 0.05 0.972587 0.024007 0.10925 0.008472 … 0.015318 0.129428 0.024142 0.088664 0.008519 … 0.00449 
0.5 0.05 0.969347 0.024548 0.11038 0.005038 … 0.017564 0.137971 0.02681 0.088597 0.008859 … 0.005498 
64 
 
0.525 0.05 0.966401 0.024891 0.110188 0.002794 … 0.020918 0.146943 0.02857 0.088542 0.006826 … 0.007198 
0.55 0.05 0.966836 0.028126 0.108811 0.000613 … 0.01051 0.149875 0.027227 0.088635 0.004456 … 0.012186 
0.575 0.05 0.966973 0.028227 0.10726 0.003986 … 0.013008 0.158609 0.029987 0.087543 0.000869 … 0.00389 
0.6 0.05 0.965569 0.029169 0.105328 0.003263 … 0.018153 0.166594 0.031934 0.085398 0.002448 … 0.00871 
0.625 0.05 0.957648 0.029972 0.104673 0.005943 … 0.032564 0.176499 0.034761 0.08141 0.010427 … 0.027957 
0.65 0.05 0.963602 0.032881 0.098807 0.001169 … 0.010585 0.178506 0.0327 0.08098 0.003427 … 0.007792 
0.675 0.05 0.963209 0.032829 0.095346 0.002941 … 0.014029 0.186322 0.035105 0.077515 0.001437 … 0.006074 
0.7 0.05 0.962728 0.033755 0.091419 0.002379 … 0.020511 0.194738 0.037166 0.072876 0.002069 … 0.0085 
0.725 0.05 0.96337 0.040355 0.081756 0.006886 … 0.012393 0.196099 0.032954 0.072516 0.005732 … 0.024377 
0.75 0.05 0.961257 0.037233 0.079373 0.001229 … 0.010424 0.206733 0.038201 0.065095 0.002399 … 0.004188 
0.775 0.05 0.961126 0.03752 0.074665 0.002307 … 0.015657 0.214097 0.039919 0.059936 0.001122 … 0.00763 
0.8 0.05 0.952582 0.039044 0.072143 0.005129 … 0.038905 0.226221 0.043707 0.052131 0.005043 … 0.031682 
0.825 0.05 0.961391 0.042386 0.058223 0.001407 … 0.008969 0.226992 0.041149 0.049442 0.001868 … 0.009334 
0.85 0.05 0.959739 0.04177 0.052657 0.001027 … 0.010474 0.234647 0.043169 0.0428 0.001409 … 0.004281 
0.875 0.05 0.960432 0.042157 0.047833 0.001352 … 0.017018 0.241897 0.044768 0.0365 0.000963 … 0.007502 
0.9 0.05 0.96417 0.045747 0.036264 0.001139 … 0.011453 0.247093 0.044824 0.031485 0.000683 … 0.01241 
0.925 0.05 0.960393 0.04637 0.027416 0.000564 … 0.006491 0.255581 0.046764 0.023075 0.000766 … 0.002896 
0.95 0.05 0.959423 0.046296 0.022042 0.000482 … 0.010714 0.262032 0.04759 0.016808 0.000676 … 0.006539 
0.975 0.05 0.963705 0.046935 0.012189 0.000556 … 0.007296 0.262046 0.049494 0.017328 0.000908 … 0.013368 
1 0.05 0.95682 0.049832 0 0 … 0 0.278287 0.050428 0 0 … 0 
… … … … … … … … … … … … … … 
 
 
 
 
 
65 
 
Appendix C: CDRD prediction codes in Extremely Randomized Tree 
from sklearn import datasets # Create the environment 
from sklearn import metrics  
from sklearn.metrics import r2_score  
from sklearn.model_selection import train_test_split  
from sklearn.ensemble import ExtraTreesRegressor 
from sklearn.preprocessing import StandardScaler # Preprocessing function 
from numpy import *  
  
pipeline_test= pd.read_csv(r"C:\Users\admin\data\CDRD\CDRD.csv")  
x=pipeline_test.drop('CDRD',axis=1)  
y=pipeline_test['CDRD']  
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2) # f(x) = (x - means) / standard deviation 
scaler = StandardScaler()  
scaler.fit(x_train)  
x_train = scaler.transform(x_train) # Standardization 
x_test = scaler.transform(x_test)  
  
etr = ExtraTreesRegressor(n_estimators=10, max_features=200, # Build Extremely Randomized Tree model 
min_samples_split=10) 
etr.fit(x_train,y_train)  
y_predict = etr.predict(x_test)  
print(np.sqrt(metrics.mean_squared_error(y_test, y_predict))) # MAE 
print(svr.score(x_test, y_test)) # R2 Score 
 
 
66 
 
Appendix D: Part of the CDRD prediction codes in Gated Recurrent Unit 
xlsfile=pd.read_csv('CDRD TOTAL.csv',header=None) # Read data 
data=np.array(xlsfile)  
n=np.random.randint(0,data.shape[0],data.shape[0]) # Random array used to scramble the data set 
  
train_data = data[n[0:6500],2:] # Divide data features into training set, validation set and test set 
valid_data = data[n[6500:7500],2:]  
test_data = data[n[7500:],2:]  
  
train_label = data[n[0:6500],0:2] # Obtain the corresponding labels of the training set, validation set and test set, column 
0-1 in the table 
valid_label = data[n[6500:7500],0:2]  
test_label = data[n[7500:],0:2]  
  
tf.set_random_seed(100)  
  
num_epochs = 200 # Training related hyperparameters 
batch_size = 2  
alpha = 0.0001  
hidden_nodes = 40  
input_features = 222  
sequence_len = 1  
output_class = 2 # Regression 8 input with 4 output 
X = tf.placeholder("float", [None, sequence_len, input_features]) # Input placeholder 
Y = tf.placeholder("float", [None, sequence_len, output_class])  
  
67 
 
weights = { # Define weights, gaussian distribution 
    'out': tf.Variable(tf.random_normal([hidden_nodes, output_class])) 
} 
biases = {  
    'out': tf.Variable(tf.random_normal([output_class])) 
} 
 # Define the GRU network 
def GRU(x): # Reshape input tensor into batch x sequence length x # of features 
x = tf.reshape(x , [-1, sequence_len, input_features])  
gru_cell1 = tf.nn.rnn_cell.GRUCell(num_units=hidden_nodes) # 3 GRU with hidden number of nodes each layer 
gru_cell2 = tf.nn.rnn_cell.GRUCell(num_units=hidden_nodes)  
gru_cell3 = tf.nn.rnn_cell.GRUCell(num_units=hidden_nodes)  
  
gru_cell = tf.nn.rnn_cell.MultiRNNCell([gru_cell1, gru_cell2, gru_cell3]) # Stack of those layers 
  
init_state = gru_cell.zero_state(tf.shape(x)[0], dtype=tf.float32) # Initialize state 
  
outputs, _ = tf.nn.dynamic_rnn(gru_cell, x, dtype=tf.float32, initial_state=init_state) # Get the output of each state 
output_sequence = tf.matmul(tf.reshape(outputs, [-1, hidden_nodes]), weights['out']) + biases['out']  
  
return tf.reshape(output_sequence, [-1, sequence_len, output_class])  
68 
 
 
 
 
 
 
This page is intentionally left blank.