Journal Search Engine
Search Advanced Search Adode Reader(link)
Download PDF Export Citaion korean bibliography PMC previewer
ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.16 No.3 pp.415-419
DOI : https://doi.org/10.7232/iems.2017.16.3.415

Economic Operation of a Data Backup Scheme for Computer Systems under Disastrous Information Threats: A Queueing Approach

Won Seok Yang, Nam K. Kim*
Department of Business Administration, Hannam University, Daejeon, Korea
Department of Industrial Engineering, Chonnam National University, Gwangju, Korea
Corresponding Author, freedom@jnu.ac.kr
March 3, 2017 May 1, 2017 May 2, 2017

ABSTRACT

In this paper, we deal with the economic operation of a data backup scheme for various computer systems under disastrous information threats. To this end, we introduce a queueing model that combines disasters and two-stage queues. Making use of the matrix-geometric approach, we obtain its steady-state probabilities. From these results, we compute the average total cost with a recovery cost and a backup cost taken into account. With sample numerical results, we investigate the optimal backup operation that minimizes the average total cost.


초록


    1.INTRODUCTION

    Recently, information security threats, including distributed denial-of-service (DDoS) and malware, such as computer viruses, Trojan horses, spyware, and worms, have increased rapidly and caused serious damages to computer and communications systems. Several studies have analyzed the impact of information threats on system performance using rigorous mathematical models such as queueing models: Considering the SYN flooding attacks, Wang et al. (2007) take a queueing approach to analyze DDoS attacks against computer systems. In addition, Wang et al. (2010) deal with email systems under attacks against email accounts, such as password cracking, malicious emails and email bombs. They use a multiple queueing model to study the performance metrics of system security. Lee and Yang (2013) consider wireless sensor networks having unreliable network connections caused by external attacks and study the effective operation of the power consumption. Yu et al. (2014) apply the M/M/m queueing model to a resource allocation problem arising in a cloud environment vulnerable to DDoS attacks. Kondakci (2015) summarizes existing studies that have taken stochastic approaches to analyze the reliability of network security.

    In this paper, based on a queueing analysis, we deal with the economic operation of a data backup scheme for computer systems under disastrous information threats. Damages from such information threats include data loss, which may be the most disastrous one to individuals and organizations. They suffer from computer viruses, for example, that wipes out all the files from their computers. One of the countermeasures to protect their data in a computer is to back them up to an external hardware. Among various countermeasures, such as anti-viruses and firewalls, data backup is a basic method to ensure infor- mation security against various threats.

    Now, let us consider the following data backup scenario that frequently arises in computer systems dealing with files, data, and computer programs: A user stores her work in her computer, every time she completes her work. As soon as the amount of her completed work (the number of completed files, for example) is added up to a certain point, all her work is (all her files are) backed up in a batch to external hardware or a cloud server. This data backup process can be modeled by the two-stage queueing model (Yang et al., 2016). In this model, the individual work corresponds to an individual service of the first stage and the batch work in backup is a batch service of the second stage. In addition to this model, we consider disasters in this paper to describe disastrous information threats. We assume that disasters occur stochastically to the system and wipe out all the work in the first stage as well as the second stage. The queueing model we consider in this paper, as a result, is a combination of the twostage queueing model and the disaster model.

    In the literature, the external causes that flush out all the customers in queueing systems are called disasters and have been extensively researched (see, Lee et al., 2011; Lee and Yang, 2013). Disasters are also called queue flushing (Towsley and Tripathi, 1991), catastrophes (Chao, 1995; Kumar and Arivudainambi, 2000), mass exodus (Chen and Renshaw, 1997), and stochastic clearing systems (Artalejo and Gomez-Corral, 1998). In this paper, by disasters we mean all kinds of attacks that flush out all the customers in the system.

    In order to analyze the proposed model mathematically, we use the continuous-time Markov chain. We make use of the matrix-geometric approach (Neuts, 1981), in particular, to obtain the steady-state probabilities. In addition, we investigate the cost-effective operation, in which the optimal backup size is determined, with two types of costs taken into account: a recovery cost and a backup cost (see Section 4 for their definitions). Sample numerical examples are presented as well.

    The contribution of this paper is threefold: First, we propose a mathematical model that describes the data backup process arising in computer systems under disastrous information threats. Second, we introduce a queueing model that is an interesting combination of the two-stage queueing model and the disaster queueing model. Third, we provide a cost analysis for the economic operation of the data backup process, which would be useful for security management. To the best of our knowledge, this paper is the first attempt to study, based on queueing analysis, the data backup process under disastrous threats and analyze its economic operation.

    The rest of this paper is organized as follows: We describe the queueing model in detail in Section 2 and mathematically analyze it in Section 3. Then we present the economic analysis together with numerical examples in Section 4. Finally, we conclude the paper in Section 5.

    2.MODEL DESCRIPTION

    The two-stage queueing model that we analyze in this paper is described as follows: Jobs arrive at the system according to a Poisson process with rate λ. Jobs represent works, such as files or data requiring computer processing, which are assumed to be processed by the single server. In the first stage of this model, the server processes jobs individually on a first-come first-served (FCFS) basis. The processing time is assumed to be exponentially distributed with rate μ. After processing a job, the server passes the completed job to the second stage. Whenever the number of jobs in the second stage reaches a certain threshold, denoted by K, the server immediately moves to the second stage and processes all these K jobs simultaneously (that is, the server serves them in a batch of size K). Its processing time is assumed to be exponentially distributed with rate α. After finishing a batch service in the second stage, the server moves back to the first stage. In addition, we assume that disasters occur to the system according to a Poisson process with rate δ. Whenever disasters occur, all the jobs in the system, i.e., those in the first as well as in the second stage, are wiped out. Finally, it is assumed that the arrival processes of disasters as well as jobs, and the service times of jobs at the first as well as the second stages are independent of each other.

    3.QUEUEING ANALYSIS

    Let (n, m) denote the system state where n and m represent the numbers of jobs (including those in process) in the first and the second stages, respectively, for n = 0,1, 2, … and m = 0,1, …, K. Then, the stochastic process of the system state constitute a continuous-time Markov chain, defined by the transition rate matrix Q; arranging its states in the lexicographic order, we have(1)

    Q = ( B 0 A 0 0 0 0 B 1 A 1 A 0 0 0 C A 2 A 1 A 0 0 C 0 A 2 A 1 A 0 )
    (1)

    The matrices in Q are all square matrices of order (K +1) and they are given as follows: First, the matrix A0 is given by A0 = λI, where I is an identity matrix of order (K +1). Next, the elements of A1 are given by

    • (A1)i, i = − (λ + μ + δ) for i = 1, …, K,

    • (A1)K+1, K+1 = − (λ + μ + δ),

    • (A1)K+1,1 = a

    The elements of A2 are given by

    • (A2)i,i+1 = μ for

    • i = 1, …, K.

    The elements of B0 are given by

    • (B0)i, j = δ for

    • i = 2, …, K,

    • (B0)K+1, 1 = δ + α,

    • (B0)1, 1 =λ

    • (B0)i, j = − (λ + δ) for

    • i = 2, …, K,

    • (B0)K+1, K+1 = -(λ + δ + α) =

    The elements of B1 are given by

    • (B1)i, 1 = δ for i =1,⋯, K +1,

    • (B1)i,j+ 1 = μ for i =1,⋯, K.

    Finally, the elements of C are given by

    • (C)i, 1 = δ for

    • i = 1, …, K +1.

    All the other elements of A1, A2, B0, B1 and C that are not specified as given above, are set to be zero.

    Now, let x(n, m) denote the steady-state probability of the continuous-time Markov chain with its steady-state probability vector x n = ( x ( n , 0 ) , , x ( n , k ) ) for n = 0,1, 2, …. Note that with δ > 0 , the system is ensured to be stable, so that the steady-state probabilities exist for any set of the system parameters. Then, making use of the matrix-geometric approach (Neuts, 1981), the steadystate probabilities can be obtained as follows: First, we have

    x n = x 0 R n ,  for  n = 1 , ​  2 ,
    (2)

    The matrix R in (2) is a square matrix of order (K + 1) and satisfies the equation R 2 A 2 + R A 1 + A 0 = 0 . As a result, it can be computed iteratively as follows:

    R 0 = λ A 1 1 , R k = λ ( A 1 + R k 1 A 2 ) 1

    for k =1, 2, ….

    Finally, x0 is obtained by solving the following equations simultaneously:

    X 0 ( B 0 + R B 1 + R 2 ( I R ) 1 C ) = 0 and X 0 ( I R ) 1 e=1

    where e is a column vector with all of their (K +1) elements being 1.

    Let N1, N2, and N denote the number of jobs in the first stage, in the second stage, and in the entire system in steady state, respectively. Then, the expected value of N1, N2, and N are obtained as follows:(3)(4)(5)

    E [ N 1 ] = n = 0 m = 0 K x ( n , m ) n = n = 0 | n x n e = x 0 R ( I R ) 2 e
    (3)

    E [ N 2 ] = n = 0 m = 0 K x ( n , m ) m n = n = 0 x n g = x 0 ( I R ) 1 g
    (4)

    E [ N ] = E [ N 1 ] + E [ N 2 ] = x 0 R ( I R ) 2 e+ x 0 R ( I R ) 1 g
    (5)

    where

    • g = (0,1, 2, …, K) .

    4.ECONOMIC ANALYSIS

    We consider two types of costs in economic analysis: a recovery cost and a backup cost. The recovery cost represents the cost for recovering the jobs destroyed by disasters, and it is assumed to be charged to each of those destroyed jobs.

    The backup cost, on the other hand, is assumed to consist of a fixed cost and a variable cost. The fixed backup cost represents the cost that is assumed to be incurred after each backup. The variable backup cost, in contrast, is assumed to be charged to each job that has been backed up. This variable backup cost represents the cost for purchasing an external hardware, renting a cloud facility, transmitting data to back-up facility, and so on.

    In this section, we consider the optimal value of K that minimizes the average total cost with both the recovery cost and the backup cost taken into account. To this end, we first define a T-cycle, in which a duration between two successive arrival points of disasters is denoted by T. Note that this T-cycle with E[T] = 1/δ is regenerative because the Markov chain starts over at the state (0, 0) upon every arrival of disasters. Note also that since the arrival process of disasters is Poisson, they see the time average of the number of jobs in the system (PASTA (Wolff, 1982)). Applying the renewal reward theorem (Wolff, 1989), we have the average number of jobs destroyed by disasters per unit time, denoted by L, as follows:(6)

    L = E [ #  of jobs lost in queue during  T ] E [ T ] = E [ N ] E [ T ] = δ x 0 R ( I R ) 2 e + δ x 0 ( I R ) 1 g
    (6)

    Let cR denote the recovery cost (charged to each job that is destroyed) and CR (δ, K) denote the average recovery cost per unit time. Note that CR (δ, K) is dependent, particularly on the disaster arrival rate δ and the batch size K, among others. Then, we have

    C R ( δ , K ) = c R E [ N ] / E [ T ] = c R L
    (7)

    In order to get the backup cost, we first let π = (π0, π1, …, πK), where πj for j = 0,1, …, K represents the steady-state probability of the number of jobs in the second stage. Then, from π j = i = 0 x ( i , j ) we have(8)

    π = x 0 ( I R ) 1
    (8)

    Next, we let UD denote the expected number of (successful) backups per unit time. Then it is given by

    U d = α π K
    (9)

    (When a disaster arrives during the backup, recall that all the jobs in the middle of backup are assumed to be wiped out. Such a failed backup is not counted in calculation of UD in (9).)

    Now, we let cF and cV denote the fixed backup cost (incurred after each backup) and the variable backup cost (charged to each job that has been backed up), respectively. Also, we let CB(δ, K) denote the average backup cost per unit time. Since exactly K jobs are backed up at every backup, we have

    C B ( δ , K ) = c F U D + c V U D K
    (10)

    Next, we let C(δ, K) denote the average total cost with the given δ and K. Then, combining (7) and (10), we have(11)

    C B ( δ , K ) = c R δ ( x 0 R ( I R ) 2 e   +   x 0 ( I R ) 1 g ) + ( c F + c V K ) α π K
    (11)

    Finally, the optimal value of K can be obtained from the following optimization problem:

    min    C ( δ , K ) = c R δ ( x 0 R ( I R ) 2 e + x 0 ( I R ) 1 g ) + ( c F + c V K ) α π K
    (12)

    s u b j e c t t o      K < K max

    Now, we investigate the characteristics of the cost minimization in (12) with sample numerical examples. To this end, we use the following parameters: λ = 0.99, μ =1, α = 5, 1, cR = 1, cF = and cV =0.1. In addition, we consider five arrival rates of disasters, varying from δ = 0.001 to δ = 0.005.

    Figure 1 gives the average total cost over the various backup sizes (K), in which the symbol “+” for each δ indicates the optimal backup size that minimizes the average total cost. From Figure 1, we have the following observations: First, it shows that the average total cost has a convex shape, such that an optimal backup size exists. Second, it shows that the optimal backup size increases, as disasters get less frequent. Third, it shows that as the disaster arrival rate gets higher, the economic batch size saves a considerable amount of the total cost (the convexity gets sharper). This saving decreases as disasters get less frequent.

    Finally, let us consider a situation, where disaster arrival rate can be controlled as well. We consider such a situation, in which information security is reinforced, so that the disaster rate decreases from δ1 to δ2, where δ1 > δ2. Then the average cost saving per unit time due to this reinforcement is given by C(δ1, K) − C(δ2, K)

    5.CONCLUSION

    In this paper, we present a two-stage queueing model with disasters that describes the data backup process arising in computer systems under disastrous information threats. Making use of the matrix-geometric approach (Neuts, 1981), we obtain its steady-state probabilities. From these results, we compute the average total cost with a recovery cost and a backup cost taken into account. With sample numerical results, we also investigate the optimal backup operation that minimizes the average total cost.

    Recently, the damages caused by information security threats have increased rapidly with the explosive growth of data in the era of big data. Therefore, much attention has to be paid to the reinforcement of information security, which requires substantial investment in the workforce and equipment. We hope that the queueing model and its analysis presented in this paper would be beneficial to the information security investment and management for various computer systems that are vulnerable to disastrous information threats.

    ACKNOWLEDGEMENT

    We sincerely express our gratitude to one of anonymous reviewers for his/her careful reading and valuable suggestions. This paper was supported by the 2017 Hannam University Research Fund.

    Figure

    IEMS-16-415_F1.gif

    The average total cost.

    Table

    REFERENCES

    1. Artalejo J.R. , Gomez-Corral A. (1998) Analysis of a stochastic clearing system with repeated attempts , Commun. Stat. Stoch. Models, Vol.14 (3) ; pp.623-645
    2. Chao X. (1995) A queueing network model with catastrophes and product form solution , Oper. Res. Lett, Vol.18 (2) ; pp.75-79
    3. Chen A. , Renshaw E. (1997) The M/M/1 queue with mass exodus and mass arrivals when empty , J. Appl. Probab, Vol.34 (1) ; pp.192-207
    4. Kondakci S. (2015) Analysis of information security reliability: A tutorial , Reliab. Eng. Syst. Saf, Vol.133 ; pp.275-299
    5. Kumar B.K. , Arivudainambi D. (2000) Transient solution of an M/M/1 queue with catastrophes , Comput. Math. Appl, Vol.40 (10-11) ; pp.1233-1240
    6. Lee D.H. , Yang W.S. (2013) The N-policy of a discrete time Geo/G/1 queue with disasters and its application to wireless sensor networks , Appl. Math. Model, Vol.37 (23) ; pp.9722-9731
    7. Lee D.H. , Yang W.S. , Park H.M. (2011) Geo/ G/1 queues with disasters and general repair times , Appl. Math. Model, Vol.35 (4) ; pp.1561-1570
    8. Neuts M.F. (1981) Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach, The Johns Hopkins University Press,
    9. Towsley D. , Tripathi S.K. (1991) A single server priority queue with server failures and queue flushing , Oper. Res. Lett, Vol.10 (6) ; pp.353-362
    10. Wang Y. , Lin C. , Li Q.L. (2010) Performance analysis of email systems under three types of attacks , Perform. Eval, Vol.67 (6) ; pp.485-499
    11. Wang Y. , Lin C. , Li Q.L. , Fang Y. (2007) A queueing analysis for the denial of service (DoS) attacks in computer networks , Comput. Netw, Vol.51 (12) ; pp.3564-3573
    12. Wolff R.W. (1982) Poisson arrivals see time averages , Oper. Res, Vol.30 (2) ; pp.223-231
    13. Wolff R.W. (1989) Stochastic modeling and the theory of queues, Prentice Hall,
    14. Yang W.S. , Kim T.S. , Park H.M. , Lim D.E. (2016) Analysis of a two-stage queue with a single server and N-policy , Am. J. Math. Manage. Sci, Vol.35 (3) ; pp.261-270
    15. Yu S. , Tian Y. , Guo S. , Wu D.O. (2014) Can we beat DDoS attacks in clouds? , IEEE Trans. Parallel Distrib. Syst, Vol.25 ; pp.2245-2254