DATABASE

Failures and their types in distributed database

This article will be all about the failures and their types in Distributed Database Systems. The article will also discuss the cause of failure and possible failure fixes in short.


    BY RABINDRA THAPA
    20 OCT 2021 • 2 MIN READ
Failures in distributed database

 

Failure of distributed database is the situation when distributed databases do not work as they were intended. It makes the distributed database not so reliable to work with.

 

To design a reliable distributed system that can recover from failures, we need first to identify the types of failures with which the system has to deal. The reasons for these failures can be traced back due to both hardware and software issues.

 


 

Mainly, distributed database system has four types of failures: 

 

  1. Transaction failures
  2. System failures
  3. Media failures
  4. Communication failures

 

Chain of Events Leading to System Failure

Chain of Events Leading to System Failure

 


 

Transaction failures

 

When instructions on a transaction are not committed in a distributed database, we can say that the transaction has failed.

 

Reasons why transactions fail? 

 

 

The usual approach to take in cases of transaction failure is to abort the transaction.

This resets the database to a state prior to the start of a transaction.

 


 

System failures

 

System failures are also known as site failures. It can occur due to hardware or software failures or even both in some cases.

 

System failures are always assumed to result in the loss of main memory contents.

 

Any part of the database that was in the main memory buffer is lost as a result of a system failure. However, the database that is stored in secondary storage is assumed to be safe and correct. 

 

There are two types of system failures :

 

  1. Total failure:  failure of all sites in the distributed database system
  2. Partial failure: failure of only some sites (others remain operational) 

 


 

Media Failures

 

Media failures are also known as disk failures. It refers to failures of the secondary storage devices that store the database. 

 

Reasons why Media failures occur? 

 

Media failures assume all or part of the database that is on the secondary storage is considered to be destroyed and inaccessible. 

 

How to solve Media failures? 

 


 

Communication Failures 

 

The three types of failures described above are common to both centralized and distributed DBMSs. But Communication failures are unique to the distributed databases. 

 

The most common types of communication failures are:

 

Handling errors and improperly ordered messages are the responsibility of the computer network.

 

Lost or undeliverable messages are typically the consequence of communication line failures or (destination) site failures. 

 

If a communication line fails, in addition to losing the message(s) in transit, it may also divide the network into two or more disjoint groups. This is called network partitioning. If the network is partitioned, the sites in each partition may continue to operate.

 

We need to detect the undelivered messages and use reliability protocols to ensure the messages are passed successfully. 


Tags


database tech distributed database failures


Comments


warning   You need to Sign Up to Comment

Related Blogs

See all    
Machine Learning Project: Heart Disease Prediction System (HDPS) using Logistic Regression from Scratch
Free Hosting(VPS) in Microsoft Azure using Azure for students