Blockchain technology is praised for its ability to lay the foundation for a transparent, tamper-resistant data ecosystem that far surpasses the capabilities of traditional systems, while also providing data backup mechanisms that are integral to any company that operates with large data volumes. Blockchain-enabled near real-time data backup, paired with database replication mechanisms, can raise the level of security, facilitate business and operational continuity during an attack, and enable complete data reconstruction in a disaster scenario.

Before delving into the topic at hand, it must be noted that various media channels and online publications often use the term “real-time backup” to describe blockchain’s backup capabilities. Blockchain is a type of Distributed Ledger Technology (DLT) that distributes information to every node that is part of the network. This way the system guarantees high levels of availability and transparency for the data it stores, ensuring that only one version of the data is true. Because this distribution takes time, blockchain’s backup capabilities are more accurately labeled near real-time: the information does not propagate instantaneously to every network participant.

Datafication is the transformation of day-to-day actions, which to most of us already seem commonplace and mundane, into raw data through digital interfaces. Although most of us do not realize it, the extensive process of datafication has transformed society on a fundamental level, ushering in a data-driven society where every decision, action and business strategy is shaped by the vast amount of data collected by companies and enterprises. As such, it is no surprise that data has become the most important asset a company can hold. To name a few examples, the insurance and banking domains rely heavily on data to determine an individual’s credit risk profile, while CRM companies rely on datafication to understand customer behavior and the best way to engage customers to receive a positive response.

Because we live in a society where data has become valuable, companies and enterprises have made it their priority to ensure data security. Current database systems employ a snapshot mechanism, where at predetermined intervals the entire database is automatically copied and stored in a separate secure environment. The problem is that this tried and tested method comes with a series of shortcomings, such as the need for large amounts of disk space to store the snapshots, a time gap between snapshots in which newly written data is vulnerable, and the dangers posed by centralized forms of organization.

What is near real-time backup

Near real-time backup, also known as continuous backup or continuous data protection, is a data security mechanism that automatically makes a copy of every change made to the data stored in a system. Near real-time backup works by capturing every version of the data stored within a given system, allowing administrators to restore data to any previous version if the need arises. The ability to restore data to any point in time can prove invaluable for companies that operate with large databases, as security-related issues and data breaches are a constant threat.

Near real-time data backup distinguishes itself from traditional backup mechanisms in that it allows users to roll back the data to any given point in time, enabling full restoration. At the opposite end of the spectrum, traditional backup mechanisms such as database snapshots are limited to the data as it existed when the backup was made. Another glaring difference between the two methods is related to the time frame in which they are performed. As the name implies, near real-time backup is continuous, taking place each time new data is added. Snapshots, although automated, rely on a pre-established schedule.
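
The difference between the two approaches can be illustrated with a minimal Python sketch. It is not tied to any particular product: the ChangeLog class, the snapshot_restore function, the integer timestamps and the sample records are all assumptions made for illustration. The journal records every change as it happens and can rebuild the state at any point in time, while a scheduled snapshot can only return the most recent copy taken before the requested moment.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ChangeLog:
    """Append-only journal of (timestamp, key, value) changes (hypothetical helper)."""

    entries: List[Tuple[int, str, Optional[Any]]] = field(default_factory=list)

    def record(self, timestamp: int, key: str, value: Optional[Any]) -> None:
        # Every write is journaled as it happens; a value of None marks a deletion.
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        self.entries.append((timestamp, key, value))

    def restore(self, as_of: int) -> Dict[str, Any]:
        # Replay the journal up to and including `as_of` to rebuild that state.
        state: Dict[str, Any] = {}
        for timestamp, key, value in self.entries:
            if timestamp > as_of:
                break
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state


def snapshot_restore(snapshots: Dict[int, Dict[str, Any]], as_of: int) -> Dict[str, Any]:
    # A scheduled snapshot can only return the latest copy taken at or before `as_of`.
    eligible = [taken_at for taken_at in snapshots if taken_at <= as_of]
    return dict(snapshots[max(eligible)]) if eligible else {}


log = ChangeLog()
log.record(1, "balance", 100)
log.record(5, "balance", 250)
log.record(7, "owner", "alice")
log.record(9, "balance", 0)  # for example, a mistaken or malicious write

# Assume snapshots are taken every 10 time units, at t=0 and t=10.
snapshots = {0: {}, 10: log.restore(10)}

state_before_bad_write = log.restore(8)        # continuous backup: exact state at t=8
state_from_snapshot = snapshot_restore(snapshots, 8)  # snapshot: only the t=0 copy
```

In this sketch, restoring to t=8 from the journal recovers the balance and owner as they were just before the bad write, whereas the snapshot approach can only fall back to the empty copy taken at t=0.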

The impact of near real-time backup

A data system that has access to near real-time data backup functionalities becomes more tolerant to external tampering from malicious actors in the sense that it has a higher chance to fully recover from a disaster scenario without any permanent damage to data records. As such, the benefits of near real-time data backup are more visible in data loss prevention and data restoration scenarios.

Data loss prevention

In a society where data has become the cornerstone that supports and fuels business interactions, data protection has cemented itself as a critical component that safeguards the interests of companies that operate and process large amounts of data. As the data pool of a company expands in size and scope, the challenges faced by Chief Information Security Officers also multiply. Success in a business venture is quantified by the growth of a company’s system. The problem is that larger systems are more complex and difficult to manage, presenting numerous entry points that can be exploited.

In this context, data loss prevention has emerged as a strategy focused on ensuring a company’s tolerance to data loss and data leaks. Data loss occurs when a company loses access to sensitive data, which may compromise operations. In general, data loss can be the result of a hardware failure, human error or malware attack. Regardless of the cause, data loss can have dire consequences for a company, which translate to financial losses, the creation of bottlenecks, and diminished client trust.

To reduce the incidence of data loss, companies usually enforce strict internal policies concerning data management, deploy third-party data access control software, and apply complex cryptographic algorithms that act as a last line of defense.

Data loss prevention solutions are implemented across the following areas:

  • storage based – implementing access permission mechanisms and encryption to protect data at rest
  • network-based – to protect data in motion, data filters, restrictions and encryption are implemented to make sure that sensitive data isn’t hijacked during transit
  • endpoint-based – monitoring data transfers through social media channels, data storing on unauthorized external storage, blocking activities that may pose a security threat through strict access policies and strong authentication mechanisms

Modex BCDB is a middleware software solution that can be used to supplement established data loss prevention mechanisms by facilitating blockchain-enabled near real-time data backup. Modex BCDB intervenes in the development stack by positioning itself between an existing software application and database system to add a blockchain layer that ensures confidentiality, integrity, and availability. Through Modex BCDB companies can move from a centralized model to a decentralized, distributed model, secured by complex encryption and hashing algorithms. In addition, the Modex solution comes with inbuilt customizable data access mechanisms and a separate blockchain authorization network that stores user credentials and passwords, to create an ecosystem inherently resistant to data loss.

Modex BCDB makes use of blockchain technology to extend the concept of data backup even further. A database connected to Modex BCDB benefits from a blockchain backend that stores the hash of every entry in the database (depending on system configuration). Due to the unique properties of hashing, if the information is modified in the database without authorization or by mistake, the system will compare the hash stored in the blockchain with the new one generated by the modified information, detect the difference and restore the data to its previous value.
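
The hash comparison described above can be sketched in a few lines of Python. This is a generic illustration, not the Modex BCDB implementation or its API: the GuardedStore class, the record_hash function and the sample records are hypothetical, SHA-256 is an assumed choice of hash function, and the sketch assumes the ledger keeps the last authorized version alongside its hash so that there is something to restore from.

```python
import copy
import hashlib
import json
from typing import Any, Dict, List


def record_hash(record: Dict[str, Any]) -> str:
    # Canonical JSON so that identical content always produces the same digest.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class GuardedStore:
    """Toy stand-in for a database paired with a ledger of hashes (hypothetical)."""

    def __init__(self) -> None:
        self.database: Dict[str, Dict[str, Any]] = {}
        # Assumption: the ledger keeps each authorized version next to its hash.
        self.ledger: Dict[str, List[Dict[str, Any]]] = {}

    def write(self, key: str, record: Dict[str, Any]) -> None:
        # Authorized path: update the database and append a new ledger version.
        self.database[key] = copy.deepcopy(record)
        self.ledger.setdefault(key, []).append(
            {"hash": record_hash(record), "record": copy.deepcopy(record)}
        )

    def verify_and_repair(self) -> List[str]:
        # Recompute each hash, compare it with the ledger, and restore on mismatch.
        repaired: List[str] = []
        for key, versions in self.ledger.items():
            latest = versions[-1]
            current = self.database.get(key)
            if current is None or record_hash(current) != latest["hash"]:
                self.database[key] = copy.deepcopy(latest["record"])
                repaired.append(key)
        return repaired


store = GuardedStore()
store.write("acct-1", {"owner": "alice", "balance": 100})
store.write("acct-2", {"owner": "bob", "balance": 40})

# Simulate a modification made directly in the database, bypassing write().
store.database["acct-1"]["balance"] = 999999

repaired_keys = store.verify_and_repair()
```

Because even a one-character change produces a different digest, the tampered entry fails the comparison and is rolled back to the last authorized version, while the untouched entry is left alone.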

Restoring data with recovery

Despite the growing reliability of digital storage media and devices, data loss is still a common occurrence. Losing personal information can be quite an inconvenience for people in general, but the situation escalates considerably in an enterprise context where high volumes of sensitive data and critical information are analyzed and processed on a daily basis. This is another scenario where near real-time data backup can make a difference.

In general, data loss can be the result of human error, malicious attacks, power outages, software malfunctions or hardware failure. Regardless of its point of origin, losing sensitive data can put a significant strain on business operations, decrease customer trust and attract legal sanctions for breaching data protection regulations like GDPR, HIPAA, and PCI DSS.

In the case of database corruption or loss, it is the job of the database administrator (DBA) to restore the database and recover all the information. In SQL Server, the DBA can restore a database with the RECOVERY or NORECOVERY option. If the DBA is restoring a database using multiple backup files, the NORECOVERY option is used for each restore except the last. This puts the database into a restoring state in which additional backups can be restored, but in this state the database cannot be accessed by users. If the DBA issues a RESTORE DATABASE or RESTORE LOG command without specifying either option, RECOVERY is applied by default. If the database is in a restoring state and the DBA does not wish to restore additional backups, they can issue a RESTORE DATABASE ... WITH RECOVERY command to bring the database online and make it available to users.
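
As a rough illustration of that ordering rule, the Python sketch below generates the T-SQL statements for a restore that uses one full backup followed by several log backups. The helper names, the database name and the backup file paths are hypothetical; the script only builds the statements as strings and does not connect to a server.

```python
from typing import List


def _quote_name(database: str) -> str:
    # Escape closing brackets so the name is safe inside [ ] delimiters.
    return "[" + database.replace("]", "]]") + "]"


def build_restore_sequence(database: str, full_backup: str, log_backups: List[str]) -> List[str]:
    """Return T-SQL statements restoring a full backup followed by log backups."""
    files = [("DATABASE", full_backup)] + [("LOG", path) for path in log_backups]
    statements: List[str] = []
    for index, (kind, path) in enumerate(files):
        # Every restore except the last leaves the database in the restoring state.
        mode = "RECOVERY" if index == len(files) - 1 else "NORECOVERY"
        escaped_path = path.replace("'", "''")
        statements.append(
            f"RESTORE {kind} {_quote_name(database)} "
            f"FROM DISK = N'{escaped_path}' WITH {mode};"
        )
    return statements


def bring_online(database: str) -> str:
    # For a database left in the restoring state when no more backups will be applied.
    return f"RESTORE DATABASE {_quote_name(database)} WITH RECOVERY;"


# Hypothetical database name and backup paths, for illustration only.
statements = build_restore_sequence(
    "SalesDb",
    r"D:\backups\salesdb_full.bak",
    [r"D:\backups\salesdb_log1.trn", r"D:\backups\salesdb_log2.trn"],
)
for statement in statements:
    print(statement)
print(bring_online("SalesDb"))
```

Only the final statement in the generated sequence carries WITH RECOVERY, which matches the rule that the database stays inaccessible until the last backup has been applied.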

As a hybrid software solution that enhances the capabilities of traditional database systems through a blockchain engine, Modex BCDB improves traditional database recovery mechanisms. By distributing a centralized database over a peer to peer node network, client systems become more resilient as their data can be reconstructed from multiple partial nodes or from a full node that acts as a backup.

Database replication

In distributed database systems, replication is a process employed to create and maintain multiple copies of a database, or to integrate database objects between different databases. Enterprises rely on this mechanism to improve the availability and performance of their database systems by offloading data entries from their primary database to a secondary system for analytic purposes, and to facilitate distributed data processing across various locations. Replication can serve a myriad of purposes, including but not limited to migrating data from a legacy system to a new IT infrastructure, creating a sandbox environment for testing operations, consolidating data from multiple sites, increasing data availability across multiple remote offices, and strengthening security with an additional backup in case of a failover.

Companies and businesses utilize database replication to unlock a series of benefits:

  • streamline performance and increase the availability of company data
  • facilitate data access to satellite offices
  • enhance the scalability of data-centered server-side applications
  • ensure data redundancy for failover purposes

Types of database replication

Currently, the market is flooded with a multitude of database systems, each of them designed to improve upon certain operational aspects and answer different storage needs depending on a company’s business model. Regardless of this aspect, database replication has established itself as a balancing act between data consistency and system performance.

Due to this paradigm, three general types of replication have emerged as a standard in the industry (keep in mind that this is not an exhaustive list, as there are numerous replication methods):

  • Snapshot replication – as the name implies, this type of replication takes a snapshot of a database (publisher) and moves it to a different server or database system (subscriber). After the initial snapshot, the data is refreshed based on an established schedule. This replication method is usually used in systems where data changes are sparse. Although it is the easiest type of replication to maintain, snapshot replication is considerably slower than other replication methods as it copies all the data whenever the table is refreshed.
  • Transactional replication – is a type of database replication where the subscriber initially receives a snapshot of the publisher. After the initial copy is received, the subscriber is updated in near real-time as changes occur in the publisher. The advantage of transactional replication over snapshot replication is that it guarantees transactional consistency. This is because the system accurately replicates each change in the publisher.
  • Merge replication – in this replication method data from multiple databases is combined into a central database. Similar to transactional replication, an initial snapshot is synchronized to the subscriber databases. The difference between the two methods stems from the fact that in merge replication the subscriber and the publisher can independently make changes to the database, even if the subscribers aren’t connected to the network. When the subscribers reconnect to the network, the system combines all the changes made to the data and replicates them on the publisher. Usually employed in server-to-client environments, merge replication is considered one of the most complex database replication types.
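
The trade-off between the first two methods can be made concrete with a small Python sketch. It models the publisher and subscriber as plain dictionaries and counts how many rows each method has to transfer; the function names and the sample data are assumptions made for illustration, and merge replication is left out because its conflict resolution rules are system-specific.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, str]  # (row key, new value)


def snapshot_refresh(publisher: Dict[str, str]) -> Tuple[Dict[str, str], int]:
    # Snapshot replication copies the whole data set on every refresh.
    return dict(publisher), len(publisher)


def transactional_apply(subscriber: Dict[str, str], changes: List[Change]) -> int:
    # Transactional replication ships only the changes, in commit order.
    for key, value in changes:
        subscriber[key] = value
    return len(changes)


# Publisher with 1000 rows; the subscriber starts from an initial snapshot.
publisher = {f"row-{i}": "v1" for i in range(1000)}
subscriber, initial_rows = snapshot_refresh(publisher)

# Two rows change on the publisher.
changes: List[Change] = [("row-3", "v2"), ("row-7", "v2")]
for key, value in changes:
    publisher[key] = value

rows_sent_transactional = transactional_apply(subscriber, changes)
_, rows_sent_snapshot = snapshot_refresh(publisher)
```

In this toy setup, bringing the subscriber up to date costs two rows with the transactional approach and the full 1000 rows with a snapshot refresh, which is why snapshot replication suits data that changes rarely.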

Enhancing database replication with Modex BCDB

Database replication is a process through which a central database system is replicated to a secondary database often referred to as a subscriber. The goal of this procedure is to enhance the performance of existing systems, increase the availability of data to satellite offices as well as to provide continuity to business operations in case of a failover scenario where the primary database malfunctions or is compromised by an attack. Although the majority of database engines on the market come with an embedded replication mechanism, more often than not they are characterized by an unoptimized, rigid toolset that hinders performance. Because of this, a large segment of database administrators is scouting the market for third party database replication mechanisms.

Modex BCDB is a hybrid solution that fuses the advantages of traditional database systems with a blockchain layer. Database replication in Modex BCDB challenges the established dogma through its agnostic take on both database and blockchain engines. This feature enables Modex BCDB to remove a major barrier in database replication operations – replication across different database systems. As such, a MongoDB database can be successfully replicated on an Oracle or Elasticsearch database, without compromising data structures or impacting data consistency and overall performance. Furthermore, Modex BCDB removes the notion of subscriber databases, which are usually relegated to read operations. Due to its blockchain backend, inserts made into a database through the Modex BCDB API are automatically replicated in near real-time across every database in the network.
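
One way to picture engine-agnostic replication is a single write path that appends each insert to an ordered ledger and then hands it to an adapter per database engine. The Python sketch below is a conceptual illustration only and does not reflect the actual Modex BCDB API: the ReplicatingApi, DocumentStore and TableStore classes are hypothetical in-memory stand-ins for a document database and a relational table.

```python
from typing import Any, Dict, List, Protocol, Sequence, Tuple


class StoreAdapter(Protocol):
    """Interface every engine-specific adapter is assumed to implement."""

    def apply_insert(self, record_id: str, fields: Dict[str, Any]) -> None: ...


class DocumentStore:
    """Stand-in for a document database: records are kept as whole documents."""

    def __init__(self) -> None:
        self.documents: Dict[str, Dict[str, Any]] = {}

    def apply_insert(self, record_id: str, fields: Dict[str, Any]) -> None:
        self.documents[record_id] = dict(fields)


class TableStore:
    """Stand-in for a relational table: records become rows with fixed columns."""

    def __init__(self, columns: Sequence[str]) -> None:
        self.columns = list(columns)
        self.rows: Dict[str, Tuple[Any, ...]] = {}

    def apply_insert(self, record_id: str, fields: Dict[str, Any]) -> None:
        self.rows[record_id] = tuple(fields.get(column) for column in self.columns)


class ReplicatingApi:
    """Single write path: log the insert, then replicate it to every node."""

    def __init__(self, nodes: Sequence[StoreAdapter]) -> None:
        self.nodes = list(nodes)
        self.ledger: List[Tuple[str, Dict[str, Any]]] = []

    def insert(self, record_id: str, fields: Dict[str, Any]) -> None:
        self.ledger.append((record_id, dict(fields)))  # ordered, append-only
        for node in self.nodes:  # every node is treated equally
            node.apply_insert(record_id, fields)


docs = DocumentStore()
table = TableStore(columns=["name", "age"])
api = ReplicatingApi([docs, table])
api.insert("u1", {"name": "Ada", "age": 36})
```

Each adapter translates the same logical insert into its own storage shape, so the document store and the table end up holding the same record without one being a read-only copy of the other.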

In traditional database replication mechanisms, any modification made to a database is automatically replicated across subscriber databases. This can constitute a major security issue if an external party gets access to the database, as any malicious tampering will be replicated across the whole network. Database replication in Modex BCDB can be performed only through the Modex API which acts as a gatekeeper for the information stored in databases. Due to the nature of blockchain technology, any modifications made directly in a database system are discarded and reconstructed through the record versioning functionality. As such, any modification that isn’t performed through the Modex API will be treated by the system as a mistake or a potential attack.

Multi-database replication is a functionality highly sought after by database administrators who need to operate and maintain different database systems. This type of database replication brings advantages on the business side, as it reduces the time and, subsequently, the costs involved in migrating data multiple times, and on the development side, as database administrators are no longer required to initiate multiple replication operations. Modex BCDB facilitates multi-database replication, regardless of the database engines involved, due to its agnostic take on this technology. This feature is further strengthened by the fact that in the Modex BCDB ecosystem, nodes no longer follow the publisher-subscriber relation (also known as the master-slave relation), as each node is treated equally in the system.

In a data-driven society that needs to constantly be on the lookout for new emerging cybersecurity threats, near real-time backup and database replication should become deeply ingrained in the security-related practices of companies that operate with large data pools.

Although backups and replication differ from each other in terms of functionality and purpose, they are both procedures that help companies and enterprises safeguard against data threats. As such, they should be envisioned as data security measures that complement each other. The problem is that in traditional systems, it is quite expensive to implement both mechanisms. With Modex BCDB, this is not the case because the blockchain component is purposely designed to facilitate near real-time data backup and enhanced database replication.