Backup Storage Differences
Storage and backup are closely related concepts, but they are very different. Without a technical background it is hard to tell them apart, and since the rise of the cloud the two are often conflated. So what are the differences between backup and storage?
Backup cannot exist independently of a data container and is always built on storage
Storage is the general term for containers that hold data: floppy disks, optical discs, magnetic disks, disk arrays, NAS devices for small and medium-sized businesses, professional tape libraries, and professional Fibre Channel storage area networks (SANs). Capacities range from a few MB to 100 TB and even the petabyte level.
In recent years a new option has emerged: cloud storage, which is divided into personal and enterprise use. Personal cloud storage, such as Baidu Netdisk, 360 Netdisk, and Dropbox, is typically used to save personal information, pictures, documents, and so on. Enterprise cloud storage, such as AWS S3, Alibaba Cloud OSS, Qiniu Cloud Storage, Paiyun, and Kingsoft Cloud's cloud storage, is usually used in key business systems, for example to store documents, pictures, videos, and other user-generated data. One advantage of cloud storage is that capacity can be expanded dynamically. Platform providers use low-cost hard disks and distributed technology to pool those cheap disks into a storage solution with higher reliability. On a large platform, economies of scale spread the cost across many users, so the total cost of ownership (TCO) is much lower. This is where traditional storage solutions are weak.
Backup is a data protection mechanism and solution, and its implementation always depends on a specific storage container. The backup market has many brands, such as Symantec's NBU, CommVault's backup products, IBM's TSM, EMC's NetWorker, and Multi-Backup, which focuses on hybrid cloud data backup and protection services. Backups are usually used to protect the core data generated by business systems or important personal data. A general-purpose backup system is usually combined with hardware storage devices to form a complete backup solution. Multi-Backup is currently built on Alibaba Cloud storage, AWS S3, Qiniu, Kingsoft Cloud, Baidu Cloud, and other cloud storage services, and all of its backup data is stored on these large storage platforms.
Storage usually solves the problem of access across geographic locations, while backup solves the problem of redundant storage across geographic locations
Take Microsoft Word, which many of us use at work: without a storage medium, the documents we edit cannot be saved. With an IDE or SATA hard drive installed, the data generated by the application can be saved to the drive quickly. This is a simple example of hard drive storage supporting the work of software. If the local hard drive fails, a week of hard work may be lost and you have to start over.
When designing the architecture of an important business system, we plan the storage solution carefully: what kind of business system it is, how many locations it runs in, how the data is distributed, the required capacity, the expansion requirements, and so on. The focus is on handling the system's growing volume of data. The storage architecture is generally deployed close to the business application servers. Whether it is cloud storage or a traditional storage architecture, the goal is the same: to give business systems in different locations stable, continuous access to their storage.
Data kept in only one place is never reliable: the machine room can lose power, a line can fail, hardware can break, or a fire can break out. For important business systems such as payment systems, keeping the business continuously accessible requires taking many factors into account. Mirroring data to each location already has something of a backup flavor. In terms of geographic redundancy, cloud storage generally also keeps redundant copies of data across regions to guard against loss in a disaster.
On this basis, backup encapsulates further logic and can apply different replication strategies to data in different places. Moderately important data, such as user-generated logs and pictures, can usually be kept redundantly in one location. More critical data, such as user registration data, storage index data, transaction data, and data related to financial systems, may be replicated to additional locations if necessary. Cloud storage makes cloud-based backup solutions easier to implement, because channels to different geographic locations can be set up on demand. If you wish, data can be backed up to cloud storage centers in dozens of regions around the world. All of this can be done with the simplest manual copying, or with an automated management solution such as Multi-Backup.
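The idea of choosing a replication strategy per class of data can be sketched as a small lookup table. This is only an illustration: the data class names, the region names, and the `backup_targets` helper are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical policy: which extra regions receive a copy."""
    remote_regions: tuple


# Assumed classification, loosely following the examples in the text:
# logs and pictures stay in one place, transactions go to more regions.
POLICIES = {
    "logs": ReplicationPolicy(remote_regions=()),
    "pictures": ReplicationPolicy(remote_regions=()),
    "transactions": ReplicationPolicy(remote_regions=("region-b", "region-c")),
}


def backup_targets(data_class, home_region):
    """Return every region that should hold a copy of this data class."""
    policy = POLICIES[data_class]
    return [home_region, *policy.remote_regions]


log_targets = backup_targets("logs", "region-a")
transaction_targets = backup_targets("transactions", "region-a")
print(log_targets)
print(transaction_targets)
```

Keeping the policy in data rather than in code means a new data class or region only needs a new table entry.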
Storage usually solves the problem of continuously reading, writing, and saving data, while backup solves the problem of freezing versions in time and rolling back to them
Saving a Word document, uploading a movie, editing a post, or sending a WeChat message: each of these is either written sequentially to a hard disk or written to a database or file system. This is the typical storage scenario, which is to respond continuously to the storage requests sent by the business or the software. In the end, a document, movie, or post exists only in its latest state, and each historical state is overwritten by the newest one.
Where there are additions there are also deletions and modifications, and storage does not recognize the intent of the upper-layer software. An operation may be legitimate, or it may be a malicious intrusion or a mistake; either way, the change is carried out at the bottom layer. Some storage designs have certain backup and recovery capabilities, but relying on them may cost more than deploying a backup solution. We all know that recovering data from a failed hard drive usually costs thousands of dollars. The hard drive is not valuable, but the data on it is.
This is where the specialist function, backup, comes in: it addresses the impact that additions, deletions, modifications, and other intentional or unintentional actions have on the data storage system. One of the most important capabilities of a backup system is freezing versions along a timeline and rolling back to them. Each backup of the storage system produces an image of the data as it was at that backup time. To restore, you select the corresponding version, and the data returns to that earlier state. Different products implement this differently. With Multi-Backup, which is based on a hybrid cloud architecture, versions can in theory be kept and restored as desired. Implementations also differ in how consistent the image is. Where consistency requirements are high, taking a version may freeze write requests, for example by placing a write lock on the database during the backup, which affects the business for a short time. If budget is not a concern and you need almost no interference with the business system, you can buy a good continuous data protection (CDP) solution, such as FalconStor CDP.
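The contrast between storage that keeps only the latest state and a backup that freezes versions can be sketched in a few lines. This is a toy in-memory model under stated assumptions: the `VersionedBackup` class and the version labels are hypothetical, and a real backup system would write its copies to separate storage rather than to a Python list.

```python
import copy


class VersionedBackup:
    """Toy backup: every call to backup() freezes a full copy of the data."""

    def __init__(self):
        self._versions = []  # list of (label, frozen copy)

    def backup(self, label, live_data):
        # deepcopy so later changes to live_data cannot alter this version
        self._versions.append((label, copy.deepcopy(live_data)))

    def restore(self, label):
        for name, snapshot in reversed(self._versions):
            if name == label:
                return copy.deepcopy(snapshot)
        raise KeyError(label)


live = {"post": "first draft"}      # the "storage": latest state only
backups = VersionedBackup()
backups.backup("monday", live)

live["post"] = "edited"             # storage overwrites the old state
backups.backup("tuesday", live)

del live["post"]                    # accidental deletion; storage obeys
live = backups.restore("monday")    # roll back to a frozen version
print(live)
```

The storage dictionary never refuses a change, whatever its intent; only the frozen copies make it possible to go back.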
Storage is usually designed for security against hardware failures; backup solves data security issues caused by multiple factors including software and hardware failures
In everyday thinking, storage equals safety. Especially since the concept of cloud computing emerged, even some technical people around us hold this view. In fact, this is a misunderstanding.
Start with the most common device, the mechanical hard disk. Hard disks are usually designed around temperature, read and write lifespan, shock resistance, and so on. Some drives begin to malfunction after more than a few hundred TB of reads and writes, and on SSDs, changes in ambient temperature may also affect the validity of the data. As storage safety technology developed, redundant array (RAID) technology emerged: it aggregates multiple hard drives and writes data across them, giving better reliability than a single drive. Dedicated storage solutions such as NAS and SAN go further, improving capacity and the reliability of the whole read and write path through redundant arrays, redundant channels, snapshot mirroring, and similar techniques. But all of these are designed around hardware failure or the failure of a storage area. The core design goal of cloud storage, including object storage and elastic block storage, is likewise the ability to recover data from other nodes when hardware, a storage node, or a regional system has a problem.
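One common way a redundant array survives the loss of a drive is XOR parity. The sketch below is a simplified illustration, not a description of any specific array product: it assumes three equal-sized data blocks standing in for drives, plus one parity block, and the `xor_blocks` helper is a name introduced here.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three data "drives" holding one block each, plus a parity "drive".
data_drives = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_drives)

# Suppose the second drive fails: XOR the survivors with the parity
# block to rebuild the missing data.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_drives[1])
```

Parity of this kind protects against a failed drive, but it faithfully preserves a mistaken overwrite or deletion, which is why it does not replace backup.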
A key design principle of a backup system is that it is designed around recovery. Backup copies data from one node or system to another node or system, which avoids the risk of hardware and software problems striking both at the same time. Backup systems usually add a further layer of redundancy to the stored data, either redundant replication or lower-cost arithmetic redundancy distributed across storage. Through versioning in time and redundant distribution in space, the backup system guards against data loss caused by intentional or unintentional additions, modifications, and other read and write actions, whether they stem from human operations, system failures, software defects, hackers, viruses, or natural disasters. Well-designed solutions make restoring data easy, and even if the backup software itself has a problem, the data can still be restored through a defined process. After a problem occurs in the business system, Multi-Backup can even restore a selected database table or a single file on its own. If the data uses a hybrid cloud model, even TB-scale data can be restored to the business system in a very short time.
The beauty of storage is usually in making the data scale "larger", while the beauty of backup is usually in making the data as "small" as possible.
"Larger" here refers to the problem storage solves: how to store data at extremely large scale. In everyday conversation we compare whose hard drive holds 1 TB, 2 TB, or 4 TB, how many users a business system supports, how many TB of data it generates, and how large the database is, as a measure of whether a storage system is well designed. That is, at large data scale the system still responds well and feels smooth to users. To support this goal, we usually talk about the number of hard disks a storage server supports and the capacity of a single disk; IO channel capability is also a supporting indicator.
To support storage that is large enough, dedicated storage switches are used to map large-capacity storage arrays quickly to different servers, while the data is stored centrally in a SAN, in some cases reaching hundreds of TB or even the PB level. Even so, support for compression comes in handy. For huge amounts of data in particular, stored data is compressed or deduplicated by default to reduce the space it occupies. Cloud storage has pushed capacity to the point where it is almost unlimited; more precisely, a single cloud platform is limited by the floor area of its storage data centers and the number of regions. Because of dynamic expansion, when storage capacity runs short and IDC space is available, nodes can be added continuously to grow the storage cluster.
When data is large enough and old enough, it usually needs to be archived and backed up. Backup is about keeping the data complete while reducing storage overhead as far as possible. For this reason, backup generally uses technologies such as source-side deduplication or global deduplication on the storage side to keep backup space to a minimum, and compression is also applied automatically during the backup process. In traditional implementations, full backups must be taken regularly and the data becomes very large, so backups are generally retained for three months or half a year, or a full backup is taken once a year. Under the snapshot model the data generated is also relatively large, so snapshot backups of a cloud host's cloud disk usually keep only a few snapshots. Traditional backup products also offer a full plus incremental strategy, but combined with hardware storage solutions their usage and management costs remain relatively high. The advantages of the full plus incremental strategy adopted by Multi-Backup lie mainly in its hybrid cloud indexed incremental model: the data grows at a minimal rate and is stored reliably, recovery remains simple and quick, and large-scale backups take up very little space.
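A minimal sketch of how deduplication and compression keep backups small follows. It assumes fixed-size chunks identified by their SHA-256 hash; the `DedupStore` class, the tiny 4-byte chunk size, and the sample data are chosen purely for illustration and are not taken from any product mentioned above.

```python
import hashlib
import zlib


class DedupStore:
    """Toy chunk store: identical chunks are kept once, compressed."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> compressed chunk

    def backup(self, data):
        """Store data and return its recipe (ordered list of digests)."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only new content is stored
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)


store = DedupStore(chunk_size=4)
monday = store.backup(b"AAAABBBBCCCC")
tuesday = store.backup(b"AAAABBBBDDDD")  # only the last chunk is new

print(len(store.chunks))      # unique chunks actually stored
print(store.restore(monday))
```

Two backups of three chunks each end up storing four unique chunks rather than six, and each backup can still be restored in full from its recipe.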