In this article, we invited Mr. Wang Wenyu, CEO of DataSecOps, to provide insight into the evolution of data security technology and to share his views regarding the future development of data security.
An introduction to data security technology
In the DIKW (data, information, knowledge, wisdom) model, data is nothing more than raw 0s and 1s. People extract information from data and distill knowledge from information to gain the wisdom that ultimately creates value. Data can therefore be regarded as the foundation, and data security is about protecting information and knowledge as well as the data itself.
There are currently three broad categories of data security technology. The first is fundamental security technology, such as cryptography, both classical and modern. Cryptographic technologies have played a prominent role in different domains at different periods and are a cornerstone of data security technology.
The second is access control. Wherever security is involved, access control cannot be ignored. Data is like a double-edged sword: who uses it and how it is used make all the difference.
Specifically, in access control the subject is the user, the object is the data, and the behavior is how the data is manipulated and accessed. There are also broader correlations among these elements.
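The subject, object, and behavior elements described here can be sketched as a simple allow-list check. This is only an illustrative sketch in Python, not a model described by the speaker; the role names, the data asset name, and the actions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    subject: str  # who: a user or role (hypothetical names)
    obj: str      # what: the data being accessed (hypothetical name)
    action: str   # how: the behavior applied to the data


# Hypothetical allow-list policy: anything not listed is denied.
POLICY = {
    Rule("analyst", "customer_table", "read"),
    Rule("dba", "customer_table", "read"),
    Rule("dba", "customer_table", "export"),
}


def is_allowed(subject: str, obj: str, action: str) -> bool:
    """Return True only if the exact (subject, object, action) triple is permitted."""
    return Rule(subject, obj, action) in POLICY


if __name__ == "__main__":
    print(is_allowed("analyst", "customer_table", "read"))
    print(is_allowed("analyst", "customer_table", "export"))
```

A deny-by-default policy like this is easy to reason about, but real deployments usually need richer context, such as time, location, or data sensitivity, which is part of why the engineering side is hard.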
Engineering is the most challenging aspect of access control. The field of data security encompasses technology, science, and engineering practices that are highly integrated and involve elements such as cost, impact, and many other factors. It is vital to choose an access control model that users can "normally use" to control access in data security.
The third is trusted computing, which is closely tied to TCM, TPM, TPCM, and related trusted computing mechanisms. Trusted computing establishes a chain of trust from the hardware layer through the software layer while also applying various technologies, including encryption. Many operating systems now include trusted computing support, and some chips are equipped with trusted modules.
We will now look at several typical security technologies. File encryption is the first example. Encryption can provide security in a straightforward manner; however, it can also be burdensome. Therefore, the primary goal of encryption technology is to achieve a balance between security and usability, with a minimum of side effects.
Encryption itself is a highly sensitive operation that places high demands on the real-time performance and security of encrypting and decrypting. A malfunction in the process, such as a disruption in the order or position of encryption and decryption, can result in a corrupted file. In this sense, encryption is a blunt but effective security technology that comes with certain drawbacks.
Database encryption is similar to file encryption. It has been under development for approximately ten years. It has fewer application scenarios because it depends on the database platform, the architecture design, and high-concurrency requirements, and is largely handled at the application layer. There are also few points at which the negative impact of encryption can be mitigated. Given the technical principles and performance considerations involved, database encryption should be adopted cautiously.
Encryption technology must be used carefully and prudently for highly sensitive documents and data. In general, encryption is an effective method of addressing some security concerns. However, before utilizing it, it is essential to determine whether the current scenario is appropriate and what encryption method should be used, including whether the side effects produced by encryption are acceptable.
The second example is data masking. Data masking was first applied in the development, testing, and production environments of the finance and telecom carrier industries. It arose from the need to test with business data in advance: the test data did not have to be real, but it had to be realistic.
Data masking generally falls into two categories: static and dynamic. Static data masking has a broader range of applications, whereas dynamic data masking is often limited by specific business scenarios or performance requirements. Nevertheless, both are now fairly mature and have well-established practices in various fields, both at home and abroad.
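The difference between the two categories can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production scheme: the field names, the "auditor" role, the salt value, and the masking rules are all hypothetical.

```python
import hashlib


def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters and hide the middle."""
    if len(phone) < 8:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def pseudonymize_name(name: str, salt: str = "demo-salt") -> str:
    """Replace a name with a stable, realistic-looking placeholder."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return "user_" + digest[:8]


def static_mask(records: list[dict]) -> list[dict]:
    """Static masking: build a masked copy once, e.g. for a test environment."""
    return [
        {"name": pseudonymize_name(r["name"]), "phone": mask_phone(r["phone"])}
        for r in records
    ]


def dynamic_read(record: dict, role: str) -> dict:
    """Dynamic masking: decide at read time, based on who is asking."""
    if role == "auditor":  # hypothetical privileged role
        return dict(record)
    return {**record, "phone": mask_phone(record["phone"])}


if __name__ == "__main__":
    production = [{"name": "Alice", "phone": "13812345678"}]
    print(static_mask(production))
    print(dynamic_read(production[0], "support"))
```

Static masking changes the stored copy, so the original values never reach the test environment. Dynamic masking leaves the stored data untouched and pays a cost on every read, which is one reason it is more sensitive to performance requirements.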
Among the traditional security technologies, digital watermarking has evolved along several paths. An essential requirement is that the watermark be robust and that the original data not be damaged. Watermarking serves both as a deterrent and as a means of tracing leaks after the fact. Compared with other technologies, it has the greater effect during and after an incident.
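To make the tracing role of a watermark concrete, the following toy sketch hides a recipient tag in text using zero-width characters. It is an assumption chosen purely for illustration, not a technique named in this article, and it is deliberately not robust: stripping the invisible characters removes the mark, which is exactly the robustness problem real watermarking schemes must solve.

```python
ZERO = "\u200b"  # zero-width space, encodes bit 0
ONE = "\u200c"   # zero-width non-joiner, encodes bit 1


def embed(text: str, tag: str) -> str:
    """Append the tag as invisible bits; the visible text is unchanged."""
    bits = "".join(f"{byte:08b}" for byte in tag.encode("utf-8"))
    return text + "".join(ONE if bit == "1" else ZERO for bit in bits)


def extract(text: str) -> str:
    """Recover the tag from any zero-width marker characters in the text."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def visible(text: str) -> str:
    """Return the text as a reader sees it, without the marker characters."""
    return text.replace(ZERO, "").replace(ONE, "")


if __name__ == "__main__":
    marked = embed("Quarterly report", "user42")
    print(visible(marked))
    print(extract(marked))
```

If a marked copy leaks, extracting the tag points back to the recipient it was issued to, which is the after-the-fact tracing function described above.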
DLP (data leakage protection, also known as data loss prevention) is a more sophisticated type of protection that takes content identification as its basis. DLP can be further divided into endpoint DLP, network DLP, mail DLP, and so on. It is usually aimed at unintentional leakage, that is, at keeping good people from doing bad things.
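Content identification can be illustrated with a small pattern-based scanner. This is a simplified sketch, not how any particular DLP product works: the two detectors (email addresses and card-like numbers) and their patterns are assumptions chosen for the example, and real systems typically combine many more signals.

```python
import re

# Hypothetical detectors; real content identification uses far richer rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b\d{13,19}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used here to cut false positives on long digit runs."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def scan(text: str) -> list[tuple[str, str]]:
    """Return (kind, match) pairs for sensitive-looking content in the text."""
    findings = [("email", m) for m in EMAIL_RE.findall(text)]
    findings += [("card_number", m) for m in CARD_RE.findall(text) if luhn_valid(m)]
    return findings


if __name__ == "__main__":
    message = "Contact alice@example.com, card 4111111111111111"
    for kind, value in scan(message):
        print(kind, value)
```

The same scanning function could sit at an endpoint, a network gateway, or a mail relay, which corresponds to the endpoint, network, and mail variants mentioned above.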
In recent years, privacy computing has become a hot area in security. It also consists of many different branches, such as homomorphic encryption, MPC, regulatory sandboxes, and some other applications. There is still a long way to go in the privacy computing area, as it involves dealing with the strict requirements of algorithms and mathematical problems to balance ideals and realities.
Data security can also be understood as a methodology. Management and institutional factors must be considered before applying any security technology, and the right technical approach should be chosen on that basis. Businesses in different sectors tend to have different expectations of data security and different requirements for how data is protected.
Historically, data security has been regarded as a subset of network security, but as technology has advanced and the IT environment has changed, data has become a core asset. Accordingly, data security is progressively becoming a sub-dimension of cyberspace security on the same level as network security. Data security is likely to develop sub-dimensions of its own in the future, as has historically been the trend.
Driving forces of data security development
There are several important drivers for the development of data security. The first is legislation. With the establishment of network and data security regulations, cyberspace security has been raised to the level of law, and the strictness of these laws is regarded as notable in Chinese legislation.
Risk is the second driving force. As attention to data security grows, incidents are occurring more frequently and causing severe losses. In 2020 alone, there were more data security incidents than in the previous 15 years combined, with an average loss of five million dollars. In response, major companies have released data security industry standards as well as data classification and grading standards. Some industries have even incorporated data security into assessments and KPIs.
The last driver is technology development. Technologies such as artificial intelligence, big data, cloud computing, 5G, and the Internet of Things will generate large amounts of data. Protecting this data is imperative and is one of the major challenges of the current era.
Overall, the new era is a major driving force behind the development of information security. Business security has always been an incredibly crucial component of every company's operations. As businesses continue to grow and become more complex, there is an increase in the demand for security. Therefore, data security should evolve at the pace of the emergence of new technologies to better serve business needs.
Each stage in the development of data security calls for its own techniques. During the device-centric stage, encrypted databases, static data masking, and encrypted files are the prevalent practices. In the boundary- and network-centric stage, blocking technologies become more common, such as data auditing, database firewalls, DLP, and data masking.
In the next decade, rebuilding a data-centric security system will require newer techniques, such as offline data security, privacy computing, chain-wide data tracing protection, and data risk assessment. The aim is to build an integrated data security governance platform and to break down conventional silos so that new ideas, standards, and logic can take shape.
Data security principles and evolution
Over the course of its development, data security has passed through four stages, or types, of technology. The first is the "cage-type," equivalent to a safe, where data is locked away so that it cannot be lost. This remains a prevalent security measure today.
The second is the "shackle-type," such as file and database encryption, which works like a lock on a door. It is better suited to less interactive domains and does not work for every scenario, because it is limited by issues of strength, granularity, and applicability.
The third is recognition-type, and the primary method is content-based recognition.
The fourth type is the comprehensive one, that is, applying multiple technologies simultaneously. It sounds good, but it yields little benefit in practice, because stacking too many technologies raises operating costs and lowers efficiency. Despite the great expense, the return is not guaranteed: many pre-existing blind spots remain, and gaps in the system are still hard to fill quickly.
At present, the emphasis is on platform-type technology built on a complete and unified data map. On top of this foundation, unified identity, unified data identity, unified control policies, and automated monitoring are implemented to ensure a high level of data security.
As data has become more valuable in recent years, data theft has also become harder to detect. There are many instances of malicious insiders stealing information, APT attacks, and Trojan use, some of them seen in competition between local businesses. There are also state-backed attacks aimed at obtaining data from target countries. The inability to track unknown security issues is a major problem for data security, and DataSecOps has been working to resolve it.
DataSecOps applies a shift-left approach to data security: data processing and usage are monitored continuously at the point where they occur in order to locate the actual source of risk. Over the past few decades, the evolution of data security has been a steady shift to the left.
There are three core capabilities behind the shift left in data security. The first is chain-wide data identification and tracking, which means monitoring data use on the endpoint side, server side, traffic side, API side, Docker side, and elsewhere. How data is used and where it flows are always central concerns for data security. This dimension, however, is often overlooked because of cost.
Second, lightweight adaptive protection can significantly reduce the costs associated with end-of-life operations without interfering with the chain-wide identification and tracking processes. This type of protection is most useful when it is not viable to bear the burden of chain-wide identification and tracking. With an adaptive approach, the quantitative risk assessment will facilitate monitoring and analysis of the whole process to safeguard accordingly.
Lastly, a risk assessment related to data security regulations should be carried out. The objective of security risk assessment is to move away from the traditional technology-assisted, human-driven approach to become more tool- and product-driven, utilizing an automated system for risk assessment and enhancing the application of the business process.
DataSecOps has three principal planes. The first is the infrastructure plane, which includes storage, hosts, and various business systems and terminals. On this plane, we can observe actual data flows and business processes.
The second is the data plane, which concerns the data that runs on the infrastructure. Data may come from several different sources, with different types, users, and APIs, and is classified as business data, privacy data, and so on.
The third and most important is the data security plane, which covers the differing requirements for personal privacy management and for data subjects. A detailed list of protection terms can be found in the Personal Information Protection Act, such as adaptive protection of sensitive data, monitoring, and data security risk assessment. The growth of DataSecOps rests on establishing automated data protection and security as an integrated platform.
Factors affecting data security development
Besides the factors described above, the global digital environment is also of paramount importance in the development of data security. Several countries have promulgated security-related regulations in the past two years, reflecting a worldwide consensus on data localization.
Under the "14th Five-Year Plan," the digital economy is rapidly expanding in size and scope, and more funds and personnel are being invested in it. As the value of data assets increases, the importance of data security has also risen. Data security has become an increasingly universal requirement for ensuring economic development.
The cybersecurity and data security fields are highly regulated. Enterprises have an inherent incentive to protect their IT infrastructure, but no such incentive to protect personal data. More regulation will be needed to advance security development, and this is where the significance of data security regulations lies.
Humans are a vital driving force of productivity and also a source of risk in a production environment. A large share of security incidents are caused by human factors. There are two types of data leakage: malicious and unintentional. Unintentional leaks are caused by ignorant or careless individuals. Malicious leaks are driven by specific motives, such as money. Although human wisdom is vast, how and where to use it is a matter of discernment.
The indirect influence of humans is also prevalent nowadays, and carelessness on the part of developers often introduces security hazards. Most enterprises have separate development and security teams, and in such cases the security team may be able to control risk only through institutional measures. Even so, flaws always exist in practice, and wherever weaknesses exist, risks and problems are likely to arise.
The outlook for data security development
Data security always goes hand in hand with the development and use of data. It cannot exist for its own sake; it will always act as a bodyguard and an assistant. Data security must support the primary goal, namely the development and use of data, which will remain the enduring theme. This issue should be explored and understood from the perspectives of the nation, the organization, and the global marketplace. Eventually, data security will be regarded as a matter of basic human consensus, on a par with labor, capital, land, and the like.
For technical practitioners, data security is more than just technology or human management; it requires a comprehensive approach. In the past, the prevalent belief was "30% technology, 70% management." However, as technology advances, manual management processes will become cumbersome and hard to control, so it is increasingly likely that data security will become "70% technology, 30% management," or even "90% technology, 10% management."
As more data is generated and used, the application scenarios for data will become ever more complex, forming a vast and intricate data network. In short, technology will always be the driving force for data security and will be essential to building a secure future.