
Data governance: it's time to break the stereotype

2024-09-23


Lower the threshold for data governance and for enterprises to make good use of their data assets, and make enterprise data consumption more convenient.

Text | Xu Xin and You Yong

Editor: Zhou Luping

In the past, data middle platforms faced challenges and misunderstandings. Because they often cost hundreds of millions of yuan, they were once seen as expensive and heavyweight. High construction costs became a stumbling block for small and medium-sized enterprises with modest data volumes, even though these enterprises have a strong demand for data construction and governance.

The bigger problem is that rapid technological evolution poses new challenges to the scalability of enterprise data governance frameworks. A Gartner report pointed out that by 2028, 50% of China's data analysis and AI platforms built before 2023 will become obsolete due to decoupling from the ecosystem. The field of data construction is calling for a revolution.

Recently, Peng Xinyu, vice president of Alibaba Group and CEO of Lingyang, pointed out at the Lingyang Data×AI Forum at the Yunqi Conference that enterprises that want to embrace the AI era need to complete scenario deconstruction and business reconstruction. The data infrastructure sector is also experiencing a wave of reconstruction.

In response to the industry's difficulties with high cost and hard-to-scale data governance, Lingyang's Dataphin product has been fully upgraded. The newly launched Agile Edition targets scenarios where an enterprise's data volume is not large but data construction is still needed. Dataphin's evolvable and scalable data architecture leaves room for enterprise data governance to grow, and DataAgent, built on a large model, makes it easier for enterprises to make good use of their data assets.

01

It's time for data governance

When it comes to data construction and governance, the best-known concept may be the data middle platform.

A few years ago, the big data wave swept the world, and pioneering companies in all walks of life attached great importance to mining the value of corporate data. In 2017, The Economist also stated in a cover article that data had replaced oil as the world's most valuable resource.

At that time, a group of pioneering enterprises that had accumulated large amounts of data were the first to realize the importance of breaking down internal data silos and of managing and processing enterprise data centrally and uniformly. The concept of the "data middle platform" emerged. As the originator of the concept, Alibaba took the lead in building a data middle platform internally and in providing related products and services to other enterprises. A number of leading enterprises in traditional industries also came to regard building a data middle platform as an important way to integrate their massive internal data assets and realize the value of data, as part of their strategic planning and proactive transformation.

Because early entrants' data was highly complex and large in scale, data governance and construction required huge investments and relatively long construction cycles. This caused some controversy in the industry. For example, an industry insider once observed that the data middle platform requires a large investment, its effect is hard to quantify, and it is difficult to implement in ordinary-sized enterprises.

This year, Gartner listed the "data middle platform" as a concept in a gradually outdated stage of technology development in its "China Data Analysis and Artificial Intelligence Maturity Cycle" report.

However, industry veterans believe that the "data middle platform" cannot be understood only at the level of products and tools, nor can its value be judged only by the popularity of the concept.

"The data middle platform is more of a concept and a model. It means that data assets are an important part of an enterprise's assets, and that the enterprise needs a way to integrate, clean, process and manage data in a unified manner so as to form data assets that are easy to use," the industry veteran said.

Gartner also mentioned in the report that, under the current technological wave, "data infrastructure", which covers technical capabilities such as data integration, metadata management and data quality, is in a period of rapid growth and will become a reusable base for data analysis and AI applications within the enterprise. The concept represented by the "data middle platform" is still leading the development of the industry, and the technology continues to evolve rapidly.

In addition, the marketization of data elements is being promoted at the national policy level, which is also pushing enterprises to accelerate the construction of more comprehensive data governance and application capabilities.

On January 1 this year, the Interim Provisions on Accounting Treatment of Enterprise Data Resources (hereinafter referred to as the Interim Provisions) officially took effect. The data resources of listed companies are now recorded as new accounting items on the balance sheet and constitute part of shareholders' equity. According to statistics from China Securities Journal, as of August 31 this year, 39 listed companies had disclosed data resources on their balance sheets, with a total amount of 1.357 billion yuan. For many companies, how to achieve full-domain data governance and build data assets has become a must-answer question.

The consensus in the industry is that these leading companies were able to be the first to put data assets on their balance sheets largely because of their long-term, continuous emphasis on data governance.

Driven by macro policies and technological waves, the concept of data-driven business development has become increasingly popular, and more and more companies have realized the importance of data governance platforms and data asset construction.

In this wave, the needs of small and medium-sized enterprises cannot be underestimated. For example, Wang Sai, vice president of Lingyang, has seen that small and medium-sized enterprises have a strong demand for data governance and data asset construction. "Compared with leading companies, the amount of data in such an enterprise may not be very large, but it is complex and diverse, and these companies need to do some light governance on this data."

However, these companies face many problems in data governance. "Small and medium-sized enterprises may not have a sufficient big data talent pool, and do not have much budget to invest in data governance." A senior industry figure believes that many companies also lack knowledge of data asset construction and data governance.

Based on these pain points, Lingyang carried out a lightweight transformation of Dataphin, an intelligent data construction and governance platform built on Alibaba's internal data governance experience and serving large external corporate customers, and launched the Dataphin Agile Edition.

In the newly launched Dataphin Agile Edition, the product architecture has become lighter, which can help small and medium-sized enterprises start data governance at a lower cost. Taking the requirements for operators as an example, the Agile Edition is compatible with relational databases. A company's data management staff do not need to master cutting-edge big data technologies; they only need to know SQL to operate it, and subsequent operation and maintenance is also easy, which greatly lowers the talent threshold for data governance.

"Enterprises only need three hardware devices and an investment of only two to three hundred thousand yuan to start data governance based on the Dataphin Agile Edition," Dong Fangying, general manager of Lingyang's data system product line, told Digital Intelligence Frontline. This also means that, alongside the previous powerful but complex Dataphin version, small and medium-sized enterprises now have one more choice.

02

Data governance: how to balance the present and the long term

When companies with relatively small amounts of data start data governance, they consider one question: as the business grows and the amount of data becomes huge, will they need to replace the system? Will this cause trouble for future data governance?

For example, a leading domestic retail company has encountered growing pains. Due to its extensive business layout, the complexity and processing difficulty of its data needs have become extremely high.

Previously, the company built many business application systems with different functions based on the actual needs of the business. However, as it developed multiple brands and channels, the amount of data became extremely large. At the same time, data from different businesses sat in dozens of independent silos across different systems. In addition, data definitions differed between business lines, making data governance extremely difficult.

The reason for this situation is that the company lacked a long-term data governance perspective; its previous data architecture was built around isolated business needs. The company now treats the construction of the data middle platform as a long-term task. To this end, it has formulated a plan for the next three to five years to build the company's data middle platform.

Similarly, the data manager of a consumer finance company realized that the approach to data governance needed to change. "In the past, we paid more attention to what data was generated, which business processes could be digitized, and data compliance issues." However, he found that, looking five years ahead, as the amount of corporate data continues to grow, the traditional approach to data warehouse construction can no longer support the company's needs for storing, managing, and using data.

This is also a common problem faced by many companies in data governance. How can a data architecture leave room for future development so as to meet more complex data governance needs later on?

Based on this common pain point in the industry, Lingyang's Dataphin product has launched a new data system architecture. Its core features are scalability and evolvability.

In short, small businesses can choose the lightweight, low-cost Dataphin Agile Edition in the early stage based on their own considerations. The underlying computing engine can then be expanded, upgraded freely, and evolved smoothly to meet the enterprise's future data governance and business development needs. This is possible because the Dataphin Agile Edition and the native version use the same underlying architecture.

This helps meet the more complex data governance needs that arise as enterprise data scales up. After upgrading from the Agile Edition to the Dataphin Intelligent Development Edition, the underlying database can be expanded from a relational database to an interactive MPP database, such as StarRocks, ClickHouse, Hologres, Lindorm, Impala, or other databases with stronger analytical capabilities and computing power, thereby supporting data scheduling, operation and maintenance, and other governance tasks across more dimensions.

As an enterprise continues to grow, its underlying data support can be further upgraded to a big data engine, and even expanded to support lakehouse integration. "Small, medium and large, we are all under the same deployment structure, which can help enterprises upgrade seamlessly," said Wang Sai.

This takes into account the long-term nature of enterprise data governance. Enterprises can freely choose appropriate products based on their own data scale and governance requirements.

In addition, in the field of data governance and operation, enterprises face another major problem, which Peng Xinyu defines as the contradiction between personalization and cost-effectiveness. Large enterprises often pursue private deployment based on their own business needs, but this means higher costs. Standardized cloud products are clearly cheaper, but they lose the ability to customize configurations.

To solve this problem, Lingyang Dataphin's solution is this: in addition to the traditional public cloud tenant model and private deployment, it provides enterprises with a "semi-hosted" model that offers both an exclusive, controllable environment and the elastic scheduling of the public cloud.

For example, in some conglomerate-type enterprises, different business modules or subsidiaries have different data processing requirements. Some financial and membership data must be computed locally, while other data with low security sensitivity can be uploaded to the cloud, where it is linked with cloud-based businesses and processed.

This type of enterprise is suited to the semi-hosted model. Compared with the "rent an apartment" service of the public cloud model and the "build a villa" service of independent physical deployment, the semi-hosted model is more like "renting a single-family villa". It can meet the needs of companies that want to improve their data processing capabilities and have customization requirements, but that also care about economic efficiency.

In general, in the field of data governance and data operations, Dataphin is building on Alibaba Group's many years of experience in systematic data governance to provide large, medium and small enterprises with scalable and upgradeable products that span multiple types of engines and adapt to various environmental requirements.

In data governance, enterprises are entering a new stage of on-demand procurement and easy upgrades.

03

In the AI era, how to make data truly useful

Dong Fangying has seen many data governance platform projects and has found a pattern: if the other party is a pure IT team with no concept of data asset operation, the success rate of such projects is often not very high.

If data is only stored in a database, it becomes nothing more than a cost and a burden. "We have a deeply ingrained philosophy that once the data is built, we must use it," Dong Fangying told Digital Intelligence Frontline. Lingyang therefore places great emphasis on asset operations: it is not enough to aggregate the data; it must also be put to better use.

However, there is a huge gap between enterprises and their data. Dong Fangying found that, on the surface, enterprises have data and have business problems, and matching the two should solve the problem; in practice, connecting the two is a huge challenge.

For business personnel, understanding the business and understanding data are two different things. Business personnel often lack data thinking and still need help from data experts to obtain data, which incurs substantial communication and time costs.

The data team is also under great pressure in the process of supplying data. It often faces a large number of inquiries about where the data is, what it means, how to use it, and where to use it. In addition, finding the desired data among massive data assets is not easy.

This shows that enterprises' data needs are not only about obtaining a specific data result from a chatbot; they also involve finding and using internal data assets based on business needs. Dong Fangying gave a specific example. She often encounters clients asking questions like this: the company's business opportunity conversion rate is low, so what kind of data can solve the problem?

The core of solving the problem lies in the business process. The first step is to go back to the business and find out which people, organizations, and processes are involved in the problem. Only then is it possible to give users valuable guidance instead of just returning a data result.

Starting from this pain point, Lingyang this year launched the industry's first data asset agent, Dataphin DataAgent. With the support of a large model, users can customize their own agents, and business personnel can carry out full-link self-service operations, from problem to idea to data to usage.