News Analysis: Large-scale outages sound the alarm for global information technology security

2024-07-21

On July 19, Microsoft Windows and some of the company's other applications and services suffered large-scale outages. Aviation, rail, shipping, finance, healthcare, hotels and other industries in many countries were unable to operate normally, and the work and daily lives of many businesses and individual users were seriously disrupted.

Microsoft CEO Satya Nadella posted on social media X that day to confirm that a software update released by CrowdStrike, a security technology company that provides services to Microsoft, was the main cause of the global outage.

An outage of this scope and severity is extremely rare, and it has sounded the alarm for governments, industries and individual users. Junaid Ali, a cybersecurity expert at the UK's Institution of Engineering and Technology, said the scale of the outage may be "unprecedented," posing a major challenge to information technology (IT) teams worldwide while also offering important lessons for software engineering professionals.

It will take time to completely eliminate the impact

According to foreign media reports, US-based CrowdStrike has more than 20,000 customers worldwide, including technology giants such as Microsoft and Amazon. The company's CEO, George Kurtz, posted on social media X on the 19th that the incident was not a cyberattack but was caused by a "defect" in a software update the company had released for Microsoft Windows. He said the problem had been identified and isolated, and a fix had been deployed.

Kurtz also said in a media interview that day, "We deeply apologize for the impact we have caused to our customers, travelers and all those affected." The company is working hard to resolve the problem, but some systems may take "some time" to recover from the failure.

Although CrowdStrike has worked with Microsoft to restore most services quickly, experts believe the long-term impact of the outage still needs to be assessed. Adam Smith, a cybersecurity expert at the British Computer Society, pointed out that the fix must be applied to a large number of computers around the world, which will take time. If a computer is stuck on a blue screen in an endless reboot loop, recovery may be harder and could take days or even weeks.

Junaid Ali believes CrowdStrike is treating the incident as a top priority. "The long-term impact of this outage is not yet fully understood, but it will affect the timely adoption of critical security updates in the future."

Stay alert to IT system risks

Experts believe the outage highlights the vulnerability of global internet infrastructure, and that vigilance is needed about the complexity of IT systems and the risks facing sectors that depend heavily on network infrastructure. Ian Corden, an expert at the UK's Institution of Engineering and Technology, said that major IT outages around the world reflect the growing dependence of the economy, defense and national security on digital services, and therefore underline the importance of the security and resilience of those services.

Omoronya, an expert at the School of Computer Science at the University of Bristol in the UK, believes that we need to be vigilant about cloud infrastructure and other critical systems "that we rely on every day." Today's network infrastructure is highly complex and has extensive dependencies, and the resulting risks are often not obvious even to the people responsible for building it.

Some aspects of the incident remain unclear to the public. For example, many foreign media outlets reported problems with Microsoft Windows as well as some of the company's other applications and services. Some quoted a Microsoft spokesperson as saying that the problems with Microsoft 365 services from the night of July 18 into July 19 were unrelated to the CrowdStrike software update. Overall, industry insiders generally believe that the large-scale Windows outage was caused by CrowdStrike's faulty software update.

Industry insiders said this shows that companies should thoroughly review the potential risks of their cybersecurity solutions before deploying security software. Al Lakhani, founder and CEO of digital security company IDEE, said in a statement: "The lesson here is obvious: investing in cybersecurity is not just about getting the latest or most popular tools, but also about ensuring those tools are reliable and resilient."

Emergency response capabilities need to be improved

The impact of this incident spread across the globe, and also exposed the inadequacy of emergency response capabilities of some "lifeline" industries and large enterprises that are highly dependent on IT systems. For example, the global aviation industry was severely impacted by the downtime. The Associated Press cited data from a flight tracking website and reported that as of the evening of the 19th Eastern Time, nearly 2,800 flights in the United States were canceled, nearly 10,000 flights were delayed, and about 4,400 flights were canceled worldwide.

Industry insiders pointed out that enterprises should establish and improve network failure emergency response plans and conduct drills regularly to ensure rapid response and recovery when failures occur.

Corden pointed out that in order to mitigate the impact of network failures, enterprises should install backup systems, leave redundancy in infrastructure, conduct disaster recovery tests regularly and develop strict software update protocols. In addition, enterprises should use advanced monitoring tools, train IT personnel on how to deal with emergencies such as downtime, and work closely with third-party vendors to ensure that a strong security strategy is in place.

Tom Worthington, a computer expert at the Australian National University, warned that the widespread outage showed the risk of relying on a single technology to provide important services, and that backup communication links should be established using different software. This does increase security and maintenance costs, but "if you put all your eggs in one basket, you may end up losing face."