
Microsoft releases SpreadsheetLLM, which can greatly improve AI's ability in Excel

2024-07-16


On July 12, Microsoft published a paper introducing SpreadsheetLLM, a new AI large language model for spreadsheet applications such as Excel and Google Sheets.

Microsoft pointed out in the paper that SpreadsheetLLM, as a new AI model, will be widely used to understand and process complex spreadsheet data.

SpreadsheetLLM has the potential to transform spreadsheet data management and analysis, paving the way for smarter and more efficient user interactions.

This may make accountants and data analysts worry about their future job prospects. Netizens joked on the social platform X that "Karen's job will soon be replaced by artificial intelligence."

'Karen may soon be out of a job'

The researchers pointed out that current spreadsheet applications are feature-rich and give users a large number of layout and formatting choices, which makes it difficult for conventional large language models to process spreadsheets well. SpreadsheetLLM is an AI model designed specifically for spreadsheet applications.

Microsoft has also developed the SheetCompressor tool to help SpreadsheetLLM better understand and process spreadsheet data.


The researchers said that the potential applications of SpreadsheetLLM are very wide, from automating routine data analysis tasks to providing intelligent insights and recommendations based on spreadsheet data. For example, SpreadsheetLLM can be used to automatically generate financial reports, identify anomalies or trends in data, and provide personalized product or service recommendations to customers.

As a result, SpreadsheetLLM has the potential to revolutionize the way businesses handle data.

One user claimed: “LLMs that can write SQL will kill the entire data engineering industry as we know it.”


Another wrote, “SaaS is in deep trouble.”


“This will have a huge impact on the financial world”


Ethan Mollick, associate professor at the Wharton School of the University of Pennsylvania, tweeted: “This is yet another sign that LLMs will soon be able to handle both structured and unstructured spreadsheet data. This will unlock many use cases (forecasting, finance, valuation, etc.), and having a spreadsheet source of truth tends to reduce hallucinations.”


How does SpreadsheetLLM work?

SpreadsheetLLM works by encoding spreadsheet data into a format that a large language model (LLM) can understand, enabling the LLM to reason about spreadsheet data, answer questions about the data, and even generate new spreadsheets based on natural language prompts.

The core of SpreadsheetLLM is the "SheetCompressor" framework, which compresses and encodes spreadsheet data so that an LLM can process it more easily. SheetCompressor consists of three modules:

▲Structural-anchor-based compression: Places “structural anchors” throughout the spreadsheet to help the LLM understand the structure of the data.
▲Inverted-index translation: Converts the spreadsheet into a more compact format and eliminates redundant data.
▲Data-format-aware aggregation: Groups adjacent cells based on number format and data type.
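
The last two modules can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the function names, the `(row, column)` cell representation, the regular expressions used to guess a data type, and the sample sheet are all assumptions made for illustration, and structural anchors are left out entirely. The first function builds an inverted index (each distinct value listed once with the cells that contain it, empty cells skipped), and the second merges vertical runs of same-typed cells into ranges.

```python
import re
from collections import defaultdict

Cell = tuple[int, int]  # hypothetical representation: (row, column), both 1-based


def address(cell: Cell) -> str:
    """Convert (row, column) to an A1-style address (columns A..Z only)."""
    row, col = cell
    return f"{chr(ord('A') + col - 1)}{row}"


def inverted_index(cells: dict[Cell, str]) -> dict[str, list[str]]:
    """Map each distinct non-empty value to the addresses that hold it."""
    index = defaultdict(list)
    for cell in sorted(cells):
        value = cells[cell]
        if value:  # empty cells are dropped, which is where the savings come from
            index[value].append(address(cell))
    return dict(index)


def category(value: str) -> str:
    """Guess a coarse data type from the cell text (illustrative rules only)."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return "Date"
    if re.fullmatch(r"-?\d+", value):
        return "Integer"
    if re.fullmatch(r"-?\d+\.\d+", value):
        return "Float"
    return "Text"


def aggregate_columns(cells: dict[Cell, str]) -> list[tuple[str, str]]:
    """Merge vertically adjacent cells of the same category into ranges."""
    runs = []
    filled = sorted((c for c in cells if cells[c]), key=lambda rc: (rc[1], rc[0]))
    for row, col in filled:
        label = category(cells[(row, col)])
        last = runs[-1] if runs else None
        if last and last["col"] == col and last["end"] == row - 1 and last["label"] == label:
            last["end"] = row  # extend the current run downwards
        else:
            runs.append({"col": col, "start": row, "end": row, "label": label})
    result = []
    for run in runs:
        start = address((run["start"], run["col"]))
        end = address((run["end"], run["col"]))
        result.append((start if start == end else f"{start}:{end}", run["label"]))
    return result


# Made-up sample sheet: a header row and three data rows, one empty cell (B4).
sheet = {
    (1, 1): "Region", (1, 2): "Sales", (1, 3): "Date",
    (2, 1): "North", (2, 2): "1200", (2, 3): "2024-07-01",
    (3, 1): "North", (3, 2): "950", (3, 3): "2024-07-02",
    (4, 1): "South", (4, 2): "", (4, 3): "2024-07-03",
}

print(inverted_index(sheet))
print(aggregate_columns(sheet))
```

In this toy sheet the repeated value "North" is stored once with two addresses, and the three date cells collapse into a single range with a type label, so the text handed to the model is shorter than a cell-by-cell dump.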


Illustration of the SHEETCOMPRESSOR framework (Image: Microsoft)

Microsoft claims that SpreadsheetLLM significantly improves performance on spreadsheet table detection, outperforming the standard approach by 25.6% in GPT-4's in-context learning setting, and that it reduces token usage by 96%.
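
To put the 96% figure in perspective, here is a small arithmetic sketch. It assumes the percentage refers to the number of tokens needed to encode a sheet; the function name and the 10,000-token sheet are hypothetical.

```python
def compression_ratio(reduction_percent: float) -> float:
    """Turn a percentage reduction in tokens into an 'N times smaller' ratio."""
    remaining = 100.0 - reduction_percent
    if remaining <= 0:
        raise ValueError("reduction must be below 100%")
    return 100.0 / remaining


# Hypothetical sheet that needs 10,000 tokens before compression.
tokens_before = 10_000
tokens_after = round(tokens_before * (100 - 96) / 100)

print(compression_ratio(96))  # 25.0
print(tokens_after)           # 400
```

Under that assumption, a 96% reduction means the encoded sheet is about 25 times smaller, so a sheet that previously exceeded a model's context window may fit after compression.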

Microsoft has not announced when it will release SpreadsheetLLM to the public. The paper notes that the model still has limitations: its ability to understand complex or highly structured data is limited, and SheetCompressor currently cannot compress cells containing natural language.