Academician Yao Qizhi's new research on large models solves the question "which is bigger, 9.11 or 9.8?"
2024-09-25
This article is reproduced from QbitAI.
Author: West Wind

A new reasoning framework for large models, led by Academician Yao Qizhi, has been introduced, and CoT's "crown" may no longer fit: the team proposes the Diagram of Thought (DoT) to make large models think more like humans.

The team also provides a mathematical basis for this reasoning process, formalizing DoT through topos theory to guarantee its logical consistency and soundness.

Compared with CoT, which represents the reasoning process as a linear sequence, DoT better captures the complexity of human reasoning. Compared with ToT and GoT, which introduce branching structures, DoT does not rely on external control mechanisms or the collaboration of multiple models, making training and deployment simpler.

The secret is that DoT models iterative reasoning in an LLM as the construction of a directed acyclic graph (DAG) within a single model. The DAG consists of nodes representing propositions, criticisms, refinements, and verifications; edges represent the logical relationships or dependencies between them. Edges are directed, and there are no circular paths. This acyclic property ensures that the reasoning process is free of circular dependencies and more faithfully reflects sound logical deduction.

Questions like "which is bigger, 9.11 or 9.8?" and "how many 'r's are there in strawberry?" are all solved with the help of DoT.

The research has attracted considerable attention since it was proposed. Netizens have commented that this is the right path.

Let's take a closer look at what DoT looks like.

A New Framework for Complex Reasoning in Large Models
As mentioned above, DoT models the logical reasoning process as the construction of a directed acyclic graph (DAG) within a single LLM. Three key roles are managed within the framework (see the sketch after this list):

Proposer: generates propositions or reasoning steps, adding new nodes.
Critic: evaluates propositions, identifies errors, inconsistencies, or logical fallacies, and adds criticism nodes.
Summarizer: synthesizes verified propositions into a coherent chain of thought, effectively performing a topological sort of the DAG to produce the final reasoning output.
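To make the structure concrete, here is a minimal Python sketch of a DoT-style DAG with these three roles acting on it. The class names, node kinds, and the toy 9.11-vs-9.8 run are illustrative assumptions, not the paper's implementation; only the overall shape (typed nodes, directed acyclic edges, a topological sort at summarization) follows the description above.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

@dataclass
class Node:
    kind: str           # "proposition" | "criticism" | "refinement" | "verification"
    text: str
    valid: bool = True  # flipped to False when a critic refutes the node

@dataclass
class DoTGraph:
    nodes: dict = field(default_factory=dict)  # node id -> Node
    preds: dict = field(default_factory=dict)  # node id -> set of predecessor ids

    def add(self, kind, text, predecessors=()):
        nid = len(self.nodes)
        self.nodes[nid] = Node(kind, text)
        # Edges only point from existing nodes to the new one,
        # so the graph is acyclic by construction.
        self.preds[nid] = set(predecessors)
        return nid

    def chain_of_thought(self):
        # Summarizer role: topologically sort the DAG and keep only the
        # propositions/refinements that survived criticism.
        order = TopologicalSorter(self.preds).static_order()
        return [self.nodes[i].text for i in order
                if self.nodes[i].valid
                and self.nodes[i].kind in ("proposition", "refinement")]

# Toy run on "which is bigger, 9.11 or 9.8?"
g = DoTGraph()
p0 = g.add("proposition", "9.11 > 9.8 because 11 > 8.")                          # proposer
c0 = g.add("criticism", "Compare tenths: 1 < 8, so the claim fails.", [p0])      # critic
g.nodes[p0].valid = False                                                        # refuted
p1 = g.add("refinement", "9.8 = 9.80 and 9.11 < 9.80, so 9.8 is bigger.", [c0])  # proposer, revised
g.add("verification", "Check: 9.80 - 9.11 = 0.69 > 0.", [p1])                    # critic validates
print(g.chain_of_thought())  # -> ['9.8 = 9.80 and 9.11 < 9.80, so 9.8 is bigger.']
```

Note how the refuted first attempt stays in the graph as a node: it is the criticism edge, not deletion, that records why the reasoning moved on.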
These three roles are controlled with special tokens. The reasoning process begins with the proposer introducing a proposition, adding a node to the DAG. The critic then evaluates it, either validating it or offering criticism; if criticism is given, a new node is added and an edge is drawn from the proposition to the criticism. Based on the criticism, the proposer generates a refined and improved proposition, represented as a new node in the DAG. Once enough valid propositions have been established, the summarizer synthesizes these inferences, topologically sorting the DAG to produce a coherent chain of thought.

By exposing the model to both correct and incorrect reasoning, DoT lets the LLM learn from its mistakes and refine its reasoning over time, which is closer to how humans solve problems. The approach not only captures the nonlinear, iterative nature of reasoning but also, through natural-language criticism, provides richer feedback than binary signals.

Training DoT uses examples formatted into the DoT structure, including role-specific tokens and DAG representations. During inference, the model generates propositions, criticisms, and summaries based on contextual cues and the role-specific tokens. This simplifies deployment, eliminates the need for multi-LLM collaboration or external control mechanisms, and aligns with standard LLM training paradigms for easy integration into existing workflows.

The authors also provide a rigorous mathematical foundation for the DoT framework, using topos theory to formally describe the reasoning process. In this framework, propositions are modeled as subobjects of the terminal object in a topos; logical relations and reasoning steps are represented as morphisms; and the criticism and refinement processes correspond to morphisms into the subobject classifier and morphisms between propositions, respectively. By introducing the PreNet category, the authors also capture the dynamic and concurrent nature of the reasoning process. This mathematical foundation not only ensures the logical consistency and completeness of the reasoning process, but also provides a conceptual framework for designing the next generation of AI models specialized for reasoning.
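To make that categorical dictionary slightly more concrete, here is a schematic LaTeX sketch. The notation (a terminal object 1, a subobject classifier Omega, and the morphism names) is our assumption based on the prose above, not the paper's exact formalization.

```latex
% Schematic DoT-to-topos dictionary (assumed notation; requires amssymb):
% a proposition is a subobject of the terminal object 1,
% its evaluation is a characteristic morphism into the subobject classifier Omega,
% and a refinement is a morphism between propositions.
\[
  \underbrace{m_P \colon P \rightarrowtail 1}_{\text{proposition}}
  \qquad
  \underbrace{\chi_P \colon 1 \to \Omega}_{\text{criticism / verification}}
  \qquad
  \underbrace{r \colon P \to P'}_{\text{refinement}}
\]
```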
From Tsinghua's Institute for Interdisciplinary Information Sciences, Led by Yao Qizhi and Yuan Yang

The paper was led by Yao Qizhi and Yuan Yang of Tsinghua University's Institute for Interdisciplinary Information Sciences (IIIS); its first author is Zhang Yifan.

Zhang Yifan graduated from Peking University's Yuanpei College in 2021 and is now a doctoral student at IIIS, Tsinghua University, supervised by Assistant Professor Yuan Yang. His main research directions are the theory and algorithms of foundation models (large language models), self-supervised learning, and trustworthy artificial intelligence.

Yuan Yang is an assistant professor and doctoral supervisor at IIIS, Tsinghua University. He graduated from the Department of Computer Science at Peking University in 2012, received his Ph.D. in computer science from Cornell University in 2018, and worked as a postdoctoral researcher in big data science at the Massachusetts Institute of Technology from 2018 to 2019. His main research directions are intelligent healthcare, AI interpretability, and large AI systems, and he has produced many results in non-convex optimization theory, neural network optimization theory, mechanism design, and related fields.

Yao Qizhi is an academician of the Chinese Academy of Sciences and dean of IIIS at Tsinghua University. He is also a Turing Award laureate: the first Asian scholar to win the award since its founding, and to date the only Chinese computer scientist to receive this honor.

Professor Yao resigned his tenured position at Princeton in 2004 and returned to Tsinghua to teach. In 2005 he founded the "Yao Class", an experimental computer science class for Tsinghua undergraduates; in 2011 he established the Tsinghua Quantum Information Center and the Institute for Interdisciplinary Information Sciences; in 2019 he founded the artificial intelligence class for Tsinghua undergraduates, known as the "Zhi Class".

Today, the institute he leads has long been renowned, and both the Yao Class and the Zhi Class are affiliated with it. Professor Yao's research interests include algorithms, cryptography, and quantum computing, fields in which he is an international pioneer and authority.

One More Thing
At about the same time a year ago, Academician Yao Qizhi's team proposed the Cumulative Reasoning (CR) method; DoT is a further development of CR.

At the time, CR coordinated an iterative process across several specialized large language models, with different models taking on the roles of proposer, verifier, and reporter. DoT instead builds the directed acyclic graph within a single model, relies on no external control mechanisms or multiple models, and is simpler to train and deploy.

In DoT, the critical feedback the model generates is natural language rather than CR's binary signals, so the model receives detailed explanations of its errors, which helps it improve its propositions more effectively.

This time, DoT also rests on a solid mathematical foundation, which clarifies the relationship between the DoT reasoning process and categorical logic and theoretically ensures the consistency and reliability of the reasoning.