
"the time for end-side chatgpt has arrived", mianbi intelligence launches the open-source minicpm3-4b ai model

2024-09-06


IT Home reported on September 6 that Mianbi Intelligence announced the open-source MiniCPM3-4B AI model in an article published on its official WeChat account the previous day (September 5), claiming that "the on-device ChatGPT moment has arrived."

MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance exceeds that of Phi-3.5-mini-instruct and GPT-3.5-Turbo-0125, and is comparable to that of many AI models with 7 billion to 9 billion parameters.

Compared with MiniCPM 1.0 and MiniCPM 2.0, MiniCPM3-4B has a more powerful and versatile skill set and suits a wider range of uses. It supports function calling and a code interpreter.

The model structure differs across the three generations as follows (1.0 -> 2.0 -> 3):

Vocabulary size: 123k -> 73k -> 73k

Number of layers: 40 -> 52 -> 62

Hidden size: 2304 -> 1536 -> 2560

Maximum context length: 4k -> 4k -> 32k

System prompt: not supported -> not supported -> supported

Tool calling and code interpreter: not supported -> not supported -> supported
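
To give a feel for what the vocabulary and hidden-size changes mean in practice, the short sketch below multiplies the two to estimate the size of a single token-embedding table for each generation. It uses only the figures listed above and assumes that "k" means 1,000 and that each model has one embedding vector of hidden-size width per vocabulary entry; the exact vocabulary sizes and any weight sharing are not stated in the article.

```python
# Figures copied from the comparison above; "k" is read as 1,000 (an assumption).
GENERATIONS = {
    "MiniCPM 1.0": {"vocab": 123_000, "hidden": 2304, "layers": 40},
    "MiniCPM 2.0": {"vocab": 73_000, "hidden": 1536, "layers": 52},
    "MiniCPM3-4B": {"vocab": 73_000, "hidden": 2560, "layers": 62},
}


def embedding_params(vocab: int, hidden: int) -> int:
    """Entries in one token-embedding table: a hidden-size vector per token."""
    return vocab * hidden


def summarize() -> dict[str, int]:
    """Estimated embedding-table size for each generation."""
    return {
        name: embedding_params(cfg["vocab"], cfg["hidden"])
        for name, cfg in GENERATIONS.items()
    }


if __name__ == "__main__":
    for name, count in summarize().items():
        print(f"{name}: ~{count / 1e6:.0f}M embedding parameters")
```

Under these assumptions the estimate comes to roughly 283M, 112M, and 187M parameters respectively, which illustrates why shrinking the vocabulary from 123k to 73k frees up parameter budget for additional layers.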

MiniCPM3-4B has a 32k context window. With LLMxMapReduce, it can process context of theoretically unlimited length without requiring large amounts of memory.
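
The general idea behind handling input longer than the context window can be sketched with a generic map-reduce loop: split the input into window-sized chunks, condense each chunk separately, then repeat on the condensed results until everything fits in one window. This is an illustration of the pattern only, not the actual LLMxMapReduce implementation; the `model` callable and the `keep_first_two` stand-in are hypothetical.

```python
from typing import Callable, List

Tokens = List[str]


def split_into_chunks(tokens: Tokens, window: int) -> List[Tokens]:
    """Cut the token list into consecutive pieces of at most `window` tokens."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]


def map_reduce(tokens: Tokens, window: int,
               model: Callable[[Tokens], Tokens]) -> Tokens:
    """Condense an arbitrarily long input while never passing the model
    more than `window` tokens in a single call."""
    current = list(tokens)
    while len(current) > window:
        # Map: process each chunk independently.
        partials = [model(chunk) for chunk in split_into_chunks(current, window)]
        # Reduce: concatenate the partial results for the next round.
        merged = [tok for part in partials for tok in part]
        if len(merged) >= len(current):
            raise RuntimeError("model did not condense the chunks")
        current = merged
    return model(current)


def keep_first_two(chunk: Tokens) -> Tokens:
    """Hypothetical stand-in for a model call: keeps the first two tokens."""
    return chunk[:2]


if __name__ == "__main__":
    long_input = [str(i) for i in range(16)]
    print(map_reduce(long_input, window=4, model=keep_first_two))
```

Because each call sees at most one window of tokens, peak memory depends on the window size rather than on the total input length.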

Mianbi Intelligence also released a RAG suite consisting of the MiniCPM-Embedding and MiniCPM-Reranker models, along with MiniCPM3-RAG-LoRA, a version of the model fine-tuned for RAG scenarios.
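
An embedding model and a reranker are typically combined in a two-stage retrieval pipeline: the embedding model shortlists candidate passages cheaply, and the reranker then scores each shortlisted passage against the query more carefully. The sketch below shows that flow with toy stand-ins. `toy_embed` and `toy_rerank_score` are hypothetical word-overlap functions used for illustration only, and they do not reflect the actual interfaces of MiniCPM-Embedding or MiniCPM-Reranker.

```python
import math
from collections import Counter
from typing import Callable, List


def toy_embed(text: str) -> Counter:
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def toy_rerank_score(query: str, doc: str) -> float:
    """Hypothetical reranker: fraction of query words found in the passage."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    top_k: int = 2,
    embed: Callable[[str], Counter] = toy_embed,
    score_pair: Callable[[str, str], float] = toy_rerank_score,
) -> List[str]:
    """Stage 1: shortlist by embedding similarity. Stage 2: reorder by pair score."""
    q_vec = embed(query)
    shortlist = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:top_k]
    return sorted(shortlist, key=lambda d: score_pair(query, d), reverse=True)


if __name__ == "__main__":
    passages = [
        "minicpm3 supports a 32k context window",
        "the weather is sunny today",
        "context window size matters for long documents",
    ]
    for passage in retrieve_then_rerank("context window", passages):
        print(passage)
```

In a real RAG setup the passages returned by the second stage would be placed in the prompt of the generation model.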