2024-10-05
Original title: Google's cheapest AI model, Gemini 1.5 Flash 8B, to become commercially available: price halved to $0.15 per million output tokens
IT Home reported on October 5 that technology outlet Neowin published a blog post yesterday (October 4) reporting that Google will soon make the Gemini 1.5 Flash 8B model commercially available, making it Google's cheapest AI model.
IT Home reported in August this year that Google had launched three experimental Gemini models. Among them, Gemini 1.5 Flash 8B is a smaller variant of Gemini 1.5 Flash. It has 8 billion parameters and is designed for multimodal tasks, including high-volume tasks and long-text summarization.
Compared with the original Gemini 1.5 Flash, Gemini 1.5 Flash 8B has lower latency and is especially well suited to chat, transcription, and long-text translation.
Another highlight of Gemini 1.5 Flash 8B is its low price. Billing takes effect on Monday, October 14. IT Home lists the pricing as follows:
For context windows under 128K tokens, input costs US$0.0375 per million tokens (currently about 0.26 yuan)
For context windows under 128K tokens, output costs US$0.15 per million tokens (currently about 1.1 yuan)
For context windows under 128K tokens, cached prompts cost US$0.01 per million tokens (currently about 0.071 yuan)
For comparison, the Gemini 1.5 Flash model costs $0.30 per million output tokens, a price that took effect on August 12, 2024. The output price of the new Gemini 1.5 Flash 8B is therefore exactly half that of the original model.
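To make the per-million-token pricing concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration based on the prices quoted above for context windows under 128K tokens. The function names `cost_usd` and `request_cost` are hypothetical helpers, not part of any Google SDK, and cached-prompt pricing is left out for simplicity.

```python
from decimal import Decimal

# Prices in USD per million tokens, as quoted in the article
# (context windows under 128K tokens).
FLASH_8B_INPUT = Decimal("0.0375")
FLASH_8B_OUTPUT = Decimal("0.15")
FLASH_OUTPUT = Decimal("0.30")  # original Gemini 1.5 Flash, output only

TOKENS_PER_MILLION = Decimal(1_000_000)


def cost_usd(tokens: int, price_per_million: Decimal) -> Decimal:
    """Cost of `tokens` tokens at a given price per million tokens."""
    return Decimal(tokens) * price_per_million / TOKENS_PER_MILLION


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: total Flash 8B cost for one request (no caching)."""
    return (cost_usd(input_tokens, FLASH_8B_INPUT)
            + cost_usd(output_tokens, FLASH_8B_OUTPUT))


if __name__ == "__main__":
    # A request with 100,000 input tokens and 2,000 output tokens.
    print(request_cost(100_000, 2_000))
    # Ratio of Flash 8B output price to original Flash output price.
    print(FLASH_8B_OUTPUT / FLASH_OUTPUT)
```

`Decimal` is used instead of floating point so that small per-token amounts add up without rounding drift.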