openai holds a low-key developer conference: significantly reducing costs and launching real-time api public beta

2024-10-02

highlights:

openai held its second developer conference in san francisco on tuesday in a low-key manner. the media was not invited to attend and no new products were released.
this developer conference will be held in san francisco, london, and singapore respectively, with the other two held on october 30 and november 21.
openai has launched four new tools for developers, shifting its focus from competing directly in end-user applications to empowering the developer ecosystem.
currently, more than 3 million developers have used the openai model to develop applications, demonstrating the attractiveness and competitiveness of its platform.

according to news on october 2, the past week has been full of challenges for openai, including the departure of top management and important fundraising activities, but the company has refocused on attracting attention at its 2024 developer conference (devday). developers leverage their ai models to build tools. openai ceo sam altman, who has received much media attention recently, did not appear at the developer conference.

at a press conference held on monday, openai’s chief product officer kevin weil assured the media present that although the company’s chief technology officer mira murati (mira murati) and chief research officer bao bob mcgrew recently announced his departure, but this change will not affect the company's progress. "i'll start by saying that bob and mira are incredible leaders," ware said with great respect. "i've been deeply influenced by them and they've been instrumental in getting us to where we are today. and we have no plans to slow down." the speed of our development.”

last year, openai held the first developer conference (devday 2023) that caused a sensation in the industry in san francisco, usa. the company made the big announcement during a 45-minute keynote, which was attended by a large number of media. it launched a series of new products and tools, including gpt-4 turbo with 128k context support, api price reduction, new assistants api, and gpt-4 turbo for visual features, dall·e 3 api, and a greatly improved json model, as well as the ill-fated gpts and app store-like platform gpt store. microsoft ceo satya nadella also made a guest appearance.

olivier godement, openai's platform product lead, said the company will no longer release new models at developer conferences, letting the models follow their own research and safety timelines. the change comes against a backdrop of openai being criticized for moving the technology too quickly. openai, which started out as a nonprofit, is in the midst of a restructuring phase that could see the nonprofit entity lose control and transform it into a traditional startup -- a move designed to help it raise capital, recruit and retain talent. but these changes are "tearing the company apart," and mulati and chief scientist ilya sutskever left because the company was growing too fast.

after experiencing high-level personnel changes after last year's developer conference, openai chose a more low-key approach to hold its developer conference this year. compared with last year's event, openai's developer conference this year appears to be more restrained. the company has previously stated that it will not invite the media to participate. according to official information, the 2nd devday developer conference will be held in san francisco, london, and singapore on october 1, october 30, and november 21 respectively. the activities include technical seminars. , group discussions, product demonstrations, etc. participants of this event can participate after successfully applying on the official website and paying a registration fee of us$450.

openai's management stated that although the company is facing leadership changes, the company still has more than 3 million developers using its ai models for development, demonstrating the attractiveness and competitiveness of its platform. nonetheless, openai is aware of increasing competition in the market, especially price pressure from competitors such as meta and google. in order to attract and retain developers, openai has reduced the cost of accessing its api by 99% over the past two years, a strategy that may be in response to challenges from competitors.

openai did not release a new artificial intelligence cutting-edge model at this developer conference. instead, it focused on ecosystem construction, choosing to focus on helping developers connect with each other and gain an in-depth understanding of new artificial intelligence functions and products. as openai transitions from industry disruptor to platform provider, its success will depend on its ability to cultivate a vibrant developer ecosystem. by providing more advanced tools, lowering costs, and increasing support, openai has laid a solid foundation for continued growth and stability in the field of artificial intelligence. while the direct impact of this strategy may not be obvious, it is expected to ultimately lead to sustainable and deeper adoption of ai across a wider range of industries.

openai launched four major innovations at this developer conference: vision fine-tuning, realtime api, model distillation and prompt caching. these new tools mark a shift in openai's strategic focus from competing directly in end-user applications to empowering its developer ecosystem.

01 prompt caching: a money-saving tool for developers

openai announced a revolutionary feature at the developer conference - prompt caching, which will significantly reduce developers' costs and operation delays. this feature can automatically identify and cache input tokens recently processed by the model, and provide price discounts of up to 50% for these cached tokens. this is a huge boon for applications that frequently use the same context.

"we've been working hard," said gudmont, openai platform product lead. "look back two years ago, gpt-3 was the dominant market leader. today, we have successfully reduced the cost by 1,000 times. i can't think of any other technology that can achieve such a significant cost reduction in two years. reduce."

this significant cost reduction opens the door for enterprises and startups of all sizes to explore new applications, especially projects that have been delayed in launching due to cost issues. now it is finally possible.

02 visual fine-tuning: a new era of visual artificial intelligence

another important announcement is the introduction of visual fine-tuning capabilities for openai’s latest large language model, gpt-4o. this new feature allows developers to leverage images and text to customize the visual understanding of their models. the implications of this feature are far-reaching and could have a significant impact on areas such as self-driving cars, medical imaging, and visual search capabilities.

openai said southeast asian food delivery and ride-hailing company grab is already using the technology to improve its mapping services. with just 100 examples, grab achieved a 20% improvement in lane counting accuracy and a 13% improvement in speed limit sign location. this real-world application demonstrates how visual fine-tuning can leverage small batches of visual training data to significantly improve the possibilities for artificial intelligence services across a variety of industries.

03 instant api: filling the gap in conversational ai

openai also launched a public beta version of its instant api. this is a new service that allows developers to create low-latency, multi-modal experiences, especially in speech-to-speech applications. this means developers can start adding chatgpt’s voice control capabilities to their apps.

to demonstrate the api's potential, openai showed off an updated version of the travel planning app wanderlust it showed at last year's conference. leveraging the instant api, users can talk directly to the app to plan their trip in a natural conversational manner.

while travel planning is just one example, instant apis open up a wide range of possibilities for voice-activated applications in a variety of industries. from customer service to education and accessibility tools, developers now have powerful new resources to create more intuitive and responsive ai-driven experiences. “whenever we design a product, we basically think about startups and enterprises,” gudmont explains. "so, in alpha testing, we have a lot of enterprises using apis, new models for new products."

instant apis inherently simplify the process of building voice assistants and other conversational ai tools, eliminating the need to combine multiple models for transcription, inference, and text-to-speech conversion. early adopters, such as health and fitness coaching app healthify, and language learning platform speak, have integrated instant apis into their products. the pricing structure of the instant api, although $0.06 per minute of audio input and $0.24 per minute of audio output is not cheap, but may still represent significant value for developers looking to create voice-based applications.

04 model distillation: a new chapter in the popularization of artificial intelligence

openai also released model distillation technology, which may be its most transformative advancement. this technology allows developers to leverage the output of advanced models such as o1-preview and gpt-4o to enhance the performance of more efficient models such as gpt-4o mini.

this innovation enables small and micro businesses to achieve capabilities comparable to large models at lower computational costs, thereby resolving a long-standing contradiction in the ai industry: between resource-intensive and accessible but limited-feature systems. gap. for example, a small medical technology startup could leverage model distillation technology to develop ai-powered diagnostic tools for rural clinics. the company is able to train a lightweight model that not only runs on standard equipment but also provides diagnostic accuracy approaching that of larger models, which will hopefully improve medical care in resource-limited areas. (wuji, specially compiled by tencent technology)

news