Microsoft Azure AI Speech Service launches text to speech avatar to turn text into video

2024-08-23

IT Home reported on August 23 that Microsoft's Azure AI Speech Service, which lets developers build multilingual generative AI voice applications, has launched a text to speech avatar feature that converts plain text into video of a natural-sounding human speaker.

Today, Microsoft announced the general availability of the Text to Speech Avatar feature. This new feature enables developers to create personalized avatars for their users. The service outputs videos at a resolution of 1920 x 1080 at 25 frames per second.

Text to Speech Avatar has the following features:

Converts text into video of a natural-sounding human speaker, powered by Azure AI text to speech.

Offers a selection of prebuilt avatar characters.

Generates the avatar's voice with Azure AI text to speech.

Synthesizes avatar videos either asynchronously, using the batch synthesis API, or in real time.

Provides content creation tools in Speech Studio for producing video content without coding.

Enables real-time conversations with an avatar through the Live Chat Avatar tool in Speech Studio.
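
To give a sense of what asynchronous synthesis through the batch API might look like, here is a minimal Python sketch that only builds the HTTP request and does not send it. The endpoint path, API version, header name, and payload field names are assumptions based on Microsoft's public samples at the time of writing and should be checked against the official reference. The function name `build_avatar_batch_request` and the default voice, character, and style values are illustrative only.

```python
import json
import uuid


def build_avatar_batch_request(region, subscription_key, text, job_id=None):
    """Build (but do not send) a batch avatar synthesis request.

    Assumption: the endpoint path, api-version, and field names below
    mirror Microsoft's public samples; verify them against the docs.
    """
    job_id = job_id or str(uuid.uuid4())
    url = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/avatar/batchsyntheses/{job_id}?api-version=2024-08-01"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    payload = {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        # Illustrative voice name; any supported neural voice would do.
        "synthesisConfig": {"voice": "en-US-AvaMultilingualNeural"},
        # Illustrative prebuilt character and style (assumed values).
        "avatarConfig": {
            "talkingAvatarCharacter": "lisa",
            "talkingAvatarStyle": "graceful-sitting",
            "videoFormat": "mp4",
        },
    }
    return {
        "method": "PUT",
        "url": url,
        "headers": headers,
        "body": json.dumps(payload),
    }


if __name__ == "__main__":
    request = build_avatar_batch_request(
        "westeurope", "<your-key>", "Hello from an avatar."
    )
    print(request["method"], request["url"])
```

The returned dictionary can be handed to any HTTP client. Because the job runs asynchronously, a real client would then poll the same job URL until the service reports that the video is ready.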

In terms of pricing, the text-to-video service is billed per second, based on the length of the output video.

The service is now available in the Southeast Asia, North Europe, West Europe, Sweden Central, South Central US, and West US regions.
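
Per-second billing makes the cost of a clip easy to estimate. The Python sketch below shows the arithmetic. The rate used is a placeholder, not Microsoft's published price, and rounding partial seconds up is an assumption about how billing works. The function name `estimate_avatar_cost` is hypothetical.

```python
import math
from decimal import Decimal


def estimate_avatar_cost(duration_seconds, rate_per_second):
    """Estimate the charge for one output video billed per second.

    Assumption: partial seconds are rounded up to the next whole second.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billable_seconds = math.ceil(duration_seconds)
    return billable_seconds * rate_per_second


if __name__ == "__main__":
    # Placeholder rate for illustration only; check the Azure pricing page.
    example_rate = Decimal("0.01")
    print(estimate_avatar_cost(90.4, example_rate))
```

Using `Decimal` rather than floating point avoids rounding drift when many clips are summed.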