
Unveiling FancyTech: Algorithm innovation behind “strong restoration” and “hyper-convergence”

2024-08-25


In the recent wave of technological change, AIGC (AI-generated content) has become an important tool for personal expression and creation. The driving force behind this change is not only massive general-purpose models, but also deeply customized solutions focused on the needs of specific fields. Over the past two years, the development of AIGC has exceeded many people's expectations, and its applications have expanded from text generation to images and video.
Recently, Synced interviewed FancyTech, a Chinese startup that has rapidly expanded its market with standardized commercial visual-content generation products and was among the first to demonstrate the advantages of vertical models in practical applications.
Synced also examined FancyTech's latest vertical video model, DeepVideo, which addresses the challenge of accurately restoring products in generated video and integrating them naturally into the scene, ensuring that the product remains unchanged while in motion.
FancyTech's vertical model is built on an open-source underlying algorithm framework, layered with the company's own data annotation and retraining; it requires only a few hundred GPUs for continuous training iterations to achieve good generation results. By comparison, the two factors of "product data" and "training method" are more critical to the final deployed results.
Building on a large accumulation of 3D training data, FancyTech introduced ideas from spatial intelligence to guide the model's 2D content generation. For image generation, the team proposed a "multimodal feature generator" to ensure faithful restoration of the product, and used specially collected data to ensure that the product blends naturally with the background. For video generation, the team rebuilt the underlying video-generation pipeline, designing the framework and data engineering so that videos are generated centered on the product.
In addition, Synced describes how FancyTech applies spatial-intelligence research to its visual generation models. Unlike traditional generative models, the spatial-intelligence approach analyzes large amounts of sensor data with precise calibration, allowing the model to perceive and understand the real world.
FancyTech uses lidar scanning instead of traditional studio shooting and has accumulated a large volume of high-quality 3D data. Combined with 2D data for model training, this greatly enhances the model's understanding of the real world.
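The article does not disclose how the 3D and 2D data are actually combined; one plausible preprocessing step, sketched below with entirely hypothetical details, is to project a lidar point cloud into a depth map and stack it with an RGB image as a multi-channel training sample.

```python
import numpy as np

def point_cloud_to_depth(points, K, hw=(128, 128)):
    """Project a lidar point cloud (N, 3) into a depth map with a pinhole
    camera intrinsic matrix K. Hypothetical sketch; FancyTech's actual
    pipeline is not public."""
    h, w = hw
    depth = np.zeros((h, w), dtype=np.float32)
    z = points[:, 2]
    valid = z > 0                          # keep points in front of the camera
    uvw = (K @ points[valid].T).T          # project onto the image plane
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (0 <= u) & (u < w) & (0 <= v) & (v < h)
    # Keep the nearest point per pixel (simple z-buffering).
    for ui, vi, zi in zip(u[inside], v[inside], z[valid][inside]):
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi
    return depth

# Pair the depth channel with an RGB image to form a 2D+3D training sample.
rng = np.random.default_rng(0)
cloud = rng.uniform([-1, -1, 1], [1, 1, 5], size=(1000, 3))   # synthetic scan
K = np.array([[64, 0, 64], [0, 64, 64], [0, 0, 1]], dtype=float)
rgb = rng.random((128, 128, 3), dtype=np.float32)
sample = np.concatenate([rgb, point_cloud_to_depth(cloud, K)[..., None]], axis=-1)
print(sample.shape)  # (128, 128, 4)
```

Feeding such RGB-D samples to the generator is one common way to give a 2D model geometric grounding without changing its output format.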
For the challenging task of shaping light and shadow in generated visuals, FancyTech deployed multiple lights with adjustable brightness and color temperature in each capture environment, collecting as much natural light-and-shadow data as possible to improve the spatial layering of generated images.
This intensive data collection simulates the lighting of real shooting scenes, making the data better match the characteristics of e-commerce scenarios. Combined with its accumulated high-quality 3D data, FancyTech has made a series of innovations in its algorithm framework, organically combining spatial algorithms with image and video algorithms so that the model better understands the interaction between core objects and their environment.
Commercialization efforts have never stopped in the AIGC field; although there is broad consensus on its value, development directions differ. In the article, Synced details the algorithm innovations behind FancyTech's "strong restoration" and "hyper-convergence".
FancyTech's "multimodal feature generator" extracts product features along multiple dimensions and then uses those features to generate images that blend into the scene. Feature extraction is split into global and local features: global features cover basic elements such as the product's outline and color and are extracted with a VAE encoder; local features focus on product details and are extracted with a graph neural network. This approach captures fine interior detail and the relationships between key pixels, improving the accuracy with which product details are restored.
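The global/local split described above can be sketched in minimal PyTorch. Everything below is a hypothetical illustration of the stated idea, not FancyTech's actual architecture: a VAE-style encoder produces a coarse global code, a small message-passing network pools fine-grained patch features over a patch graph, and the two are concatenated as conditioning for a generator.

```python
import torch
import torch.nn as nn

class GlobalEncoder(nn.Module):
    """VAE-style encoder for coarse product features (outline, color)."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample a latent while keeping gradients.
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

class LocalGraphEncoder(nn.Module):
    """Minimal GNN stand-in: message passing over a patch graph to capture
    fine detail and relations between key regions of the product."""
    def __init__(self, in_dim=64, out_dim=256, steps=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
        self.msg = nn.Linear(out_dim, out_dim)
        self.steps = steps

    def forward(self, node_feats, adj):
        # node_feats: (N, in_dim) patch embeddings; adj: (N, N) adjacency.
        h = self.proj(node_feats)
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        for _ in range(self.steps):
            h = torch.relu(h + self.msg(adj @ h) / deg)  # mean aggregation
        return h.mean(0)  # pooled local descriptor

# Fuse global and local features into one conditioning vector.
image = torch.randn(1, 3, 64, 64)
patches = torch.randn(16, 64)                 # e.g. a 4x4 grid of patches
adj = (torch.rand(16, 16) > 0.5).float()      # hypothetical patch graph
g = GlobalEncoder()(image)                    # (1, 256)
l = LocalGraphEncoder()(patches, adj)         # (256,)
cond = torch.cat([g.squeeze(0), l])           # (512,) fed to the generator
print(cond.shape)
```

The design choice the article implies, i.e. separating a compressed global code from relational local features, lets the generator preserve exact product detail while still free to re-render the background.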
On the road to commercialization, whether a company adopts a general-purpose model or a vertical model, the ultimate goal is commercial success. Leveraging its rich, unique data and industry expertise, FancyTech has gained wide recognition at home and abroad: it has established partnerships with Samsung, LG, and the Southeast Asian e-commerce platform Lazada; in the United States it has won over local brands such as Kate Somerville and Solawave; and in Europe it has received the LVMH Innovation Award and works closely with European customers.
In addition, FancyTech provides full-pipeline automatic publishing and data-feedback functions for AI short videos, effectively driving continued growth in product sales.
The successful application of vertical models not only advances the commercial market, but also makes it easier for the general public to use AIGC technology to improve productivity.
With the popularization of technology, almost everyone can now shoot videos, record music, and share their creations with the world through their mobile phones. We look forward to a future in which AIGC once again unleashes personal creativity, allowing ordinary people to easily cross professional thresholds and turn ideas into reality, driving leaps in productivity across industries and giving rise to new ones.
Text/Link who focuses on AI