Microsoft’s new AI tool is a deepfake nightmare

Artificial intelligence (AI) has made remarkable strides in recent years, and it’s almost nostalgic to remember when it could only generate images from text.

Thanks to tools like Sora, generative AI has become increasingly powerful and has made the leap from photos to videos. And now, Microsoft has introduced a tool that is as impressive as it is frightening.

VASA-1 is an image-to-video AI model that can generate videos from a single photo and a voice audio snippet. The videos show synchronized facial and lip movements, as well as “a wide range of facial nuances and natural head movements that contribute to the perception of authenticity and liveliness.”

Microsoft explains how the technology works:

Key innovations include a holistic model for generating facial dynamics and head movements that operates in a face latent space, as well as the development of such an expressive and disentangled face latent space using videos.

Through extensive experiments, including evaluation of a number of new metrics, we show that our method significantly outperforms previous methods in various dimensions.

Our method not only delivers high video quality with realistic face and head dynamics, but also supports online generation of 512×512 videos at up to 40 FPS with negligible initial latency. It paves the way for real-time interactions with realistic avatars that mimic human conversational behavior.

In other words, it is capable of creating deepfake videos from a single image. It’s interesting that Microsoft insists that the tool is a “research demonstration” with “no product or API release plans.”

Apparently to allay fears, the company doesn’t expect VASA-1 to reach users’ hands any time soon.
