Musical and visual creation using generative AI

Meta presents Emu Video and Emu Edit, AI tools for video generation and image editing.
YouTube, powered by Google DeepMind’s Lyria model, announces Dream Track, a music creation tool allowing users to generate original fragments with the AI-generated voice of selected artists.
Collaboration between platforms and artists to balance AI with creator protection.

This article highlights the latest developments in generative AI, focusing on Emu Video, Emu Edit and YouTube Dream Track, while analyzing the possible consequences and opinions on the role of AI in the music industry.

Meta has presented two previews: Emu Video and Emu Edit

These recent technologies offer users unprecedented control over the generation of visual content.

Emu Video: Video generation

Compared to previous methods, which chain together more models, Emu Video uses only two models to create four-second videos at a resolution of 512×512 and 16 frames per second. According to Meta, human evaluators rated Emu Video above competing systems for both quality and fidelity to the text prompt.

Emu Video generates videos from text and can also “animate” images provided by the user, guided by a text prompt.

Emu Video features:

Unified architecture for video generation tasks.
Supports text-only, image-only, and combined text-image inputs.
Factorized approach for efficient video generation.
State-of-the-art performance in human evaluations.
Ability to animate user-provided images.
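
The factorized approach in the list above can be pictured as two stages: an image is first produced from the text, and a video is then produced from the text plus that image. The sketch below is only an illustration of that data flow. `generate_image` and `generate_video` are hypothetical stand-ins that return placeholder data, not Meta's models or any published API. It also shows the clip size implied by the figures quoted earlier: 4 seconds at 16 frames per second is 64 frames of 512×512.

```python
from dataclasses import dataclass
from typing import List, Optional

# Figures quoted in the article.
WIDTH, HEIGHT = 512, 512
FPS = 16
DURATION_S = 4


@dataclass
class Image:
    width: int
    height: int
    source: str  # "generated" or "user"


@dataclass
class Video:
    frames: List[Image]
    fps: int


def generate_image(prompt: str) -> Image:
    """Hypothetical stage 1 (text -> image). Returns a placeholder."""
    return Image(WIDTH, HEIGHT, source="generated")


def generate_video(prompt: str, first_frame: Image) -> Video:
    """Hypothetical stage 2 ((text, image) -> video). Returns placeholders."""
    n_frames = FPS * DURATION_S
    return Video(frames=[first_frame] * n_frames, fps=FPS)


def make_clip(prompt: str, image: Optional[Image] = None) -> Video:
    # With a user-provided image, stage 1 is skipped and the image is "animated".
    start = image if image is not None else generate_image(prompt)
    return generate_video(prompt, start)


clip = make_clip("a dog surfing a wave")
print(len(clip.frames), "frames,", len(clip.frames) / clip.fps, "seconds")

user_clip = make_clip("make it snow", Image(WIDTH, HEIGHT, source="user"))
print(user_clip.frames[0].source)
```

Passing an image takes the "animate" path, while omitting it takes the text-only path. In both cases the second stage receives the same kind of input, which is one way to read the "unified architecture" item in the list.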

Emu Edit: Precision in image editing

On the image editing side, Emu Edit provides precise control by combining recognition and generation techniques. It modifies only the pixels relevant to the request, so that edits stay accurate and faithful to the instructions given.

Meta trained the model on a large dataset of 10 million synthesized samples, which it credits for superior performance in both instruction accuracy and image quality.

Emu Edit features:

Free-form editing via instructions.
Pixel-precise alteration.
Control with computer vision tasks.
Editing results.
State-of-the-art performance.
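
As a toy illustration of what "modifying only the relevant pixels" means, the sketch below applies a proposed edit through a mask, so that pixels outside the mask are left exactly as they were. This is an assumption made for illustration: Emu Edit works from free-form instructions, and the article does not say that it uses an explicit mask. The function name `apply_masked_edit` and the tiny grayscale "images" are hypothetical.

```python
def apply_masked_edit(image, edited, mask):
    """Return a new image that takes pixels from `edited` only where `mask` is True."""
    return [
        [new if use_new else old
         for old, new, use_new in zip(row_old, row_new, row_mask)]
        for row_old, row_new, row_mask in zip(image, edited, mask)
    ]


# 2x3 grayscale images represented as nested lists.
original = [[10, 10, 10],
            [10, 10, 10]]
proposed = [[99, 99, 99],
            [99, 99, 99]]
# Only these pixels are considered relevant to the instruction.
mask = [[False, True, False],
        [False, True, True]]

result = apply_masked_edit(original, proposed, mask)
changed = sum(
    old != new
    for row_old, row_new in zip(original, result)
    for old, new in zip(row_old, row_new)
)
print(result)   # [[10, 99, 10], [10, 99, 99]]
print(changed)  # 3
```

Counting the changed pixels, as in the last lines, is a simple way to check that an edit stayed within the region it was meant to touch.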

AI in the music industry

YouTube has announced Dream Track, an AI-based tool that lets users create original song fragments of up to 30 seconds with the AI-generated voice of selected artists.

YouTube Dream Track: How it works

Users describe an idea in a text prompt, choose one of the participating artists, and receive an original snippet of up to 30 seconds in that artist’s AI-generated voice.

It uses the Lyria model to simultaneously generate lyrics, backing tracks and vocals in the style of the selected artist.

Generated audio carries a watermark (Google DeepMind’s SynthID), so it can be identified as AI-generated content.

At the same time, YouTube has outlined policies for identifying and handling AI-generated content and is committed to working with the music industry to develop effective protections.

The use of AI in the music industry

What is your opinion on the use of AI in the music industry? As the technology advances, collaboration between digital platforms and artists becomes essential to balance progress in AI with the protection of creators. In YouTube’s framing, AI is meant to amplify human creativity, not replace it, and the company says it is approaching this work collaboratively and responsibly.