Sora: New artificial intelligence capable of creating videos of up to 1 minute from text

In February 2024, OpenAI, the developer of ChatGPT, announced Sora, a technology capable of creating videos from text. Its promotional videos are impressive.

The program can generate videos of up to 1 minute, maintaining visual quality and adherence to user instructions. 

Sora is capable of generating complex scenes with multiple characters, specific types of movement, and precise subject details.  The model understands not only what the user asked for in the prompt, but also how those things exist in the physical world.

SORA RESEARCH TECHNIQUES

Sora is a diffusion model: it generates a video by starting from what looks like static noise and gradually transforming it, removing the noise over many steps.
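To make the idea of "starting from noise and removing it step by step" concrete, here is a minimal sketch in Python. It is not Sora's implementation: it runs a deterministic, DDIM-style sampling loop on a tiny list of numbers instead of a video, and the noise schedule (`alpha_bar`) and the `oracle_noise_predictor` are assumptions made for illustration. A real diffusion model would use a trained neural network in place of the oracle, which here simply knows the target in advance.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Fraction of clean signal kept at step t (1.0 at t=0, close to 0 at t=num_steps).

    Hypothetical cosine-shaped schedule, chosen only for this illustration.
    """
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2


def oracle_noise_predictor(x_t, t, num_steps, target):
    """Stand-in for a trained network: it knows the target, so it recovers the noise exactly."""
    a = alpha_bar(t, num_steps)
    return [(x - math.sqrt(a) * x0) / math.sqrt(1 - a) for x, x0 in zip(x_t, target)]


def sample(target, num_steps=50, seed=0):
    """Start from pure Gaussian noise and denoise it over num_steps steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # the initial "static noise"
    errors = []
    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        eps = oracle_noise_predictor(x, t, num_steps, target)
        # Estimate the clean signal implied by the predicted noise.
        x0_hat = [(xi - math.sqrt(1 - a_t) * e) / math.sqrt(a_t) for xi, e in zip(x, eps)]
        # Move to the slightly less noisy step t-1.
        x = [math.sqrt(a_prev) * x0 + math.sqrt(1 - a_prev) * e for x0, e in zip(x0_hat, eps)]
        errors.append(max(abs(xi - x0) for xi, x0 in zip(x, target)))
    return x, errors


if __name__ == "__main__":
    final, errors = sample([0.0, 0.5, 1.0, 0.5])
    print("error after first step:", errors[0])
    print("error after last step: ", errors[-1])
```

Running the script prints how far the sample is from the target after the first and the last step, which shows the noise being removed along the way.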

It is capable of generating entire videos at once or extending generated videos to make them longer. By giving the model foresight of many frames at a time, OpenAI addresses the challenging problem of keeping a subject consistent even when it temporarily leaves the frame.

In addition to generating a video from text instructions alone, the model can take an existing still image and generate a video from it, animating the image content with precision and attention to small details. The model can also take an existing video and extend it or fill in missing frames.

According to OpenAI, Sora serves as a foundation for models that can understand and simulate the real world, a capability the company believes will be an important milestone toward achieving artificial general intelligence (AGI).

Similar to GPT models, Sora uses a transformer architecture, which unlocks superior scaling performance.

Sora represents videos and images as collections of smaller units of data called patches, each of which is akin to a token in GPT. By unifying how data is represented, OpenAI can train diffusion transformers on a wider range of visual data than was previously possible, spanning different durations, resolutions, and aspect ratios.
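The patch idea can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Sora's actual pipeline: the video is a plain nested list of single-channel pixel values, and the helper names `patchify` and `make_video` are hypothetical. The point it shows is that clips with different durations and shapes all become sequences of equally sized patches, only with a different number of them.

```python
def patchify(video, pt, ph, pw):
    """Split a video (frames x height x width nested lists) into flattened spacetime patches.

    Each patch covers pt frames, ph rows and pw columns, and plays the role of one token.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


def make_video(frames, height, width):
    """Build a toy video whose pixel value encodes its (frame, row, column) position."""
    return [
        [[t * height * width + y * width + x for x in range(width)] for y in range(height)]
        for t in range(frames)
    ]


if __name__ == "__main__":
    wide_clip = patchify(make_video(4, 4, 6), 2, 2, 2)  # longer, landscape clip
    tall_clip = patchify(make_video(2, 6, 4), 2, 2, 2)  # shorter, portrait clip
    print(len(wide_clip), len(wide_clip[0]))  # 12 patches of 8 values each
    print(len(tall_clip), len(tall_clip[0]))  # 6 patches of 8 values each
```

Because every patch has the same size, a single transformer can consume both sequences; only the sequence length changes with the clip.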

SAFETY

In addition to developing new techniques to prepare for deployment, OpenAI is leveraging the existing safety methods it built for products that use DALL·E 3, which also apply to Sora.

For example, a text classifier will check and reject text prompts that violate usage policies, such as those requesting extreme violence, sexual content, hateful imagery, celebrity likenesses, or the intellectual property of others. OpenAI has also developed robust image classifiers that review the frames of every generated video to help ensure it complies with usage policies before it is shown to the user.
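The two checks described above, one on the prompt before generation and one on every frame after it, can be pictured as a simple gate around the generator. The sketch below is purely illustrative and says nothing about how OpenAI implements its classifiers: `moderated_generate`, the toy keyword list, and the toy frame check are all hypothetical stand-ins.

```python
def moderated_generate(prompt, generate, prompt_is_allowed, frame_is_allowed):
    """Run a generator behind a prompt check and a per-frame output check.

    Returns a (status, frames) tuple; frames is empty unless status is "ok".
    """
    if not prompt_is_allowed(prompt):
        return ("rejected_prompt", [])  # blocked before any video is generated
    frames = generate(prompt)
    if not all(frame_is_allowed(frame) for frame in frames):
        return ("rejected_output", [])  # blocked before anything is shown to the user
    return ("ok", frames)


# Toy stand-ins (assumptions for illustration only, not real classifiers).
BLOCKED_TERMS = {"extreme violence", "celebrity"}


def toy_prompt_check(prompt):
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def toy_generate(prompt):
    # Frames are represented as strings so the example stays self-contained.
    return [f"frame {i}: {prompt}" for i in range(3)]


def toy_frame_check(frame):
    return "unsafe" not in frame


if __name__ == "__main__":
    for text in ["a cat on a beach", "a celebrity dancing", "an unsafe scene"]:
        status, frames = moderated_generate(text, toy_generate, toy_prompt_check, toy_frame_check)
        print(text, "->", status, len(frames))
```

Keeping the checks as pluggable functions reflects the description in the text: the prompt filter and the frame filter are separate components, and either one can stop a video from reaching the user.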

TESTS AND DESIGN 

Sora is being made available to red teamers to assess critical areas for harms or risks. It is also being offered to designers, visual artists, and filmmakers to gather feedback on how to make the model most useful to creative professionals. OpenAI is sharing its research progress early to start working with and getting feedback from people outside the company, and to give the public a sense of what AI capabilities are on the horizon.

The current model still has weaknesses: it may struggle to accurately simulate the physics of a complex scene, and it may not understand specific instances of cause and effect.

The model may also confuse spatial details of a prompt, for example mixing up left and right, and may struggle with precise descriptions of events that unfold over time, such as following a specific camera trajectory.

In another article, we talked about AI and how it can be used for your business through digital marketing. It is worth checking out!

Had you already heard of the platform? Do you think it will be innovative? How will it impact the internet?

Did you like the content? Check out other articles on our blog!
