ChatGPT just got a major upgrade. OpenAI’s popular chatbot can now create images on its own, no extra help needed. This new feature is powered by GPT-4o, a model that can understand and generate text and images at the same time.
Before, when you asked ChatGPT for an image, it had to hand the request to DALL-E, a separate model that’s really good at making pictures. But this wasn’t always easy to use. You had to switch back and forth between the two models, which was frustrating.
DALL-E uses something called a diffusion model to create images. It starts with random noise and then removes that noise step by step until a clear image emerges. This works pretty well, but it can get confused if you ask it to create something complicated, like a picture with many objects.
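To make the idea of starting from noise and cleaning it up a little at a time more concrete, here is a minimal toy sketch. It is not how DALL-E is implemented: a real diffusion model uses a trained neural network to predict the noise, while the hypothetical `toy_denoise` function below cheats by already knowing the target image. The names `toy_denoise`, `generate_by_denoising`, and `mean_abs_error` are made up for this illustration.

```python
import random


def toy_denoise(image, target, strength):
    # Hypothetical stand-in for a trained denoiser. A real diffusion model
    # predicts the noise with a neural network; this toy simply moves every
    # pixel a fraction of the way toward a known target image.
    return [
        [pixel + strength * (goal - pixel) for pixel, goal in zip(row, goal_row)]
        for row, goal_row in zip(image, target)
    ]


def mean_abs_error(image, target):
    # Average per-pixel distance between the current image and the target.
    diffs = [
        abs(pixel - goal)
        for row, goal_row in zip(image, target)
        for pixel, goal in zip(row, goal_row)
    ]
    return sum(diffs) / len(diffs)


def generate_by_denoising(target, steps=20, strength=0.3, seed=0):
    rng = random.Random(seed)
    # Start from pure random noise with the same shape as the target.
    image = [[rng.gauss(0.0, 1.0) for _ in row] for row in target]
    errors = [mean_abs_error(image, target)]
    # Repeatedly apply the denoiser; each pass makes the image a bit clearer.
    for _ in range(steps):
        image = toy_denoise(image, target, strength)
        errors.append(mean_abs_error(image, target))
    return image, errors


# A tiny 3x3 "plus sign" image used as the assumed target.
target = [
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 0.0],
]
final_image, errors = generate_by_denoising(target)
print([round(e, 3) for e in errors[:5]])
```

The printed errors shrink with every step, which mirrors the way the image gets clearer as noise is removed. Note that the whole image is refined at once on every pass, rather than being drawn one piece at a time.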
The new GPT-4o model is different. It uses an autoregressive approach, building each image piece by piece from top to bottom and left to right, with every new piece based on what has already been generated. This makes it better at creating images with text and handling complicated requests.
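The top-to-bottom, left-to-right order can be sketched in a few lines. This is only a toy under stated assumptions: OpenAI has not published GPT-4o's internals here, and the hypothetical `toy_next_token` rule below stands in for a learned model. What the sketch does show is the defining property of autoregressive generation: each new value is computed only from values produced earlier.

```python
def toy_next_token(generated, width):
    # Hypothetical stand-in for a learned model. It chooses the next pixel
    # value using only pixels that already exist: the one to the left and
    # the one directly above (0 is used when there is no such neighbor).
    index = len(generated)
    left = generated[index - 1] if index % width != 0 else 0
    above = generated[index - width] if index >= width else 0
    return (left + above + 1) % 4


def generate_raster(height, width):
    generated = []
    # Raster order: fill row 0 from left to right, then row 1, and so on.
    for _ in range(height * width):
        generated.append(toy_next_token(generated, width))
    # Reshape the flat sequence into rows.
    return [generated[r * width:(r + 1) * width] for r in range(height)]


grid = generate_raster(2, 3)
print(grid)  # prints [[1, 2, 3], [2, 1, 1]]
```

Because every pixel depends on earlier ones, the sequence has to be produced in order. That is the main contrast with the noise-removal approach, which updates the whole image on every step.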
For example, you can ask GPT-4o to create an image with multiple objects, and it will do a great job. It’s also really good at creating text within images, which is something DALL-E struggled with.
OpenAI trained GPT-4o using public data and images from partners like Shutterstock. The company has rules in place to prevent the model from creating images that copy an artist’s work without permission. OpenAI also won’t let the model save images from websites that don’t allow it, and it blocks certain types of images, like those that are obscene or misleading.
This new feature is available to ChatGPT Plus, Pro, and Team members. Free users get it too, but they can create only up to three images per day. It’s also available in Sora, a tool for creating videos.
This update comes after Google tested a similar feature on its Gemini 2.0 Flash model. That model can also understand text, answer questions, and create images all on its own. But some people used it to remove watermarks from copyrighted images and create images with copyrighted characters.
How it works
GPT-4o handles text and images in a single model instead of passing requests to a separate image generator. That is why it does well at putting text inside images and following complicated requests.
Here are some examples of what you can do with GPT-4o:
- Create images with multiple objects
- Generate text within images
- Make images with specific styles or themes