I tested Gemini’s latest image generator and here are the results

Back in November, I tested the image generation capabilities within Google’s Gemini, which was powered by the Imagen 3 model. While I liked it, I ran into its limitations pretty quickly. Google recently rolled out its successor — Imagen 4 — and I’ve been putting it through its paces over the last couple of weeks.

The new version is a clear improvement: some of the issues I had with Imagen 3 are thankfully gone. But a few frustrations remain, so it isn’t quite as good as I’d like.

So, what has improved?

Image quality has improved, though not massively. Imagen 3 was already good at creating images of people, animals, and scenery, but the new version consistently produces sharper, more detailed results.

When it comes to generating images of people (which is only possible with Gemini Advanced), I had persistent issues with Imagen 3: it would create cartoonish-looking photos even when I wasn’t asking for that style. Prompting it to change the image to something more realistic was often a losing battle. I haven’t experienced any of that with Imagen 4. All the images of people it generates look very professional, perhaps a bit too much so, which is something I’ll touch on later.

One of my biggest frustrations with the older model was the limited control over aspect ratios. I often felt stuck with 1:1 square images, which severely limited their use case. I couldn’t use them for online publications, and printing them for a standard photo frame was out of the question.

While Imagen 4 still defaults to a 1:1 ratio, I can now simply prompt it to use a different one, like 16:9, 9:16, or 4:3. This is the feature I’ve been waiting for, as it makes the images created far more versatile and usable.
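If you’re wondering what those ratios mean in actual pixels, the arithmetic is simple. Here’s a minimal Python sketch that turns a ratio into a width and height. The function name `dimensions_for_ratio` and the 1024-pixel long edge are my own assumptions for illustration; Google hasn’t told me what resolutions Imagen 4 actually outputs, so treat the numbers as examples only.

```python
def dimensions_for_ratio(ratio: str, long_edge: int = 1024) -> tuple[int, int]:
    """Return (width, height) in pixels for a ratio string such as '16:9'.

    The longer side is set to `long_edge` (an assumed value, not a known
    Imagen 4 output size) and the shorter side is scaled to match.
    """
    width_part, height_part = (int(part) for part in ratio.split(":"))
    if width_part <= 0 or height_part <= 0:
        raise ValueError(f"ratio parts must be positive: {ratio!r}")

    if width_part >= height_part:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * height_part / width_part)
    # Portrait: height is the long edge.
    return round(long_edge * width_part / height_part), long_edge


if __name__ == "__main__":
    for ratio in ("1:1", "16:9", "9:16", "4:3"):
        print(ratio, dimensions_for_ratio(ratio))
```

It’s a handy way to check whether a given ratio will suit a blog header or a photo frame before you ask Gemini for it.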

Imagen 4 also works a lot more smoothly. While I haven’t found it to be noticeably faster — although a faster model is reportedly in the works — there are far fewer errors. With the previous version, Gemini would sometimes show an error message, saying it couldn’t produce an image for an unknown reason. I have received none of those with Imagen 4. It just works.

Still looks a bit too retouched

While Imagen 4 produces better images, is more reliable, and allows for different aspect ratios, some of the issues I encountered when testing its predecessor are still present.

My main problem is that the images often aren’t as realistic as I’d like, especially when creating close-ups of people and animals. Images tend to come out quite saturated, and many feature a prominent bokeh effect that professionally blurs the background. They all look like they were taken by a photographer with 15 years of experience instead of by me, just pointing a camera at my cat and pressing the shutter.

Sure, they look nice, but a “casual mode” would be a fantastic addition — something more realistic, where the lighting isn’t perfect and the subject isn’t posing like a model. I prompted Gemini to make an image more realistic by removing the bokeh effect and generally making it less perfect. The AI did try, but after prompting it three or four times on the same image, it seemed to reach its limit and said it couldn’t do any better. Each new image it produced was a bit more casual, but it was still quite polished, clearly hinting that it was AI-generated.

You can see that in the images above, going from left to right. The first one has a strong bokeh effect and the man has very clear skin, while in the other two he looks progressively older and more tired. He even starts balding a bit in the last image. That’s not really what I meant when I asked Gemini to make the image more realistic, although it does come out more casual.

Imagen 4 does a much better job with wider shots like landscapes and city skylines. These images, taken from afar, don’t include as many close-up details, so they look more genuine. Still, it can be hit or miss. An image of the Sydney Opera House looks great, although the saturation is bumped up quite a bit: the grass is extra green, and the water is a picture-perfect blue. But when I asked for a picture of the Grand Canyon, it came out looking completely artificial and wouldn’t fool anyone into thinking it was a real photo. It did perform better after a few retries, though.

Editing is better, but not quite there

One of my gripes with the previous version was its clumsy editing. When asked to change something minor — like the color of a hat — the AI would do it, but it would also generate a brand new, completely different image. The ideal scenario would be to create an image and then be allowed to edit every detail precisely, such as changing a piece of clothing, adding a specific item, or altering the weather conditions while leaving everything else exactly as is.

Imagen 4 is better in this regard, but not by much. When I prompted it to change the color of a jacket to blue, it created a new image. However, by specifically asking it to keep all other details the same, it managed to maintain a lot of the scenery and subject from the original. That’s what happened in the examples above. The woman in the third image was the same, and she appeared to be in a similar room, but her pose and the camera angle were different, making it more of a re-shoot than an edit.

Here’s another example, this time of a cat eating a popsicle. I prompted Gemini to change the color of the popsicle, and it did so while keeping a lot of the details. The cat’s the same, and so is most of the background. But the cat’s ears are now sticking out, and the hat is a bit different. Still, a good try.

Despite its shortcomings, Imagen 4 is a great tool

Even with its issues and a long wishlist of missing functionality, Imagen 4 is still among the best AI image generators available. Most of the problems I’ve mentioned are also present in other AI image-generation software, so it’s not as if Gemini is behind the competition. It seems there are significant technical hurdles that need to be overcome before these types of tools can reach the next level of precision and realism.

Other limitations are still in place, such as the inability to create images of famous people or generate content that violates Google’s safety guidelines. Whether that’s a good or a bad thing is a matter of opinion. For users seeking fewer restrictions, there are alternatives like Grok.

Have you tried out the latest image generation in Gemini? Let me know your thoughts in the comments.
