The Astounding Eyes of AI

The recent update for Open AI which now includes visual recognition and description is quite astounding.

I’ve now been given access to the Dalle3 and ChatGPT integration. I did some testing and it was mind blowing. It can create super realistic images, but the thing I find astounding is its ability to describe what it sees.

A picture tells thousand words….. and AI now knows them all.

But rather than have me explain what it does – let me just show you. I’ve put 3 picture below and the command to show you the output. There is also a drop box video link of each in case you want to see how it does it live. I’d love to know what you think – Just email me back

Image 1 – Pic of Tom at my local cafe this morning…

Prompt: Describe this image in detail (My local cafe)

Answer: (Click here to watch it do it)

The image appears to be set in a cozy cafe or coffee shop. Here’s a detailed description:

  1. A young man with shoulder-length black hair is positioned to the right of the image. He’s smiling and looking at the camera. He’s wearing a white t-shirt and a baseball cap with an emblem reading “Anchorage.”
  2. The cafe setting is adorned with various pieces of equipment commonly associated with coffee making. This includes an espresso machine decorated with numerous stickers such as a heart, a cat, letters like “R”, and several other designs.
  3. In front of the espresso machine, there are multiple stainless steel pitchers, possibly used for steaming milk. Several white ceramic dishes and stacks of white cups are also visible on the counter.
  4. A countertop with a wooden finish dominates the lower portion of the image. On this countertop, a couple of beverages can be seen, including one that seems to be a latte or cappuccino with foam on top, placed next to a white ceramic cup.
  5. There’s a window behind the young man, revealing an outdoor setting with a glimpse of trees and a building. It appears to be daytime based on the natural light coming through.
  6. An air conditioning unit is mounted on the wall above the window.
  7. Various other items, including bottles of syrup or flavoring, a digital screen, and other miscellaneous objects, are scattered throughout the space, adding to the ambiance of a working coffee shop.
  8. The overall vibe of the image is warm and welcoming, capturing a moment in a cafe where the barista seems to be happily engaging with the photographer.

Image 2 – Pic from us printing our first house for Macro3D

Prompt: Tell me what you think is happening in this picture

Answer: (Click here to watch it do it)

The image depicts a construction or industrial site where 3D printing technology seems to be in use for building or fabricating structures.

Here’s a breakdown:

  1. At the forefront, there’s a large red robotic arm machine labeled “ACRO 3D”. The machine appears to be a kind of 3D printer or construction robot mounted on tracks, similar to those found on heavy-duty construction machinery.
  2. This robotic machine is in the process of creating a gray, textured wall or barrier. The texture and layering of the structure suggest that it’s being built layer by layer, indicative of 3D printing methods.
  3. To the right of the image, a man wearing an orange high-visibility jacket and a beanie stands overseeing the operation. He holds a controller or tether connected to the robotic machine, suggesting he might be operating or guiding the machine.
  4. In the background, other construction equipment, like an orange tractor or loader, can be seen, hinting at a larger construction or development project taking place.
  5. The sky appears to be clear with scattered clouds, suggesting a good day for outdoor work.

In summary, it seems like a construction process is underway using advanced 3D printing technology to build or erect structures on-site. The operator is actively involved in guiding or controlling the 3D printing robot.

Image 3 – Picture from window of plane flying back into Melbourne

(This one is interesting – because I pushed the AI to give me more detail – and it did… ProTip: Don’t accept the first answer anymore than you would with a human…

Prompt: Describe this image…

Answer: (Click here to watch it do it)

The image provides an aerial view from the window of an airplane. In the foreground, there’s a portion of the airplane’s wing with a visible red emblem of a kangaroo. Below, you can see a vast urban landscape with a dense grid of streets, houses, and buildings. A significant portion of the image is dominated by a harbor area with multiple shipping containers, docks, and waterways. A river meanders through the landscape and leads to a downtown area with tall skyscrapers and modern buildings. The horizon is slightly hazy, and you can see distant mountain ranges. The overall impression is of a bustling metropolitan area viewed from above.

2nd Prompt: Where do you think it could be? Tell me more about the type of plan & airline

The emblem of a kangaroo on the airplane’s wing suggests that this is a plane from Qantas, which is Australia’s largest and flag carrier airline. Given the aerial view of the urban landscape, shipping harbor, and river leading to a downtown area with skyscrapers, it’s plausible that this could be a view of Melbourne, especially considering the Yarra River and the port areas that are characteristic of this city. However, without further context, this is just a speculation, and it could be another major city in Australia or elsewhere served by Qantas.

As for the plane, Qantas operates a range of aircraft, including the Boeing 737, Boeing 787 Dreamliner, and Airbus A330 among others. Without seeing more specific details of the plane, it’s challenging to identify the exact model. The wing’s design could be consistent with models like the Boeing 737 or 787, but again, it’s speculative without more information.

So there you have it – AI is now starting to merge its various forms of intelligence – Thought, Sight, Sound, Voice… everything. Soon it will be inside soft robots… we ain’t seen nothin’ yet.

Keep Thinking,

Steve