Microsoft’s Newest AI System Can Draw Images From Descriptions


Artificial intelligence is capable of doing a lot of things, and that makes us extremely excited. Not too long ago, both a Chinese company and Microsoft managed to create AI systems that could defeat humans at reading comprehension, but it doesn’t end there.

Microsoft is back yet again with a new type of artificial intelligence system that might one day change the world. Apparently, the company has created a bot that can draw images from text descriptions of an object.

Microsoft leading the charge

In a recent blog post, Microsoft says the technology can create images of anything from ordinary scenes, such as livestock walking about, to the absurd, such as a floating double-decker bus. What’s interesting is that each image contains several details that are missing from the description.

This suggests to the company that the AI system it is working on appears to have an artificial imagination. From our standpoint, it means we’re getting closer to a future where robots can think for themselves, and whether or not that’s a good thing is still up for discussion.

“If you go to Bing and you search for a bird, you get a bird picture. But here, the pictures are created by the computer, pixel by pixel, from scratch,” said Xiaodong He, a principal researcher and research manager in the Deep Learning Technology Center at Microsoft’s research lab in Redmond, Washington. “These birds may not exist in the real world — they are just an aspect of our computer’s imagination of birds.”

Xiaodong He and his colleagues started off with technology that writes photo captions automatically. It’s called CaptionBot, but we do not have much information about it.

From there, he moved on to a new bot that answers questions asked by humans about images. For example, this particular bot could gather information on the location and attributes of objects, which is great for aiding blind people.

“Now we want to use the text to generate the image,” said Qiuyuan Huang, a postdoctoral researcher in He’s group and a paper co-author. “So, it is a cycle.”

There are a lot of challenges

We should point out that image generation is a much more challenging task than image captioning, so we have to give kudos to Microsoft for breaking the barrier. The complexity comes from the need for the drawing bot to come up with ideas that are not in the description, and for artificial intelligence, this is problematic.

If things continue down this road of success, a few decades from now the world will have its very own Westworld. Let’s just hope the robots never feel the need to turn on their makers, because that wouldn’t be a fair fight for us puny humans.