NVIDIA’s AI Creates Beautiful Images From Your Sketches

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. I know for a fact that some of you remember
our first video on image translation, which was approximately 3 years and 250 episodes
ago. This was a technique where we took an input
painting, and a labeling of this image that shows what kind of objects are depicted, and
then, we could start editing this labeling, and out came a pretty neat image that satisfies
these labels. Then came pix2pix, another image translation
technique which, in some cases, required only a labeling. A source photo was not needed
because these features were learned from a large number of training samples. It could also perform really cool things, like
translating a landscape into a map, or sketches to photos, and more. Both of these works were absolutely amazing,
and I always say, two more papers down the line, and we are going to have much higher
resolution images. So, this time, here is the paper that is,
in fact, two more papers down the line. So let’s see what it can do! I advise you to hold on to your papers
for this one. The input is, again, a labeling which we can
draw ourselves, and the output is a hopefully photorealistic image that adheres to these
labels. I like how, at first, only the silhouette of the
rock is drawn, so we have this hollow thing on the right that is not very realistic, and
then it is filled in with the bucket tool, and there you go. It looks amazing. It synthesizes a relatively high-resolution
image and we finally have some detail in there too. But, of course, there are many possible images
that correspond to this input labeling. How do we control the algorithm to follow
our artistic goals? Well, you may remember from the first work I’ve
shown you that we could do that by adding an additional image as an input style. Well, look at that! We don’t even need to do that, because
here, we can choose from a set of input styles that are built into the algorithm and we can
switch between them almost immediately. I think the results speak for themselves,
but note that not only the visual fidelity, but the alignment with the input labels is
also superior to previous approaches. Of course, to perform this, we need a large
amount of training data where the inputs are labels, and the outputs are the photorealistic
images. So how do we generate such a dataset? Drawing a bunch of labels and asking artists
to fill them in sounds like a crude and expensive idea. Well, of course, we can do it for free by
thinking the other way around! Let’s take a set of photorealistic images,
and use already existing algorithms to create the labeling for them. If we can do that, we’ll have as many training
samples as we have images, in other words, more than enough to train an amazing neural
network. Also, the main part of the magic in this new
work is using a new kind of layer for normalizing information within this neural network that
adapts better to our input data than the previously used batch normalization layers. This is what makes the outputs crisper
and keeps semantic information from being washed away in these images. If you have a closer look at the paper in
the video description, you will also find a nice evaluation section with plenty of comparisons
to previous algorithms, and according to the authors, the source code will be released
soon as well. As soon as it comes out, everyone will be
able to dream up beautiful photorealistic images and get them out almost instantly. What a time to be alive! If you have enjoyed this episode and would
like to support us, please click one of the Amazon affiliate links in the video description
and buy something that you were looking to buy on Amazon anyway. You don’t lose anything, and this way, we
get a small kickback which is a great way to support the series so we can make better
videos for you. Thanks for watching and for your generous
support, and I’ll see you next time!
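
A note on the normalization layer mentioned above: the paper calls it spatially-adaptive normalization (SPADE). The sketch below is a simplified illustration in plain Python, not the authors' implementation. It assumes a per-class lookup table for the scale and shift (the names gamma_by_class and beta_by_class are hypothetical), whereas the real layer learns them with small convolutions over the label map. It shows one way to read the "washed away" remark: a flat region normalizes to zero on its own, but a label-dependent shift puts the semantic information back.

```python
def normalize(channel, eps=1e-5):
    """Zero-mean, unit-variance normalization of one 2D feature channel."""
    flat = [v for row in channel for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = (var + eps) ** 0.5
    return [[(v - mean) / std for v in row] for row in channel]


def label_conditioned_norm(channel, labels, gamma_by_class, beta_by_class):
    """Normalize, then scale and shift every pixel according to its label.

    Assumption: a per-class lookup stands in for the learned convolutions
    that produce the scale and shift in the actual method.
    """
    normed = normalize(channel)
    return [
        [gamma_by_class[c] * v + beta_by_class[c] for v, c in zip(row, label_row)]
        for row, label_row in zip(normed, labels)
    ]


# A completely flat feature channel: plain normalization maps it to zeros,
# no matter what the region depicts.
flat_features = [[5.0, 5.0], [5.0, 5.0]]

# Hypothetical per-class parameters (learned in a real network).
gamma = {"sky": 1.0, "rock": 2.0}
beta = {"sky": 0.5, "rock": -0.5}

all_sky = [["sky", "sky"], ["sky", "sky"]]
all_rock = [["rock", "rock"], ["rock", "rock"]]

sky_out = label_conditioned_norm(flat_features, all_sky, gamma, beta)
rock_out = label_conditioned_norm(flat_features, all_rock, gamma, beta)
```

Here normalize(flat_features) is all zeros, while sky_out and rock_out differ, so the label still shapes the output even where the features themselves carry no contrast.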

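The "other way around" trick for building the dataset, described in the transcript, can be sketched in a few lines. This is a toy illustration under stated assumptions: threshold_segmenter is a hypothetical stand-in for an existing segmentation algorithm, and the photos are tiny grayscale grids rather than real images.

```python
def build_training_pairs(photos, segment):
    """Pair each photo with the label map an existing segmenter assigns to it.

    The label map becomes the network input and the photo the target output.
    """
    return [(segment(photo), photo) for photo in photos]


def threshold_segmenter(photo, cutoff=128):
    """Hypothetical stand-in for a real segmentation model.

    Bright pixels are labeled 'sky', dark pixels 'ground'.
    """
    return [["sky" if px >= cutoff else "ground" for px in row] for row in photo]


# Two toy 2x2 grayscale "photos".
photos = [
    [[200, 210], [40, 30]],
    [[90, 220], [10, 250]],
]

pairs = build_training_pairs(photos, threshold_segmenter)
```

Every photo yields one (label map, photo) pair without anyone drawing labels by hand, which is why the number of training samples equals the number of images.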

99 thoughts on “NVIDIA’s AI Creates Beautiful Images From Your Sketches”

  1. this is amazing but at the same time sad when it comes to what art is, humans keep finding short cuts to everything these days

  2. Someday I will be able to make 3D models of my own video game characters, and make my own open world game, with AI generated voices and speech xD

  3. You can try a demo/low quality version here: https://zaidalyafeai.github.io/pix2pix/scene.html
    Please like this so other people can see 🙂

  4. can i download this program and play around with it myself? will you release a version where we can feed data ourselves? that would be awesome! i'd pay for this

  5. You could do this in 3D for easily doing amazing Computer Game worlds. Same technology, but you would need 3D lazer scan samples or multiple angle photos instead of 2D Photos. Is it possible to eventually do it with creatures too? If so we could come up with simple sketches for Monsters/Aliens and have the ai create the skin tone, texture, warts, mouth, muscle definition, etc, etc.

  6. a lot of people are talking this up. Honestly we aren't at a point where it's entirely photo realistic, if you noticed objects had a pretty wide blur outline around them. I'm not saying this is a bad demonstration because i believe you could still do some really cool abstract work on this that could translate over well in designing backgrounds without having to overuse/rely on source material.

  7. Train an A.I to translate nouns into shapes, verb into motions, etc and input it to this and say goodbye to shitty human-made adaptations

  8. ya, all that creative brain synapse stuff is over rated.. Humans will devolve into globules of advanced mindlessness.

  9. The sick thing is, this is almost like what your brain does in real life……….
    This is kind of like, how an AI algorithm would visualize the world. And in some ways, if it has enough fidelity, its own "imagination" of how the world works, isnt more valid than ours we generate in our own brains.
    Granted, humans have far higher fidelity, we take "super high res sample data input" in the sense that we get very fine light information via our eyes, but our brains DO interpolate and transform this data into our best "guess" for what it is we're looking at, based on native established parameters in our evolved brain, lots of fine lines of green data is interpreted as a field of grass, for instance. This is what the AI algorithm does too, but with far lower accuracy and input resolution.

    But my point is this, the AI's interpretation, its "guess" at what the world "looks like" based on the input data, is the same thing we're doing with our eye input, in our brains. And in some sense, who's to say which one is more accurate?
    Perhaps in the future, given enough training data and input resolution, the AI algorithm will generate a more accurate representation of reality, than our own eyes and brains can.

  10. AI can fake faces.
    AI can synthesize Joe Rogan's voice almost perfectly.
    AI can now render entire scenes like this.
    Can AI create any movie with any actors, eventually? Actors and celebrities can upload their entire body profiles as data and create timeless, immortal versions of themselves captured in that exact moment. Maybe with a few updates before death, maybe not. (I'd only hope Lucy Liu could get in on it before she's too old.)

  11. This is some awesome stuff, just imagine pretending to be an amazing photographer when you really are not.

  12. Its kinda scary to see an image that I think is real, but it's actually a computer-generated fake…

  13. The death of the technical aspect of art wasn't really what I had in mind going into the 21st century, but…alright…

  14. Can we make extremely low texture video games and use this AI to create realism?

    Or better, use as pairs of original minecraft and raytraced minecraft as input data pairs

  15. I mean I guess it's cool… It's basically just sticking images over one another, you even tell it what image to use.

  16. As someone who spends a significant amount of time merging photos to create backgrounds for architectural renders, this is exceptionally exciting.

  17. yea, cool. Good luck dealing with copyright claims with this one. 😀 Amazing tho. edit: It is using photos someone has made as "brushes" therefore, if those are not free for use, you have to pay for each photo used.

  18. You all underestimate what this can do:
    Imagine you manage to make this fluent, and then have a 3d mesh scene that only describes the labeling.
    Then let the NN generate the graphics on the fly.
    You could create TRUE photorealistic graphics with this – for all content. This means a game like this will only need a little bit of labeling and everything else will generate automatically.

  19. This is just the beginning of the process of creating a Matrix style world. Next, will be 3D models generation. So on and so forth until eventually the machine overlords take control over us.

  20. I want this so i can try to make the most STUPID thing imaginable like… Clouds + sea + a stickman made out of lake on the side

  21. People who always wanted to create Video games but were restricted due to their lack of art skill can use this now.
    God bless machine learning.

  22. Imagine if we used this with an improved version of that software that can transform any image of an object from 2d to 3d, and through virtual reality we could visit our own creations
