What can you do with serverless GPUs?¶
- URL: https://www.youtube.com/watch?v=BvH6y7tC_eQ
- Transcribed: 2026-02-24 02:27
Overview¶
- GPUs can now be used serverlessly on Cloud Run.
- You can run open-source models such as Stable Diffusion XL.
- Build an AI inference application and generate images.
- Google's Imagen model generates photorealistic images, but you can use open models when you need more control.
- Set up Stable Diffusion XL on Cloud Run and try it out in a Colab.
- Enable the required APIs in your Google Cloud project.
- Use TorchServe to work with the Stable Diffusion XL model easily.
- Build a container with a Dockerfile and deploy it to Cloud Run.
- Specify the GPU type to use an Nvidia L4 GPU.
- Set up Cloud NAT to speed up downloads from Hugging Face.
Transcript¶
Now that I can use GPUs serverlessly in Cloud Run, what can I do with it?
Good question.
You can run different open source models on it, like Stable Diffusion XL for image generation.
And let me show you how.
Welcome to the show, Lisa.
What do you do here at Google Cloud?
I'm a product manager at Google Cloud, focusing on Cloud Run and GKE.
And my years of experience in the cloud industry have given me a deep understanding of the challenges enterprises face.
I am passionate about using this knowledge to drive product development and ensure a seamless user experience for our customers.
Seamless user experience.
I love it.
Uh so what are we building today, Lisa?
We're building a simple AI inference application that generates images.
It uses the Stable Diffusion XL model with TorchServe, running seamlessly on Cloud Run.
Once deployed, the application will allow you to generate images simply by providing a text prompt describing the image you want.
And here's an example of what it can do.
But you could also use Google's hosted Imagen model for image generation.
Uh why are we using stable diffusion?
Google's Imagen model is really good.
It's hosted by Google and generates photorealistic images.
It's easy to call from your code.
So you can get your application to market quickly if you use that.
On the other hand, if you host your own model or open models on Cloud Run, you get more control, more transparency, and customization.
And that's what we're doing today.
Ah, I see.
So you said before that we're going to set up Stable Diffusion XL in Cloud Run.
Uh can viewers try it for themselves?
Yeah, of course.
You can follow along in the Colab.
Martin, can you please make sure to include the link in the show notes?
Okay, will do.
All right.
The first step is to enable the right APIs in your Google Cloud project.
And I have done that in my project, but the Colab lists all the commands for this.
Got it.
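The video doesn't list the APIs on screen, so the services below are an assumption about what a deployment like this typically needs; the Colab is the source of truth. Enabling them looks something like this:

```shell
# Hypothetical example: the exact API list comes from the Colab.
# These are common candidates for a Cloud Build + Cloud Run workflow.
gcloud services enable \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com
```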
Next, I will create a TorchServe app.
TorchServe is an open source model serving library that makes it easier to work with the Stable Diffusion XL model.
And the Stable Diffusion XL model will run in a GPU-enabled Cloud Run service.
Then I will create a requirements file that lists all the libraries the application will use.
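A requirements file for a Stable Diffusion XL + TorchServe app would plausibly include libraries like these; this list is illustrative, not the Colab's exact (or pinned) dependencies:

```
# requirements.txt -- illustrative sketch, not the video's exact file
torch
torchserve
torch-model-archiver
diffusers
transformers
accelerate
```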
I will add a file called config.properties that gives TorchServe its configuration.
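The key names below are standard TorchServe config.properties settings, but the specific values are assumptions rather than the video's actual config:

```
# config.properties (sketch; values are placeholders)
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
default_workers_per_model=1
```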
Now, it's time to add the main code of the application.
I will create a new file called stable diffusion handler and paste the code from the Colab in there.
Whoa, that's a lot of code.
Uh where does the image generation happen?
Here, in the inference method, and that's where the real action happens.
This inference method takes a list of input prompts and generates images using the loaded models.
It first uses the pipeline to generate an initial image from the prompt.
And then it uses the refiner to refine the image and remove the artifacts.
Ah, got it.
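The base-then-refiner flow described above can be sketched as follows. The class and method names here are illustrative stand-ins, not the Colab's code; a real handler would load diffusers pipelines (e.g. StableDiffusionXLPipeline) where this sketch takes plain callables:

```python
class TwoStageGenerator:
    """Sketch of the two-stage SDXL flow: base pipeline, then refiner."""

    def __init__(self, base_pipeline, refiner):
        self.base = base_pipeline  # text prompt -> initial image
        self.refiner = refiner     # (initial image, prompt) -> refined image

    def generate(self, prompts):
        images = []
        for prompt in prompts:
            # Stage 1: generate an initial image from the prompt.
            initial = self.base(prompt)
            # Stage 2: refine the image to remove artifacts.
            final = self.refiner(initial, prompt)
            images.append(final)
        return images
```

With real pipelines, the stubs would be replaced by GPU-backed model calls, but the control flow stays the same.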
Next, I will create a shell script file that starts the TorchServe application.
I will ask Cloud Run to call this shell script.
Cloud Run runs and scales containers for us in the cloud.
So I need a Dockerfile to build that container.
Here, at the end of the Dockerfile, I'm asking Cloud Run to execute the shell script that starts the TorchServe application.
Ah, understood.
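A Dockerfile for this setup might look roughly like this; the base image and file names are assumptions for illustration, not the video's exact Dockerfile:

```dockerfile
# Illustrative sketch, not the Colab's exact Dockerfile.
FROM pytorch/torchserve:latest-gpu

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY config.properties start.sh ./

# At the end, Cloud Run executes this script, which starts TorchServe.
CMD ["./start.sh"]
```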
And now it's time to deploy the app.
Yeah, that is right.
It takes some time to download the stable diffusion XL model from the internet.
So to speed things up, let's run these two commands to set up Cloud NAT.
And that will increase the bandwidth to the Hugging Face site where the model is stored.
Sounds useful.
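The two Cloud NAT commands aren't shown in the transcript, but setting up Cloud NAT generally means creating a Cloud Router and then a NAT config on it. The router/NAT names, network, and region below are placeholder assumptions:

```shell
# Placeholder names and region; substitute your own.
gcloud compute routers create nat-router \
  --network=default --region=us-central1

gcloud compute routers nats create nat-config \
  --router=nat-router --region=us-central1 \
  --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges
```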
Uh deployment time?
Yeah, it is.
First, I will run gcloud builds submit to build the container.
And this command will build the container in the cloud on a Google-hosted build server.
It will take a while, so let's take a tea break.
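The build step would look something like this; the Artifact Registry image path is a placeholder, not the video's actual tag:

```shell
# Builds the image with Cloud Build and pushes it to Artifact Registry.
gcloud builds submit \
  --tag us-central1-docker.pkg.dev/PROJECT_ID/my-repo/torchserve-sdxl
```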
And we're back.
The build was completed successfully.
Now, I will run gcloud run deploy to deploy the built container and give it a public HTTPS address.
And it looks like you're setting up Cloud Run to use GPUs here, Lisa.
Yeah, gpu equals one means that the service will use one GPU, and the gpu-type flag specifies that the service will use the Nvidia L4 GPU.
Let's run that command.
It will take a minute to complete.
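A deploy command using the GPU flags mentioned would look roughly like this; the service name, image path, region, and resource sizes are placeholder assumptions, so verify the exact flags against the Colab:

```shell
gcloud run deploy sdxl-service \
  --image=us-central1-docker.pkg.dev/PROJECT_ID/my-repo/torchserve-sdxl \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --no-cpu-throttling \
  --cpu=4 --memory=16Gi
```

Cloud Run GPU services require CPU to stay allocated, which is why the no-cpu-throttling flag appears here.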
Looks like the deployment finished.
It did.
Let's try it out.
First, let's put the service URL in an environment variable.
Next, let's set up an environment variable for the prompt.
And I'll ask for an image of a dog running in the park, since dogs are my favorite pets.
And then I will run this curl command to send the request to my app with the prompt.
And this will also take a minute, due to the Cloud Run service cold start.
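The request step could be sketched like this. TorchServe serves models at /predictions/<model_name>, but the service name, model name, and payload shape below are assumptions, not the Colab's exact command:

```shell
# Look up the service URL and send the prompt (names are placeholders).
SERVICE_URL=$(gcloud run services describe sdxl-service \
  --region=us-central1 --format='value(status.url)')
PROMPT="a dog running in the park"

curl -X POST "$SERVICE_URL/predictions/stable_diffusion" \
  -H "Content-Type: text/plain" \
  --data "$PROMPT" \
  --output dog.png
```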
And it's done.
Here's the image.
Very nice.
Um, could the dog wear a pink shirt and sunglasses?
Sure, let's do it.
I will send another request to the app with a new prompt.
And here you are, this is the image.
How do you like it, Martin?
It looks great, Lisa.
Uh well, thank you for showing us this, Lisa.
Yeah, thanks for having me, Martin.
And thank you, everyone for watching.
If you have any questions for Lisa or me, please add them in the comments.
Also, please let me know what you thought of this episode.
I read every single comment.
I can't wait to see what you build.