Using DALL-E and Power Automate To Create and Share Images
One big current trend, for good or ill, is AI-generated images. A number of services generate these images, and most of them offer an API at various price points. For our example today, we’ll use DALL-E 2 from OpenAI. Their introductory level gives you 50 free credits the first month and 15 free credits each month after that. One of the cool features of DALL-E 2 is that, in addition to creating new images from text prompts, you can also upload an existing image and edit it based on your suggestions.
Create Our Custom Connector
Since Power Automate doesn’t have an existing connector to OpenAI’s API, we’ll need to create a custom connector. You can see my previous blog posts on creating custom connectors here: Part 1 – Part 2. I won’t go through all the details once again, just those details relevant to the OpenAI API. Also, while OpenAI offers a number of features for their API, we’ll just stick with creating a link to the image generation API for now.
Remember that creating and using custom connectors in Power Automate requires a premium-level license.
You’ll need an account with OpenAI in order to obtain an API key to use with them. You can sign up using a Google or other account on their website. Once you have an account with OpenAI, you can obtain your API key here. Save that for later. And remember that’s your personal key. Never share it with anyone you don’t want to use your access.
For image generation, there are a few details you’ll need beyond the API key. The first is you’ll need to select the size of the image you want to be generated. There are 3 options currently: 256x256, 512x512, or 1024x1024 pixels. You’ll pass that in as a size parameter.
The second parameter you’ll need to pass is the number of images you want to be generated. It can range from 1-10. Remember that each image will cost you a credit, so those 15 free credits per month can go fast.
The last parameter you’ll need for each request is the text prompt you want it to generate the image from. This can be any string you like.
Your API call payload will look something like this:
{
  "prompt": "A knight dancing a waltz with a princess",
  "n": 4,
  "size": "1024x1024"
}
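If you want to sanity-check a payload before wiring it into the connector, the three parameters above can be sketched in a few lines of Python. This is only an illustration: the function name `build_image_payload` and the `ALLOWED_SIZES` constant are made up for this example, and the limits simply mirror the ones described above (1 to 10 images, three sizes).

```python
import json

# Sizes listed above for DALL-E 2 image generation.
ALLOWED_SIZES = ("256x256", "512x512", "1024x1024")


def build_image_payload(prompt: str, n: int = 1, size: str = "1024x1024") -> dict:
    """Return a request body for the image generation call, or raise ValueError."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}")
    return {"prompt": prompt, "n": n, "size": size}


if __name__ == "__main__":
    payload = build_image_payload("A knight dancing a waltz with a princess", n=4)
    print(json.dumps(payload, indent=2))
```

Catching a bad size or image count locally saves you from spending a request on a call that would be rejected anyway.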
A call like that might net you something like the following images:
The final version of your custom connector’s Swagger definition should look something like this:
swagger: '2.0'
info:
  version: 1.0.0
  title: Image Gen
  description: Image Gen
host: api.openai.com
basePath: /
schemes:
  - https
consumes: []
produces:
  - application/json
paths:
  /v1/images/generations:
    post:
      summary: Generate AI Image From Prompt
      description: Generate AI Image From Prompt
      operationId: GenerateAiImageFromPrompt
      parameters:
        - name: Content-Type
          in: header
          required: true
          type: string
          default: application/json
          description: Content-Type
          enum:
            - application/json
        - name: body
          in: body
          schema:
            type: object
            properties:
              prompt:
                type: string
                description: prompt
                title: Text prompt to generate image from
              'n':
                type: integer
                format: int32
                description: 'n'
                title: Number of images to generate
                enum:
                  - 1
                  - 2
                  - 3
                  - 4
                  - 5
                  - 6
                  - 7
                  - 8
                  - 9
                  - 10
              size:
                type: string
                description: size
                title: Image size to generate
                default: 1024x1024
                enum:
                  - 256x256
                  - 512x512
                  - 1024x1024
            default:
              prompt: Optimus Prime riding a dinosaur
              'n': 2
              size: 1024x1024
            required:
              - prompt
              - size
          required: true
      responses:
        default:
          description: default
          schema:
            type: object
            properties:
              created:
                type: integer
                format: int32
                description: created
              data:
                type: array
                items:
                  type: object
                  properties:
                    url:
                      type: string
                      description: url
                description: data
definitions: {}
parameters: {}
responses: {}
securityDefinitions:
  API Key:
    type: apiKey
    in: header
    name: Authorization
security:
  - API Key: []
tags: []
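To see what the connector ends up sending, here is a rough Python equivalent of the same POST, using only the standard library. Treat it as a sketch: `build_request` and `generate_images` are names invented for this example. One detail worth knowing is that OpenAI expects the Authorization header in the form "Bearer <your key>", so when the connector asks for the API key value, you will most likely need to include the "Bearer " prefix yourself (an assumption about how the connection prompt behaves, so test it).

```python
import json
import urllib.request

# Host and path taken from the Swagger definition above.
API_URL = "https://api.openai.com/v1/images/generations"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # OpenAI uses bearer-token authentication in the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_images(api_key: str, payload: dict) -> dict:
    """Send the request and return the parsed JSON response (uses credits)."""
    with urllib.request.urlopen(build_request(api_key, payload), timeout=60) as resp:
        return json.load(resp)
```

Calling `generate_images` with a real key spends credits, so `build_request` is kept separate for inspecting the headers and body first.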
Creating Our Flow
Now that our custom connector is in place for generating our AI images, it’s time to create a flow that takes advantage of that. Because we want to prompt for some text to generate the image from, we’ll go ahead and create an instant cloud flow so we can provide that prompt. From our create flow dialog, select “Instant cloud flow”, give it a name, and choose “Manually trigger a flow”. Click create.
Under our trigger, we’ll need to prompt for 3 values: the text prompt and the image size (both text inputs) and the number of images (a number input). Add the inputs so that it looks like the following:
Now we need to call our custom connector to OpenAI. Click “New step” and in the connectors list search for the custom connector you created. Then select the action for generating an AI image from the prompt. For the 3 parameters, pass in your prompt values from the trigger. It should look like this after you do.
Now all that’s left is to do something with the result. In this case, we’ll send out a tweet for each image we generated. To do so, we first need to use the HTTP action to retrieve the generated image from the URL returned by OpenAI. Click “New step” and search for the “HTTP” action. For the method select “GET” and in the URI parameter, select the URL output from our “Generate AI Image From Prompt” action. Power Automate should automatically recognize that this is an array of values and enclose the HTTP action in an “Apply to each” loop.
You should end up with the following:
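For anyone who thinks better in code than in flow diagrams, the “Apply to each” loop around the HTTP action amounts to roughly the following Python sketch. The response shape (a `data` array of objects with a `url` field) comes from the Swagger definition earlier; the function names are made up for this example.

```python
import urllib.request


def extract_urls(response: dict) -> list[str]:
    """Pull every image URL out of the 'data' array of the response."""
    return [item["url"] for item in response.get("data", []) if "url" in item]


def download_all(response: dict) -> list[bytes]:
    """Fetch each image with a GET request, like the HTTP action in the loop."""
    images = []
    for url in extract_urls(response):
        with urllib.request.urlopen(url, timeout=60) as resp:
            images.append(resp.read())
    return images
```

Each pass through the loop handles one generated image, which is why asking for 2 images produces 2 tweets later on.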
Now for our final step: Tweeting those images out for all to see. Click on “New step” and search for the “Post a tweet” action under the Twitter connector. For our tweet text parameter, we’ll enter our prompt text plus a couple of hashtags. And for our Media parameter, we’ll need to use the base64ToBinary function to convert our HTTP response into the format that Twitter needs:
base64ToBinary(body('HTTP')['$content'])
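If you are wondering why the conversion is needed: Power Automate hands binary response bodies around as a small JSON object with a `$content-type` field and a base64-encoded `$content` field. The expression above just decodes that string back into raw bytes. Here is the same idea as a Python sketch, with a made-up `content_to_bytes` helper and a tiny stand-in for the image data (only the 8-byte PNG signature, not a real image).

```python
import base64


def content_to_bytes(body: dict) -> bytes:
    """Decode the base64 '$content' field into raw bytes."""
    return base64.b64decode(body["$content"])


# Stand-in for an HTTP action body: just the PNG file signature, base64-encoded.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
sample_body = {
    "$content-type": "image/png",
    "$content": base64.b64encode(PNG_SIGNATURE).decode("ascii"),
}

if __name__ == "__main__":
    print(content_to_bytes(sample_body) == PNG_SIGNATURE)
```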
And so our “Post a tweet” action will look something like this:
Now it’s time to test our flow. Click “Save” and then “Test”. Enter a prompt, select an image size, and the number of images, and click “Run flow”. Here we have 2 tweets of “Kermit the Frog riding a bullet train”.
Conclusion
A couple of things to remember about all this. Your free credits with DALL-E won’t go very far. Also, the OpenAI API does seem to have some bandwidth issues at times, so it might not always work the first time. Just wait a bit and try again. And the images you generate are only stored for an hour on the OpenAI servers, so make sure you download them in that timeframe.
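If you ever script these calls outside Power Automate, the “wait a bit and try again” advice can be automated with a small retry helper. This is a generic sketch, not anything OpenAI prescribes: `call_with_retry`, the 3 attempts, and the doubling delay are all choices made for this example. It retries on `OSError`, which covers the network errors raised by Python’s `urllib`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on OSError wait and retry, doubling the delay each time."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is there so the helper can be tried out without actually waiting. Inside a flow, the retry policy in an action’s settings does a similar job.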
And DALL-E 2 seems to avoid anything that might infringe on a copyright. For example, the two tweets above look nothing like Kermit the Frog. They do look like a frog, but not Kermit. So take that as you will. I tried another image with Optimus Prime and, aside from a couple of wheels, nothing about the image could be construed as looking like him. Either that or DALL-E 2 just isn’t as good as some others I’ve tried.
Also, there’s an increasing controversy around the legality and ownership and ability to copyright AI-generated images. We’re a long way from coming to a conclusion on that legal matter. In the meantime, it’s a fun diversion to see what the various AI platforms generate from the prompts you give them. So take it all with a grain of salt and have some fun along the way until the lawyers and politicians ruin it for everyone.
And one last image for the road.