Each week, it seems, a creative new AI technology emerges and enhances the capabilities of AI by leaps and bounds. ChatGPT’s meteoric rise, which earned it 100 million users in two months, is a prime example of the newest applications of AI and marked the fastest user acquisition in history at the time. Using the latest advances in AI technologies, AI photo booths have become increasingly popular, capturing the imaginations of millions.
While artificial intelligence once seemed limited to repetitive tasks, it can now assist with creative and complex work like writing, coding, and creating music and images. AI-based technologies like DALL-E 2, Midjourney, and Stable Diffusion continue to revolutionize the way we interact with AI photo booths.
So, how can brands harness this energy?
The Role of Artificial Intelligence in AI Photo Booths
We at Cognition have been following the rapid growth of AI technologies and understand the risks involved with using them. Implicit bias and accusations of art theft have made AI art controversial. Image diffusion models start from random noise determined by a seed, so the process behaves like a black box whose results are hard to predict and control. In this article, we propose ways to mitigate these issues.
Cognition has created several unique custom-themed AI photo booths, including ones with custom lighting and robotic camera movements. We can generate a virtual background displayed on a large LED screen in the style of virtual production (like that seen in The Mandalorian), fabricate a green screen to key the background, or use computer vision to separate the user from the background. Each method has its pros and cons.
Virtual Production Backgrounds for AI Photo Booths
COGNITION using virtual production backgrounds to power one of its AI photo booths.
A virtual production AI photo booth consists of a camera, a computer, and an LED wall behind your guest. It requires no compositing because the background is displayed behind the user. This produces realistic lighting and reflections, especially when paired with an RGB key light that illuminates the guest in the foreground when taking a photo or video. We can control the hue of the RGB key light to match the scene the LED wall is displaying for a more realistic and pleasing look. We recommend at least an 8’x8’x8’ footprint so the LED wall is large enough to fill the background.
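As a rough illustration of matching the key light to the scene, the sketch below derives a light color from the average color of the background image. It is only one possible approach, not a description of Cognition's setup: the function name `key_light_color`, the `max_saturation` cap, and the idea of sending the result to a light controller are all assumptions made for this example.

```python
import colorsys

import numpy as np


def key_light_color(background, brightness=1.0, max_saturation=0.5):
    """Suggest an RGB key light color from an H x W x 3 uint8 RGB background.

    Hypothetical helper: keeps the hue of the scene's average color,
    caps the saturation so skin tones are not overwhelmed, and uses a
    fixed brightness.
    """
    # Average color of the scene shown on the LED wall, scaled to 0..1.
    r, g, b = background.reshape(-1, 3).mean(axis=0) / 255.0
    hue, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    # Rebuild a color with the same hue but softer saturation.
    rgb = colorsys.hsv_to_rgb(hue, min(saturation, max_saturation), brightness)
    return tuple(int(round(channel * 255)) for channel in rgb)


if __name__ == "__main__":
    # A bluish scene yields a soft blue key light.
    scene = np.zeros((4, 4, 3), dtype=np.uint8)
    scene[..., 2] = 255
    print(key_light_color(scene))
```

Averaging the whole frame is a deliberately simple choice; sampling only the region behind the guest, or smoothing the value over time for animated backgrounds, may look better in practice.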
Pros: Best looking, no compositing necessary. Most immersive for guests and has all the benefits of newer technology.
Cons: Hardware is more expensive and requires a larger footprint.
Green Screen Backgrounds for AI Photo Booths
Without an LED wall, an AI photo booth with a green screen provides the cleanest edges when masking the background out of the photo. Software compositing can easily remove the background without artifacts. The virtual background can be rendered live on a selfie-facing screen so the guest can see what is being composited.
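To make the compositing step concrete, here is a minimal chroma key sketch using NumPy. It assumes RGB frames stored as uint8 arrays, and the thresholds (`dominance`, `min_green`) are illustrative values chosen for this example rather than tuned settings; production keyers usually also handle spill and soft edges.

```python
import numpy as np


def chroma_key_mask(frame, dominance=1.3, min_green=80):
    """Return an H x W boolean mask that is True where a pixel looks green.

    A pixel counts as green screen when its green channel is bright
    enough and clearly dominates both red and blue.
    """
    rgb = frame.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (g >= min_green) & (g > dominance * r) & (g > dominance * b)


def replace_green(frame, background):
    """Swap green screen pixels in `frame` for the same pixels of `background`."""
    mask = chroma_key_mask(frame)
    # Broadcast the 2-D mask across the three color channels.
    return np.where(mask[..., None], background, frame)


if __name__ == "__main__":
    frame = np.array([[[0, 200, 0], [200, 150, 120]]], dtype=np.uint8)
    virtual = np.full((1, 2, 3), 10, dtype=np.uint8)
    print(replace_green(frame, virtual))
```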
Pros: Good looking and easily composited.
Cons: Larger footprint, impacted by the aesthetics of the green screen.
Segmented Backgrounds for AI Photo Booths
COGNITION using a standalone AI photo booth and image segmentation to create segmented backgrounds without a green screen or LED wall.
Example of COGNITION using image segmentation to generate incredible backgrounds for one of its AI photo booths.
Similar to virtual backgrounds in Zoom, Cognition can use image segmentation to identify and separate guests from the background without a green screen or LED wall. Your AI photo booth can simply exist as a kiosk. It takes a decent amount of processing power and results may have artifacts, but it requires the smallest footprint. We have found that Google’s MediaPipe has one of the most accurate models.
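Once a segmentation model has produced a per-pixel "person" mask, the composite itself is a simple alpha blend. The sketch below assumes the mask is already available as a float array in the range 0 to 1 (for example, from a model such as MediaPipe's); it does not call any segmentation library, and `blend_with_mask` is a name invented for this example.

```python
import numpy as np


def blend_with_mask(guest, background, person_mask):
    """Alpha-blend the guest over a new background.

    guest, background: H x W x 3 uint8 RGB images of the same size.
    person_mask: H x W float array, 1.0 = guest, 0.0 = background.
    Values in between give soft edges, which can hide small artifacts.
    """
    alpha = np.clip(person_mask.astype(np.float32), 0.0, 1.0)[..., None]
    blended = (alpha * guest.astype(np.float32)
               + (1.0 - alpha) * background.astype(np.float32))
    return np.rint(blended).astype(np.uint8)


if __name__ == "__main__":
    guest = np.full((1, 3, 3), 200, dtype=np.uint8)
    scene = np.full((1, 3, 3), 100, dtype=np.uint8)
    mask = np.array([[1.0, 0.0, 0.5]], dtype=np.float32)
    print(blend_with_mask(guest, scene, mask)[0, :, 0])
```

Keeping the mask as a soft value rather than a hard true/false cutoff is a design choice: it trades a slightly blurred outline for fewer jagged edges around hair and clothing.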
Pros: Smallest footprint, no background needed.
Cons: Heavier engineering lift and more computer processing required. Artifacts are possible in your guest’s final image.
In all of these methods, Cognition can compose additional images or animations in the foreground or background of the result, such as logos, frames, or moving backgrounds. We can provide a printout, digital photo, animation, or video as the final takeaway.
Stable Diffusion Packs Advantages for AI Photo Booths
Although DALL-E 2 and Midjourney use some of the best models and have paid APIs available, we prefer Stable Diffusion because it is open source, free, and constantly improved by the community. Another advantage is that Stable Diffusion can be run locally, so it does not require internet access, which can be unreliable at conferences, festivals, or large events.
Depth to Image Model
A demo of COGNITION using Stability AI’s Depth to Image model to generate an AI photo booth background.
Earlier in 2022, Stability AI released an “Image to Image” method that combines text with an input image to generate a new image. In November 2022, Stability AI released a “Depth to Image” model that analyzes an existing image and generates a depth map from it, so the new image follows the structure of the original more accurately. This model is ideal for AI photo booth backgrounds because you can choose seed images that have open space in them and are well-suited for backgrounds.
AI Photo Booth Prompt Guidelines
AI Prompt Example: how to engineer better prompts for your AI photo booth backgrounds.
In our opinion, allowing users to create their own prompts from scratch and then adding client branding to the UGC would be poor UI and add risk for clients. Instead, we recommend giving guests multiple-choice quizzes in which each answer corresponds to a part of the prompt (invisible to the user) that generates the background. Combining prompt engineering techniques, an existing background seed, and limited user input allows us to craft safe, high-quality, unique backgrounds for any purpose.
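The multiple-choice approach can be sketched as a lookup from guest-facing answers to hidden prompt fragments. Everything in this example is hypothetical: the questions, the answer labels, the fragment wording, and the `BASE_STYLE` suffix are placeholders, not prompts Cognition has published.

```python
# Hypothetical quiz: each guest-facing answer maps to a hidden prompt fragment.
QUIZ = {
    "destination": {
        "Jungle": "lush tropical jungle",
        "Beach": "sunny beach with palm trees",
        "Spaceship": "interior of a futuristic spaceship",
    },
    "mood": {
        "Calm": "soft morning light, serene",
        "Bold": "dramatic lighting, vivid colors",
    },
}

# Fixed, curated suffix that the guest never sees or edits.
BASE_STYLE = "wide-angle background, highly detailed, photorealistic"


def build_prompt(answers):
    """Assemble the full prompt from quiz answers, rejecting free text."""
    fragments = []
    for question, options in QUIZ.items():
        choice = answers.get(question)
        if choice not in options:
            # Only pre-approved answers can ever reach the model.
            raise ValueError(f"Unsupported answer for {question!r}: {choice!r}")
        fragments.append(options[choice])
    fragments.append(BASE_STYLE)
    return ", ".join(fragments)


if __name__ == "__main__":
    print(build_prompt({"destination": "Beach", "mood": "Calm"}))
```

Because guests only ever pick from a fixed set, every possible prompt can be reviewed and test-rendered before the event.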
Stable Diffusion also allows you to add negative prompts, which are keywords you want to avoid when generating an image. We recommend that you stay away from generating images of people to avoid issues of bias or a potentially negative impact on the quality of your finished image. We can combine negative prompts with our seed image, depth model, and multiple-choice prompts to further control what gets generated.
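A minimal sketch of how a negative prompt, a seed image, and the depth model might be combined is shown below, assuming the Hugging Face `diffusers` library, its `StableDiffusionDepth2ImgPipeline`, the `stabilityai/stable-diffusion-2-depth` checkpoint, and a CUDA GPU. The negative keyword list, the `strength` value, and the fixed random seed are illustrative assumptions, and the helper names are invented for this example.

```python
# Illustrative keywords to steer the model away from people and text.
DEFAULT_NEGATIVE = ["people", "person", "face", "hands", "text", "watermark"]


def build_generation_request(prompt, extra_negative=(), strength=0.7):
    """Bundle the prompt, negative prompt, and strength for the pipeline."""
    negative = ", ".join([*DEFAULT_NEGATIVE, *extra_negative])
    return {"prompt": prompt, "negative_prompt": negative, "strength": strength}


def generate_background(request, seed_image, seed=1234):
    """Run depth-guided image-to-image generation on a seed image.

    Heavy dependencies are imported here so the request builder above
    can be used and tested without a GPU. In a real booth the pipeline
    would be loaded once at startup rather than on every call.
    """
    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
    ).to("cuda")
    # A fixed random seed makes a given prompt and image reproducible.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    return pipe(image=seed_image, generator=generator, **request).images[0]


if __name__ == "__main__":
    print(build_generation_request("lush tropical jungle", ["cars"]))
```

Fixing the random seed addresses part of the unpredictability of diffusion models: the same prompt, seed image, and seed should give the same background on the same setup, so results can be reviewed in advance.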
Example seed image being used to generate an AI photo booth background.
AI photo booth background variations using img2img with the depth2img model.
Our finished AI-generated photo booth background in action.
Original image by Yegide Matthews on Unsplash
Using the seed image of a jungle background above, here are variations we generated with Stable Diffusion that include a jungle, a beach, a spaceship, a landscape, a view underwater, an abstract drawing, a rainbow, and a city.
Many models are still being created by the community. Some are trained on datasets of only photorealistic images, while others focus on anime or cartoons. Released in March, GLIGEN is another example that gives you more control simply by placing bounding boxes in the prompt image.
Alana Balagot is a Creative Technologist, Artist and Software Engineer in Los Angeles.