Pinterest Outlines AI Background Generation Process for Product Shots
Pinterest is developing its own AI text-to-image generation process, though Pinterest's approach is slightly different to what you're seeing in other apps.
As outlined in a new overview from the Pinterest Engineering team, Pinterest's "Canvas" model aims to provide generated options for product backgrounds, without altering the product shot itself as the main focus.
That takes a little more training. Most text-to-image models are designed to create an image based on a description, having learned to match the captions attached to existing images with their visual content. Most product shots, however, don't describe the background within the caption, so Pinterest's team has had to come up with a new way to isolate the background and foreground, and then make it easy to guide the tool with simple commands.
As per Pinterest:
"Training Pinterest Canvas gives us a strong base model that understands what objects look like, what their names are, and how they are typically composed into scenes. However, as previously stated, our goal is training models that can visualize or reimagine real ideas or products in new contexts."
So, conceptually, Pinterest is looking to use its existing database of product images to establish common framing, placement and background types, in order to better facilitate AI background generation requests.
It's a complex approach, but Pinterest has now built a system that can do this with a high level of accuracy.
"[We] use a segmentation model to generate product masks by separating the foreground and background. Existing text captions typically describe only the product while neglecting the background, which is critical to guide the background inpainting process, so we incorporate more complete and detailed captions from a visual LLM. In this stage, we train a LoRA on all UNet layers to enable rapid, parameter efficient fine-tuning. Finally, we briefly fine-tune on a curated set of highly-engaged promoted product images, to steer the model toward aesthetics that resonate with Pinners."
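The LoRA step mentioned in that quote can be illustrated with a small sketch. LoRA (low-rank adaptation) freezes a layer's weight matrix W and trains only two small matrices, B and A, whose scaled product is added to W. The pure-Python code below is a toy illustration, not Pinterest's implementation: the function names, the `alpha` scaling factor and the layer sizes are assumptions chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(w, a, b, alpha):
    """Return w + (alpha / r) * (b @ a), where r is the adapter rank.

    w has shape (d, k) and stays frozen during training.
    b has shape (d, r) and a has shape (r, k); only these are trained.
    """
    r = len(a)  # the rank is the number of rows in a
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Compare parameter counts for full fine-tuning and a rank-r adapter."""
    return {"full": d * k, "lora": r * (d + k)}
```

For a hypothetical 1024 x 1024 layer at rank 8, `trainable_params(1024, 1024, 8)` gives 16,384 adapter parameters against 1,048,576 for full fine-tuning, which is the sense in which the approach is parameter efficient.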
So, again, the system is specifically designed to generate backgrounds based on existing Pin images, while Pinterest has also sought to align the model around certain visual styles, in order to further simplify creation.
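To make the role of the product mask concrete, here is a minimal sketch of mask-based compositing, in which foreground pixels are copied from the original shot and everything else comes from the generated image. This is an assumption about how a mask can be used to keep a product untouched, not a description of Pinterest's pipeline, and the `composite` function and its arguments are hypothetical names.

```python
def composite(product, generated, mask):
    """Combine two same-sized images using a binary foreground mask.

    Each image is a list of rows of pixel values.
    mask[y][x] == 1 marks a product (foreground) pixel to preserve;
    0 marks a background pixel to take from the generated image.
    """
    if not (len(product) == len(generated) == len(mask)):
        raise ValueError("images and mask must have the same height")
    out = []
    for p_row, g_row, m_row in zip(product, generated, mask):
        if not (len(p_row) == len(g_row) == len(m_row)):
            raise ValueError("images and mask must have the same width")
        # Keep the original pixel wherever the mask marks the product.
        out.append([p if m else g for p, g, m in zip(p_row, g_row, m_row)])
    return out
```

Because masked pixels are copied straight from the original photo, the product itself is unchanged no matter what background is generated around it.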
In the end, that should enable brands to type in whatever style they like, using common descriptors, and have Pinterest's system provide options for their product shots in that aesthetic.
It's an interesting concept, which Pinterest is already testing with selected ad partners.
It could be a good way to create more variations of your Pin images, and enhance your product's appeal within different design approaches.
You can read more about Pinterest's approach to AI background generation here.