Behind the design: Adobe Photoshop Reference Image

Reshaping a generative AI beta feature to match creative intent

Two stacked images showing the before and after of a photo editing process. The top image shows the ankles of a person standing on a curb wearing black sneakers with white soles. Dotted outlines around the sneakers indicate a selection. The bottom image shows the same ankles now wearing white sneakers with green soles. A small editing dialog box appears between the images.

What happens when creative intent meets generative AI? The story behind Adobe Photoshop’s Reference Image is a testament to how listening to users and iterating quickly closed the gap between generative technology and creative vision.

With Generative Fill in Photoshop, users could already transform images simply by describing what they wanted to see. Reference Image took that capability one step further: Now, creators can upload an image, which Photoshop can use as inspiration to shape results that better match creative intent.

Staff Experience Designers Avalon Hu and Dana Jefferson, Staff Experience Researcher Roxanne Rashedi, PhD, Senior Experience Researchers TJ Jones, PhD, and Esme Xu, and Senior Experience Developer Evan Shimizu, PhD, trace the evolution of this feature from its early beta release to its current form, sharing how user insights shaped every design decision along the way.

What was the primary goal when you set out to design Reference Image?

Dana Jefferson: Our goal was to enhance and accelerate creative workflows, deliver new levels of control, and help creatives achieve the outputs they imagined by allowing them to choose the images that supply generative inspiration.

In the early beta we aimed for a simple and intuitive upload interface, but user research revealed a fundamental misunderstanding about how the reference image influenced the output. Our redesign was driven by the need to close the intent gap and introduce technology that better aligned the feature with user expectations.

What user insights did you leverage to help inform the design solution?

Roxanne Rashedi: There were four rounds of research for Reference Image with both professional (C-Pro) and non-professional (Non-Pro) creatives. The first was led by Senior Experience Researcher TJ Jones, the second by Senior Experience Researcher Esme Xu, and I led the final two.

Round 1: Early lessons about understanding and usage

TJ Jones: In our first round of research, we set out to understand how users perceived and interacted with the feature. People were quick to recognize its potential, and many could clearly see how a reference image might help improve the final output. However, while most users understood the basic UI interactions, they wanted more control over the output and felt that clearer, more descriptive interface copy could guide them through the process.

One key area of confusion was the relationship between the reference image and the prompt. Some people used the prompt field to describe aspects of the reference image, hoping to influence how it shaped the output. This showed that the connection between the image and the prompt field wasn’t intuitive. Those insights informed early explorations around interface copy and interaction patterns that would help people better understand that relationship.

Two stacked images labeled Before and After, showing an image-editing interface applied to a photo of a woman standing on a beach with the ocean in the background. In the Before image (top), the woman is wearing a bright pink jacket over a gray shirt. The pink jacket is outlined with a dotted selection border, indicating the area to be modified. The editing interface displays tools for selecting a reference image and generating changes, with a denim jacket image shown as the reference on the right panel. In the After image (bottom), the same woman is wearing a blue denim jacket over the same gray shirt. The beach and ocean remain visible in the background, and the editing toolbar shows options for confirming or adjusting the applied change.
The UI tested in Round 1 of research revealed that people were unsure about the relationship between the reference image and the prompt. We designed a slider so people could control how much the image affected the generative output.

Round 2: Better understanding the level of control people were looking for

Esme Xu: Building on what we learned in the first round, we ran a second study to dig deeper into expectations around influence and control, so we could shape the overall experience based on what we learned.

We found that users still needed more clarity on how the prompt and reference image worked together. Many people still weren’t sure what role the prompt was supposed to play or, given the presence of a reference image, how much it should affect the output. People also wanted the ability to control the image’s influence more precisely—to be able to dial it up or down depending on what they were trying to do—and for Photoshop to remove or ignore the background automatically. We began exploring clearer sliders and toggles to control influence.

Two stacked images labeled Before and After, showing an image-editing interface applied to a photo of a woman standing on a beach with the ocean in the background. In the Before image (top), the woman is wearing a bright pink jacket over a gray shirt. The pink jacket is outlined with a dotted selection border, indicating the area to be modified. The editing interface displays tools for selecting a reference image and generating changes, with a denim jacket image shown as the reference on the right panel. Also in that right panel is a slider titled Reference Match with Low and High settings. In the After image (bottom), the same woman is wearing a blue denim jacket over the same gray shirt. The beach and ocean remain visible in the background, and the editing toolbar shows options for confirming or adjusting the applied change.
Research revealed that people were unsure about the “Strength” designation on the slider. We designed a two-position slider (Low, High) under the heading “Reference match,” so people could better understand how an uploaded image affected output.

Round 3: Drag-and-drop compositing

Roxanne: Next, we shifted focus to understanding how users think about combining and influencing visual elements. We tested a drag-and-drop workflow to help place objects within a scene. Still, neither group felt it delivered easy, cohesive results: C-Pros hoped for a one-step solution to harmonize compositions, and Non-Pros struggled with perspective and alignment.

In addition, the term we used for the feature, “generative compositing,” confused Non-Pros and felt overly complex to C-Pros, who viewed generative AI as a support tool rather than one for creating new content. Both groups seemed to prefer simpler, action-oriented terms like “remove background” or “combine images.”

Workflow complexity was another hurdle. Compositing requires users to select a layer, and introducing layers early in the workflow slowed progress, especially for Non-Pros. It was time to rethink both terminology and when to introduce layers.

A Photoshop artboard interface showing a tropical beach scene with a leaning palm tree and turquoise water. A surfboard is being added near the tree and has a bounding box and transform handles on it. The layers panel is open and shows the beach photo and the surfboard image as separate layers.
When we tested a drag-and-drop approach, we found it didn’t meet the promise of the feature for either Non-Pros (who struggled with the concept of layers, adjusting perspective, and aligning details) or C-Pros (who view generative AI as a tool to streamline workflows and wanted a one-step solution to harmonize compositions).

Round 4: Generative Fill as the starting point

Roxanne: Finally, we focused on streamlining the experience. By using Generative Fill as a starting point, users could select a portion of an image and generate new content based on a prompt. A “preserve character” slider helped clarify that the process involved “stitching” an object into a new image rather than altering the original.

There was progress. The interaction felt more intuitive, but users continued to encounter challenges with precise selections, and the generated outputs often fell short of expectations. For example, when a user carefully outlined a plant holder, the model produced a smaller or overly simplified version, leaving them frustrated by the gap between effort and outcome.

The results prompted collaboration with our research science partners to enhance the model’s sensitivity to user selections so the technology would better reflect user intent and creative vision.

“Our close collaboration with design—from using a Photoshop-like prototype in research, to understanding real user workflows, and embracing ongoing iteration—gave us a deeper sense of what makes features truly useful.”
Soo Ye Kim and He Zhang, Research Scientists

What was the most unique aspect of the design process?

Evan Shimizu: The most unique aspect of the design process was our ability to quickly try out four different UI designs and interaction flows in a Photoshop-like prototype. We built an entirely new web app called “Protoshop” (Prototype + Photoshop) to test interactions, rather than trying to build them in Photoshop on the desktop.

Having a web app meant that we could quickly test and iterate on designs, without requiring users (and our team) to download and install a new version of Photoshop every time it was updated.

It might seem like building an entirely new web app that emulates Photoshop would be more work than implementing a feature in Photoshop desktop, but in the time it would take to implement a single version of the reference-image UI in Photoshop, we built the entire Protoshop web app. After that, we could make changes in days instead of weeks. Sometimes we’d make changes by the hour, as working in a web environment allowed Dana to request changes that I could implement easily. The iteration speed allowed us to experiment with the designs and identify usability issues before moving forward with user research.
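To make the iteration-speed point concrete, here is a minimal, hypothetical TypeScript sketch of how a web prototype like Protoshop could switch between UI variants for each research round through configuration. None of these names or structures come from Adobe’s code; they simply illustrate how swapping the tested design becomes a small data change rather than a new desktop build.

```ts
// Hypothetical sketch: a config-driven variant switcher for a web prototype.
// The names below are illustrative only, not taken from the actual Protoshop codebase.

type ReferenceMatch = "low" | "high";

interface ReferenceImageVariant {
  id: string;
  showStrengthSlider: boolean;      // e.g. Round 1: continuous "Strength" slider
  referenceMatch?: ReferenceMatch;  // e.g. Round 2: two-position "Reference match" control
  intentOptions: string[];          // e.g. final design: place into / swap the selected area
}

const variants: ReferenceImageVariant[] = [
  { id: "round-1-strength", showStrengthSlider: true, intentOptions: [] },
  { id: "round-2-match", showStrengthSlider: false, referenceMatch: "low", intentOptions: [] },
  {
    id: "final-simplified",
    showStrengthSlider: false,
    intentOptions: ["Place into the selected area", "Swap the selected area"],
  },
];

// Each study session loads a variant from a URL query parameter (?variant=final-simplified),
// so changing the tested design is a data edit and redeploy, not a new desktop build.
function getVariant(id: string | null): ReferenceImageVariant {
  return variants.find((v) => v.id === id) ?? variants[variants.length - 1];
}

const active = getVariant(new URLSearchParams(window.location.search).get("variant"));
console.log(`Rendering Reference Image UI variant: ${active.id}`);
```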

A photo editing interface showing a fluffy blue creature sitting on moss in a forest scene with soft sunlight. An editing panel labeled Reference Image is open, allowing the user to upload an image for generating changes. Layers and adjustment options are visible on the right side of the screen.
A screenshot of the Protoshop (Prototype + Photoshop) application.

What was the biggest design hurdle?

Avalon Hu: One of the biggest challenges of this project has been balancing our technology research team’s assumptions with genuine user needs. Often, when a new technology is developed, researchers have a specific vision for how users will interact with it. Additionally, because they’re so familiar with the system, they often know clever workarounds to carry out tasks. However, in real-world scenarios, we can’t dictate how users will engage with our technology—especially when it’s a feature within a creative tool.

We must remain open to the diverse and imaginative ways people use it. As a result, our approach has been to adapt the technology to user behavior rather than expecting users to conform to it.

How did the solution improve the in-product experience?

Dana: The original designs played a crucial role in shaping the final experience. For starters, research revealed several gaps where the feature didn’t meet user expectations. The biggest of those was that users typically expected an uploaded reference image to do more than guide the results (they expected the generated result to be an exact duplicate of the reference image). The original UI was adaptable enough that we were able to carry over many of its elements while also addressing gaps.

Two stacked images labeled First round (top) and Final design (bottom). Each image has two sets of panels showing different stages of a design workflow. The “First round” image includes reference image dialogs for uploading and selecting a denim jacket. The “Final design” image shows updated reference image dialogs, one with an upload option and another displaying a spherical object with transparency.
In our first design round (top), we incorporated four input parameters along with a Strength slider to give users control. Later iterations expanded these capabilities, introducing crop and lasso tools directly into the flyout for more precise selection. Ultimately, the final design (bottom) was greatly simplified: Users could upload a reference image, select the specific part they want to use, and choose whether to swap or place it in the designated area.

Avalon: Unlike earlier explorations, Reference Image not only enables users to fill a selection by referencing an image; it also allows them to insert an object or replace part of an image. Previously, users had to accept whatever result the system generated—often with little control. Users can now express their intent in advance and have greater influence over the outcome.

Two stacked images showing the before and after of a photo editing process. In the Before image (top), a modern kitchen has a dotted selection area on the countertop and an editing panel displaying a blue ceramic vase as the reference image. The After image (bottom) shows the same kitchen after editing, with the blue ceramic vase from the reference panel in place of the dotted selection area on the counter.
The feature today. Top: A reference image is used to place a vase into a scene; a simple upload process leads to an expanded UI that allows people to select which part of an image they want to use and how they want to use it. Bottom: When the Reference is an “Object” and the Intent is “Place into the selected area.”
Two stacked images showing the before and after of a photo editing process. In the Before image (top) a silver classic car is parked on a desert road with cacti and mountains in the background. The editing interface displays two panels: one for choosing a reference image and another showing a lush green forest scene as the reference. The car is outlined with a dotted selection border, indicating that the area around it should be modified. In the After image the same car is now in a vibrant jungle setting with dense tropical foliage, sunlit leaves, and dappled light on the ground. The editing toolbar shows options for confirming or adjusting the applied transformation.
The feature today. Top: A reference image is used to replace the desert background with a jungle background. A simple upload process leads to an expanded UI that allows people to select which part of an image they want to use, and how they want to use it (Swap the selected area). Bottom: When the Reference is the “Whole image” and the Intent is “Swap the selected area.”

What did you learn from this design process?

Dana: High-quality results are crucial for user satisfaction. I was glad that our team didn’t rush to ship the feature; instead, they continued to iterate and provide feedback on the technology until the output and behavior met users’ expectations.

Avalon: Sometimes, less really is more. In our initial design phase, we included several additional controls to give users more influence over the outcome. However, adding extra steps and options can unintentionally make experiences feel more complex and intimidating. By streamlining the flow, we were able to leverage existing features more effectively and let the new functionality shine through simplicity.

What’s next for Reference Image?

Evan and Roxanne: Reference Image was introduced in public beta to guide image generation using a visual reference. The beta offered early insights into how users engage with reference-based workflows, and the team has evolved the experience based on what we learned. The current version delivers flexibility that supports most Reference Image workflows within a familiar interaction model.

Give Reference Image a try on desktop: Open the Creative Cloud app, select “Apps,” select “Beta apps,” then install the Adobe Photoshop beta.
