This project is a solo experiment on the topic. As an illustrator, I was mesmerized by the power of generative AI, but somehow the process never stuck with me. This case study uses Vana Art Studio as a starting point but discusses the broader problem of text-to-image generation. I wasn't a company employee at the time.
As a designer, I'm always curious and ready to ask 'Why?' When I noticed something felt off while creating self-portraits in Vana Art Studio, I didn't just let it go. I used usability heuristics and personal experience to identify possible issues. From my perspective, one of a designer's duties is to keep asking why.
I did some quick research on the topic, and here is what I found can help.
In my analysis of market competitors, I looked for the features that improved user interactions. Best practices among competitors included integrated text-to-image examples and a detailed, granular approach to crafting prompts.
In addition to examining the AI art sector, I extended my research to encompass examples from various other industries. I focused particularly on understanding the integration of user-friendly features such as tips, auto-fill functions, and templates.
My design strategy centers on the use of tag inputs enhanced by contextual tag suggestions. To minimize errors, the initial tag "me" is automatically populated. The contextual suggestions enrich the prompt details, encouraging a deeper engagement through intuitive recognition rather than recall. Additionally, the suggestions dropdown allows for refined search results and provides a hover feature for visual previews, further aiding in intuitive visual exploration. This approach maintains best practices automatically, minimizing cognitive load and eliminating the necessity for additional examples or guides.
Each new word or phrase, whether typed directly or selected from the dropdown, is transformed into a tag. At the same time, tag suggestions are refined based on the new context. Because not all users are well-versed in the intricacies of visual arts or photography settings, the system is built to support recognition, not recall. The hover-enabled visual exploration feature further empowers users to discover and experiment with new styles and phrases, promoting creative exploration. This could enhance user satisfaction and drive increased studio usage and revenue through a higher volume of transactions.
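To make the tag mechanics concrete, here is a minimal TypeScript sketch of how the input could behave. It is an illustration of the concept, not the Vana Art Studio implementation: the function names (`addTag`, `suggestTags`, `buildPrompt`) and the small suggestion catalog are assumptions I made up for the example. Only the pre-populated "me" tag, the turning of entries into tags, the context-based suggestions, and the comma-separated prompt come from the design itself.

```typescript
// Hypothetical model of the tag input; all names and sample data are illustrative.
type Tag = string;

// The first tag, "me", is pre-populated so the prompt always starts correctly.
const INITIAL_TAGS: readonly Tag[] = ["me"];

// Assumed catalog: for each tag already in the prompt, the tags worth suggesting next.
const SUGGESTION_CATALOG: Record<string, Tag[]> = {
  me: ["portrait", "full body", "close-up"],
  portrait: ["studio lighting", "oil painting", "85mm lens"],
  "oil painting": ["impasto", "renaissance style"],
};

// Trim, collapse inner whitespace, and lowercase so duplicates are easy to detect.
function normalize(input: string): Tag {
  return input.trim().replace(/\s+/g, " ").toLowerCase();
}

// Turn a typed or selected phrase into a tag; empty input and duplicates are ignored.
function addTag(tags: readonly Tag[], input: string): Tag[] {
  const tag = normalize(input);
  if (tag === "" || tags.includes(tag)) return [...tags];
  return [...tags, tag];
}

// Suggest tags based on the current context, most recent tag first,
// optionally filtered by what the user is typing in the dropdown.
function suggestTags(tags: readonly Tag[], query = ""): Tag[] {
  const q = normalize(query);
  const seen = new Set<Tag>(tags);
  const result: Tag[] = [];
  for (const tag of [...tags].reverse()) {
    for (const candidate of SUGGESTION_CATALOG[tag] ?? []) {
      if (seen.has(candidate) || !candidate.includes(q)) continue;
      seen.add(candidate);
      result.push(candidate);
    }
  }
  return result;
}

// Commas are inserted automatically, so the prompt structure stays valid.
function buildPrompt(tags: readonly Tag[]): string {
  return tags.join(", ");
}
```

Starting from `INITIAL_TAGS`, adding "Portrait" yields the tags `me` and `portrait`, and the suggestions shift toward lighting, medium, and lens options. The user only has to recognize a fitting tag, never recall one.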
I used progressive disclosure to avoid overwhelming users. Extra guidance, such as "Drag to reorder," appears only after three tags are added, helping users to redistribute weight inside the prompt. Editing is just a click away on any tag, simplifying adjustments.
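The progressive disclosure rule and the reorder action can be sketched in a few lines of TypeScript. This is a hedged illustration, not production code: `shouldShowReorderHint` and `moveTag` are hypothetical names, and I assume here that the pre-populated "me" tag counts toward the three-tag threshold, which the concept does not pin down.

```typescript
// Threshold taken from the design: the hint appears once three tags are present.
// Assumption: the pre-populated "me" tag counts toward this number.
const REORDER_HINT_THRESHOLD = 3;

// Show "Drag to reorder" only when there are enough tags for order to matter.
function shouldShowReorderHint(tags: readonly string[]): boolean {
  return tags.length >= REORDER_HINT_THRESHOLD;
}

// Move the tag at index `from` to index `to` and return a new array.
// Out-of-range or non-integer indices leave the order unchanged.
function moveTag(tags: readonly string[], from: number, to: number): string[] {
  const inRange = (i: number): boolean =>
    Number.isInteger(i) && i >= 0 && i < tags.length;
  if (!inRange(from) || !inRange(to)) return [...tags];
  const next = [...tags];
  const moved = next.splice(from, 1); // remove the dragged tag
  next.splice(to, 0, ...moved);       // drop it at the new position
  return next;
}
```

Dragging "oil painting" from the end of `me, portrait, oil painting` to the front produces `oil painting, me, portrait`, which is how a user would shift weight toward the style inside the prompt.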
I am happy with the trajectory and outcome of this project. The developed concept has proven its potential to make prompt creation easy and intuitive. By enforcing proper prompt structure through automatic comma insertion and enhancing detailed input via tag suggestions, this design leverages recognition over recall. It further supports optimal information hierarchy with a drag-to-reorder feature and employs progressive disclosure to minimize cognitive load effectively. Moreover, it supports visual exploration rather than constant reliance on text.