Why we built this: Running UI-related experiments can be a game-changer for conversions. Changing the color of a button or the font size of a header can lead to surprising step-function improvements (or decreases!) in growth, and experimenting with these changes in a data-driven way is the only way to maximize the opportunity. However, creating and deploying experiments like this requires engineering effort, which makes it time-consuming and expensive.
That’s why we built Coffee, which is quickly becoming one of the most popular projects for generating UI code with AI. Roughly 80% of our frontend is AI-generated. It’s been a game-changing internal tool — and now, we’re open-sourcing it. Coffee’s mission is to make it incredibly easy for any developer to build a frontend in a fraction of the time.
<aside> 💡 We’re able to write frontend code in a fraction of the time, and this allows us to automatically run growth engineering experiments that require UI code changes.
</aside>
While Coffee is built today for generating code within a React codebase, building it has given our team the foundation for generating high-quality UI engineering experiments on any frontend. Soon, Coframe will be able to perform many of the simple tasks that a growth engineer performs at the user-interface level and, eventually, the more complex ones as well.
https://www.loom.com/share/c36d03f056b04ed5815a8c2cce4b6fc6?sid=0b636762-4c98-4b90-b7bc-385799fba00d
Why we built this: To date, Coframe has improved content by refining and testing different versions based on how well past variations performed. Although this has been successful in identifying the best version among those tested, it can keep you iterating on the same underlying idea. In the field of machine learning, we often refer to this as reaching a "local maximum": basically, the best you can do in a limited scope. This process leans heavily towards the "exploit" side of the "explore versus exploit" continuum, meaning it focuses on using known strategies rather than trying out new ones.
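To make the explore-versus-exploit trade-off concrete, here is a minimal sketch using an epsilon-greedy rule. It is an illustration only, not Coframe's algorithm: the function names, the `epsilon` parameter, and the simulated conversion rates are all hypothetical.

```python
import random


def choose_variant(stats, epsilon, rng):
    """Pick a variant name with an epsilon-greedy rule (illustrative only).

    stats maps variant name -> (conversions, impressions).
    """
    if rng.random() < epsilon:
        # Explore: pick any variant uniformly at random.
        return rng.choice(sorted(stats))

    # Exploit: pick the variant with the best observed conversion rate.
    def rate(name):
        conversions, impressions = stats[name]
        return conversions / impressions if impressions else 0.0

    # Ties go to the first name in sorted order.
    return max(sorted(stats), key=rate)


def simulate(true_rates, epsilon, steps, seed=0):
    """Simulate traffic against hypothetical true conversion rates."""
    rng = random.Random(seed)
    stats = {name: (0, 0) for name in true_rates}
    for _ in range(steps):
        name = choose_variant(stats, epsilon, rng)
        conversions, impressions = stats[name]
        converted = rng.random() < true_rates[name]
        stats[name] = (conversions + int(converted), impressions + 1)
    return stats


if __name__ == "__main__":
    # Hypothetical rates: variant "b" is better, but "a" is tried first.
    rates = {"a": 0.05, "b": 0.10}
    print("exploit only:", simulate(rates, epsilon=0.0, steps=2000))
    print("with exploration:", simulate(rates, epsilon=0.1, steps=2000))
```

With `epsilon=0`, the simulation never leaves the first variant it tries, which is the local-maximum behavior described above. A small nonzero `epsilon` gives the other variant a chance to be discovered.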
However, Coframe has now taken this approach a step further. By integrating external data (think search engine optimization keywords and insights about competitors) and a more exploratory algorithm, Coframe can now introduce diverse and potentially more effective variations. With our statistical significance engine, we can quickly evaluate the results and try new experiments. This accomplishes two things:
<aside> 💡 This finds the best variants more quickly and paves the way for segment personalization.
</aside>
The above animation is a visualization of an experiment we set up to test this system. In this experiment, we defined several variants to be “winning variants”. The model was not told which variants these were; instead, it was tasked with generating brand-new variants, each of which was assigned a score based on its Jaro–Winkler similarity to the winning phrases. Jaro–Winkler is a character-level string similarity with a bonus for a shared prefix, so it rewards the mention of specific words, and we chose it because customers are frequently searching for mentions of specific words when they are evaluating a website. The more similar a generated variant is to a winning variant, the higher the score it receives. As you can see, the model is capable of efficiently exploring and identifying high-performing variants (this chart shows the progression over 20 generation steps).
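The scoring step described above can be sketched in plain Python. This is a textbook Jaro–Winkler implementation under stated assumptions, not the code used in the experiment: taking the maximum similarity over the winning variants, lowercasing before comparison, and the example phrases are all hypothetical choices.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def score_variant(candidate, winning_variants):
    """Score a candidate by its closest winning variant (an assumption)."""
    return max(
        jaro_winkler(candidate.lower(), winner.lower())
        for winner in winning_variants
    )


if __name__ == "__main__":
    # Hypothetical phrases, for illustration only.
    winners = ["Ship experiments faster", "Grow conversions with AI"]
    for candidate in ["Ship tests faster", "Welcome to our site"]:
        print(candidate, round(score_variant(candidate, winners), 3))
```

A generated variant that matches a winning phrase exactly scores 1.0, and one that shares no characters with any winning phrase scores 0.0.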