
AI Interfaces Of The Future | Design Review

By Y Combinator

Summary

Key takeaways

- **Software shifts from nouns to verbs with AI**: Traditional software interfaces focus on static elements like text, forms, and buttons (nouns). AI-driven software emphasizes dynamic actions such as auto-completion, suggesting workflows, and information gathering (verbs), requiring new design tools to represent these actions visually. [00:50]
- **Voice AI needs multimodal feedback**: For voice AI interfaces, relying solely on audio feedback is insufficient. Visual cues are crucial to confirm voice recognition and indicate when the AI is responding, especially in scenarios where audio might be muted or unclear. [02:11], [02:24]
- **Latency is the AI conversation interface**: The speed at which an AI responds significantly impacts the naturalness of a conversation. High latency makes the interaction feel robotic, whereas low latency contributes to a more human-like experience, making response time a critical design element. [03:14], [06:03]
- **Canvas UIs model complex AI agents**: Visual canvas interfaces, similar to flowcharts, are emerging as effective tools for modeling and managing complex AI agents. This approach allows users to visualize, control, and understand the multi-step, decision-making processes of autonomous AI. [09:29], [11:35]
- **AI-generated content needs verifiable sources**: When AI agents retrieve information, providing inline sources or references is crucial for building user trust. This allows users to easily validate the accuracy of the AI's output, similar to footnotes in academic papers. [17:08], [18:35]
- **Blurring video trades fidelity for faster iteration**: In AI video generation, presenting a blurred version of the video with accurate audio before full processing allows users to quickly iterate on the script. This approach prioritizes immediate feedback and human-in-the-loop refinement over initial visual perfection. [33:04], [34:18]

Topics Covered

  • AI design shifts from static nouns to dynamic verbs.
  • Latency is the interface for natural AI interaction.
  • Visual workflows empower control over autonomous AI agents.
  • Build trust in AI by citing sources inline.
  • Iterate faster: trade fidelity for immediacy in AI generation.

Full Transcript

Over the next decade, new AI user interfaces are going to emerge beyond the common chat UI that we're all familiar with. So in this episode I'll be joined by Raphael Schaad, the creator of Notion Calendar, and together we'll review a bunch of cutting-edge AI interfaces. It will give us a glimpse of what the future might look like. Welcome to another episode of Design Review.

[Music]

Raphael, welcome. Thank you for joining us today.

Thank you so much for having me.

Talk a little about how you think interfaces are going to change in the age of AI. We'll look at some examples in a minute, but from a high level, what are the differences between, say, the static, web-based 2D interfaces that we're used to today and where things are going in the future?

Sure. I think a big difference is that software up until this point was mostly clear things you can point at on the screen, things that are nouns: text, forms, dropdowns, buttons, et cetera. With AI, what really changes is that so much of the design of what AI does is more verbs: it's workflows, autocomplete, auto-suggest, go out and gather some information for me, et cetera. And we don't really have the tooling yet to draw verbs on the screen. That's what's really fascinating about how this software is now emerging in this new AI world.

Awesome. Well, let's take a look at some examples that were submitted by the YC community. The first one we're going to look at here is Vapi, which is voice AI for developers.

All right, so Vapi lets developers build, test, and deploy voice agents in minutes rather than months, and it looks like there's a microphone icon here where we can just click in and give it a try.

"Welcome to Vapi. I'm an assistant that you can talk to like a person. So how's your day going?"

Going great. And it looks like this demo and website is really targeted at the developer that would actually implement this technology, not so much the user that will eventually use it.

"You're spot on. Vapi is definitely more geared towards developers who want to integrate voice assistant capabilities into their apps or phone numbers. But hey, even if you're not a developer, feel free to give it a spin."

So one of the first things I noticed was that when I was speaking, there was no visual feedback making it clear that my voice was actually being recognized by the microphone. And similarly, when the voice was answering, there was no visual indication that that's what was happening. So for example, if our laptop was on mute, we wouldn't be sure whether the demo was broken or what was going on. So it's important to pair multimodal cues, not just rely on voice, in these types of scenarios where you do have a screen; on the phone, that would be a different scenario.
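A minimal sketch of the kind of visual cue being described here, using the browser's standard Web Audio API (not Vapi's actual implementation): sample the microphone's level and drive a simple meter so the user can see their voice is being picked up.

```typescript
// Minimal mic-level meter: a sketch of the multimodal cue discussed above,
// built on the standard Web Audio API, not Vapi's actual implementation.
async function startMicMeter(meterEl: HTMLElement): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 256;
  source.connect(analyser);

  const samples = new Uint8Array(analyser.fftSize);
  function draw(): void {
    analyser.getByteTimeDomainData(samples);
    // Rough RMS volume in [0, 1]; enough to signal "we can hear you".
    let sum = 0;
    for (const s of samples) {
      const v = (s - 128) / 128;
      sum += v * v;
    }
    const rms = Math.sqrt(sum / samples.length);
    meterEl.style.width = `${Math.min(100, rms * 300)}%`;
    requestAnimationFrame(draw);
  }
  draw();
}
```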

I think one of the other interesting things to try here: when you're talking to a human, latency is really important, and also interruptions. It felt pretty fast and pretty natural when we were conversing. I wonder what would happen if we tried to interrupt it. Would it be able to handle it?

Let's see, let's try it. And one of the things that I actually do really like here, because it's developer-focused, is that they always render a little label that instantly shows you the milliseconds of delay for each answer, really building your intuition for how many milliseconds feels natural versus when it starts to feel like, oh, I'm talking to a robot.

That's a great point, because the latency is the interface in some ways: how fast it responds to you. The longer it takes, the less it feels like a natural conversation and the more it feels like you're talking to a robot, and the whole point is to make it seem like you're talking to a human.

Yeah, it almost seems like we're basically in the dev mode of this technology here. So whenever you're in this situation, think about all the things you might be able to expose, such as the milliseconds of latency. Think of it as dev mode versus production mode.
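The per-answer latency label could be as simple as timing each turn and rendering the number next to the response. A sketch; the `/api/reply` endpoint and its response shape are hypothetical:

```typescript
// Sketch of the per-answer latency label: time each model turn and render
// the milliseconds next to the response. The /api/reply endpoint and its
// response shape are made up for illustration.
interface TimedReply {
  text: string;
  latencyMs: number;
}

async function askWithLatency(utterance: string): Promise<TimedReply> {
  const start = performance.now();
  const res = await fetch("/api/reply", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ utterance }),
  });
  const { text } = (await res.json()) as { text: string };
  const latencyMs = Math.round(performance.now() - start);
  return { text, latencyMs };
}

// Usage: render "830 ms" next to the answer bubble, building the intuition
// the hosts mention for what feels human versus robotic.
```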

Yep. Let's try to interrupt her here.

"Welcome to Vapi..."

Hi, I was wondering if you could tell me what the weather is?

"...like a person. So how's your day going?"

All right, so two things happened. One, it didn't pause when you were interrupting, and two, it entirely missed your question when it actually got done with its own agenda.

Yeah. Awesome. Well, I think my takeaways here: one, it's developer-focused. Number two, showing the latency was actually key, because that's so much of what makes this feel natural as a human interface. And the voice felt like I was talking to a human, which was pretty incredible.

Pretty incredible that startups can now build these things that previously felt like something only a huge company could build.

Yeah, it's amazing. You would have thought this would be a hundred years away, not available today.

Totally. Great, thank you, Vapi.
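Before moving on, a sketch of the barge-in handling the interruption test exposed: if the user starts speaking while the agent is talking, stop playback and treat the new speech as the next turn. The `VoiceActivityDetector` is a stand-in for whatever voice activity detection you use, not any specific product's API.

```typescript
// Sketch of barge-in handling: stop the agent's audio the moment the user
// speaks, so the interruption becomes the next turn instead of being lost.
interface VoiceActivityDetector {
  onSpeechStart(cb: () => void): void;
}

class TurnManager {
  constructor(
    private tts: HTMLAudioElement,
    vad: VoiceActivityDetector,
  ) {
    vad.onSpeechStart(() => this.handleBargeIn());
  }

  private handleBargeIn(): void {
    if (!this.tts.paused) {
      this.tts.pause();          // stop talking immediately...
      this.tts.currentTime = 0;  // ...and discard the rest of the answer
      // The recognizer keeps running, so the user's question is captured
      // as the next utterance -- exactly the failure mode in the demo,
      // where the agent finished its own agenda and missed the question.
    }
  }
}
```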

Okay, next up we've got Retell AI, and they say: supercharge your call operations with voice AI. So this is another voice AI. What I thought was interesting on this one is we can actually try it with live phone calls, having their AI voice agent call us and run through some of the different scenarios they have: a receptionist, setting appointments, lead qualification, surveys. But I haven't been paying my debts, so we're gonna go with debt collection here. All right, I'll put in my phone number and my name and email address. Submitted. Wow, I'm actually getting a call, my phone is ringing.

"Hi, is this Aaron?"

Yes, this is Aaron.

"Hi Aaron, this is James from Retell debt collection company. The reason for my call today is to discuss the outstanding balance on your account, which is currently $800. This balance is past due. Have you received our email notification about this?"

Oh, I'm sorry, this is not Aaron. This is Steve.

"Oh, my apologies, Steve. Can you connect me with Aaron by any chance?"

Um, you know, I haven't seen Aaron for a while. He might be on vacation or something.

"Got it, Steve, sorry for the mix-up. We must have made a mistake. Hope you have a wonderful rest of your day."

All right, I successfully evaded my debt collection.

Awesome. First of all, latency is an issue, huh? That's what kind of breaks the illusion of this being a real person.

Yeah. I think one of the things that stood out to me is that I threw it a curveball and it totally adjusted, because when I typed into the form here I told it that my name was Aaron, and a simple system would just keep referring to me as Aaron throughout the entire call. Halfway through, I told it, oh no, actually this is not Aaron, this is Steve, and it referred to me as Steve after that.

Cool, yeah, so it actually learned from the conversation. And to build on that, there's room for improvement where you could then not just say, oh sorry, I made a mistake, but actually follow your cue of, you know, knowing Aaron, but he's on vacation. Instead it kind of just shut down, and that's when it started to feel a little bit more robotic. There's clearly more room for improvement, but it's already a great first step that it picked that up.

And again, the voice felt incredibly realistic. It felt like I was talking to a human.

I agree with you. The latency was the only thing that gave me a clue that maybe this was not a human.

Yeah. And one of the interesting things here is that this may be technology used as a first line of defense, and then you can bring a human into the loop. Maybe 50% of these calls go through robotically, and then you can bring in the human for the rest, like for example you.

Yes, absolutely. And there's a transcript and a lot more information that a human could then follow up on. So this eliminates a lot of the grunt work, or work that people are just outsourcing to try to get done in a more automated way.

Totally. So you're hinting at the back side of this product that we currently don't see, but you can imagine really rich AI UI that then shows the call center operator what happened during the call. That's a whole other side of this technology and this company.

Well, Retell AI, this was amazing, thank you.

Now let's take a look at some AI agents. Agents are effectively autonomous AI that can go out behind the scenes and interface with websites, make phone calls, interact with users, do lots of things autonomously, and then bring back their findings or complete actions on behalf of the business.

Right, they have a few high-level instructions and then just go execute autonomously.

Yep, exactly. So the first one we should take a look at here is Gumloop: 10x your output with AI automation, no coding required. I think what they're alluding to here is that when you have these agents, they're executing autonomously, making decisions at each step along the way about what to do next, and it's hard, as a human trying to oversee this and make sure they're doing the right things, to monitor that and keep them on the right track. So this, I think, is one way to do that, which is to show visual workflows, which is what we're going to get from Gumloop here.

Cool, let's give it a try. Let's start from a template. We've got one here for web scraping: scraping the YC directory. All right, let's try this template and begin customizing it. Okay, cool. So it looks like what we've got here is a big open-ended canvas that we can pan around and zoom in on, and it gives us a bunch of boxes for each step in the flow.

Mhm. And canvases have really emerged as a really interesting, almost new document type that seems to lend itself well not just to design tools or brainstorming tools, but really well to modeling these kinds of AI processes.

Yeah, it's great because it gives the user a visual overview of exactly what steps the agent is going to take, and we can control what it should do at each of these steps. So for example, the first one here is it gets input, which says specify the batch code, like W24, and go get a coffee. So this is what it should ask the user at this step. Then it combines text, where it basically combines it with this URL right here. Then it will take that and use the web agent scraper to execute the URL and get all of the data from the website. It will then extract the data that is on that website, combine text, put it in a list. And so you can see all the steps here.

So one of the things this does pretty well is using color to show different types of nodes: input, actions, output, et cetera. I almost feel like I would want a legend for which color is what. And then, because it is a canvas, having different zoom levels show different fidelity. Right now we're so zoomed out I can't read any of the small text, so why not just hide it and collapse the node into, in this case, a brown block, in this case a yellow block, to give different zoom levels different fidelities? I think that's one way this can really go.
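A sketch of the legend and zoom ideas just described: give each node type a color (so a legend is trivial to render) and collapse labels below a zoom threshold. The node types, colors, and threshold are illustrative, not Gumloop's actual implementation.

```typescript
// Sketch of the legend + zoom-level ideas: color nodes by type so a legend
// is trivial, and hide text below a zoom threshold so far-out views show
// just colored blocks. Types, colors, and threshold are illustrative.
type NodeKind = "input" | "action" | "output";

const NODE_COLORS: Record<NodeKind, string> = {
  input: "#8d6e63",  // brown
  action: "#42a5f5", // blue
  output: "#fdd835", // yellow
};

interface FlowNode {
  kind: NodeKind;
  label: string;
  x: number;
  y: number;
}

function drawNode(ctx: CanvasRenderingContext2D, node: FlowNode, zoom: number): void {
  ctx.fillStyle = NODE_COLORS[node.kind];
  ctx.fillRect(node.x * zoom, node.y * zoom, 160 * zoom, 60 * zoom);
  // Below ~50% zoom the label would be unreadable anyway, so hide it and
  // let the color carry the meaning: the "different fidelity per zoom
  // level" idea from the discussion above.
  if (zoom >= 0.5) {
    ctx.fillStyle = "#000";
    ctx.fillText(node.label, (node.x + 8) * zoom, (node.y + 34) * zoom);
  }
}
```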

And then here we have a pretty linear flow, but the canvas, and modeling these AI agent decision trees, gets really powerful when it isn't something you could just linearly write in a document like a recipe: first do this, then do this, then do this. Really, the power is in the multi-dimensionality, in the branching. So as a starter template to explain the power of this tool to model these processes, I think one that is multi-dimensional would really showcase it.

Yep, that's a great point. Another advantage here is that you can just add text blocks next to these as instructions. So here was a template that we just pulled down to get started with, and it's got big text explaining: if you're going to customize this, here are some things you should consider. That's helpful as you're jumping in and trying to learn a product for the first time. But I think we're going to see more and more of this canvas interface as we have complex workflows that are customizable by users. Ten years from now, this may be the standard way to interact with, control, and monitor a lot of these agents. If they're truly everywhere and doing so much of the work that we as humans don't want to spend our time doing, this is going to be the way we control it.

Yeah, and the interesting thing is that what is old is new again. Flowcharts, et cetera; chip designers 50 years ago would probably say, oh yeah, we used to model our things like that. So it's interesting to see this paradigm getting resurfaced in the AI era. We didn't invent this today; we're building on a lot of legacy, standing on the shoulders of giants here.

Yeah, and it's always historically been static, and it seems like what's new is actually making it interactive.

Yeah. Awesome, thank you Gumloop. This is a pretty incredible product, and you can tell a lot of attention to detail was paid here.

All right, next up for AI agents we're going to take a look at AnswerGrid. AnswerGrid has answers at scale, and there's a text box here where I can put in some input, but it looks like they've got some suggestions for me.

And this is a really nice pattern. Whenever you have a free-form text box where you're expected to prompt, to write text as an instruction for the LLM, the AI engine, to use as an input, having some examples, turning examples into buttons where with a single click you can fill out a pretty reasonable example, is something that's really good to start with.

Yep. Sometimes you come here and you're like, I'm not sure what to use this for, and you're staring at the blank canvas, effectively. This gives you some suggestions to make it easy to get to the value of the product.

And I would take this even a level further. I think it's not just for examples as a demo, but what if you could guess or infer, from other context of your application, what might be a good prompt even as you're using the application, and then make them single-click, as opposed to just being static examples? So collapsing suggested prompts into buttons, I think, is a really interesting dynamic pattern here.
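The "examples as buttons" pattern is only a few lines of DOM code. A sketch; the element IDs and the idea of feeding in dynamically inferred suggestions are assumptions for illustration:

```typescript
// Sketch of the "turn example prompts into buttons" pattern: one click
// prefills the free-form box with a known-good prompt.
function renderSuggestions(
  input: HTMLInputElement,
  container: HTMLElement,
  suggestions: string[],
): void {
  for (const text of suggestions) {
    const button = document.createElement("button");
    button.textContent = text;
    button.addEventListener("click", () => {
      input.value = text; // prefill; the user can still edit before running
      input.focus();
    });
    container.appendChild(button);
  }
}

// As discussed above, the same mechanism could serve dynamically inferred
// prompts, not just static demo examples. Selectors are hypothetical.
renderSuggestions(
  document.querySelector<HTMLInputElement>("input#query")!,
  document.querySelector<HTMLElement>("#suggestions")!,
  ["AI companies in San Francisco", "B2B fintech startups in the W24 batch"],
);
```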

Yeah. I'm about to click on AI companies in San Francisco, and there's a lot that happens behind this seemingly simple button. All right, so it pre-filled "AI companies in San Francisco," and wow, that's pretty fast. Okay, so I get a bunch of companies that I'm sure many of the audience have heard of: OpenAI, Anthropic, Perplexity. And it's got some information like HQ location, industry, and website URLs. I think what's interesting here is to click on the plus button, and we can add something different, so let's just say funding raised.

And basically what happened here is we took a prompt as an input and we got a spreadsheet, structured data, as an output. In the background it went to these websites, scraped them, and assembled this spreadsheet. Now what we can do is add columns and have the agent go out again, not on a static set of predefined columns, but our columns, things we want to know, putting the human back into the loop.

Yeah, and what's cool is it's almost like every cell of the spreadsheet gets its own AI agent to get the data that we want, which is pretty incredible. It's like a spreadsheet on steroids. Okay, let's figure out the funding that was raised for each of these companies. So I just put "funding raised" here; this will be a web search. I'll let them pick the format that comes back, and it looks like they give a couple of other suggestions, like employees and things like that, but let's just run this and see what happens. Cool, so you can see each agent working, we get the feedback in every single cell. Curious whether they're going to come back one at a time or all at the same time. Oh, all at the same time.
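A sketch of the "one agent per cell, all dispatched at once" behavior observed here: fire one lookup per row for the new column and fill each cell as its result lands. The `/api/lookup` endpoint is hypothetical.

```typescript
// Sketch of "every cell gets its own agent": dispatch one lookup per row
// for the new column, concurrently. The /api/lookup endpoint is made up.
interface Row {
  company: string;
}

async function fillColumn(
  rows: Row[],
  column: string,
  onCell: (rowIndex: number, value: string) => void,
): Promise<void> {
  // Dispatch all lookups at once -- the "all at the same time" behavior
  // in the demo -- rather than awaiting one row at a time.
  await Promise.all(
    rows.map(async (row, i) => {
      const res = await fetch("/api/lookup", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ company: row.company, column }),
      });
      const { value } = (await res.json()) as { value: string };
      onCell(i, value); // each cell updates independently as it resolves
    }),
  );
}
```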

An interesting choice was the run button: it was this arrow up. First of all, a very interesting choice, like, oh, send the agents up to the cloud almost, send them off. And then, as a second point, maybe there's some improvement to make this button a little bit clearer, since you actually want people to click it: basically run, play, go.

Yep, a play button would have been good too, right. What's interesting here is, okay, the data we get back is 470 million, 73.6. It feels like it's missing the unit for the condensed version, and maybe that was a setting I should have added; I told it to pick the data type that was coming back here. But I notice as I click into each of these, it shows me the answer, 6.6, and it also shows the sources inline, and you can see "OpenAI raises 6.6 billion." So that tells me, okay, we're missing a B. But this is actually really helpful, that you can click into each of these and see where it got the information from.

This is another common pattern that we see: if AI is going out and doing a thing, how do you know you can trust the results it brings back? Sometimes it hallucinates, sometimes it gets the wrong thing. So by having a source closely attached, where you can just click on each of these right here and see immediately where the sources came from, it helps us validate and trust the data that the AI agent is bringing back.

Totally. When you Googled in the past, you just had a list of websites, a list of basically the references or the links, and they were your destination. But now that you ask a chat box and you get the answer back, you want to have the links, the references, inlined. I believe it was maybe Perplexity that did that pattern first, where you had these little round numbered dots right in line with the answer, showing you which segment or fragment of the answer comes from where. That's a really nice pattern that is used here too, and it could even be used in other contexts or be inlined here. I guess in a spreadsheet it works to pull it out into its own popover when you're looking for space and need to condense, but the pattern of having the footnotes almost directly in the answer is a really, really successful pattern.
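The inline-footnote pattern amounts to storing source references alongside answer fragments and rendering numbered markers. A minimal sketch, with data shapes assumed for illustration:

```typescript
// Sketch of the inline-citation pattern: keep source references next to
// each answer fragment and render numbered markers, footnote-style.
interface Source {
  url: string;
  title: string;
}

interface AnswerFragment {
  text: string;
  sourceIndexes: number[]; // indexes into the sources array
}

function renderAnswer(fragments: AnswerFragment[], sources: Source[]): string {
  const body = fragments
    .map((f) => {
      const marks = f.sourceIndexes.map((i) => `[${i + 1}]`).join("");
      return `${f.text}${marks}`;
    })
    .join(" ");
  const refs = sources
    .map((s, i) => `[${i + 1}] ${s.title} - ${s.url}`)
    .join("\n");
  return `${body}\n\n${refs}`;
}

// Example: "OpenAI raised $6.6B[1]", with [1] resolving to the scraped
// page, so the user can validate the cell instead of trusting it blindly.
```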

It's also interesting, you mentioned before how we always had flowcharts, and these are like modern flowcharts with the canvases. And it's interesting too that citing sources in footnotes is not a new thing; it's been around since the beginning of books. But now it's being used in a new way, to validate and verify, in real time, information that an agent brings back, which is really cool.

Just like in academic papers of the past, with footnotes you have your references, which paper or data source you actually drew a conclusion from. Now we see this pattern more and more in software.

Yep. Well, very cool, thank you AnswerGrid. This is really well done.

All right, so next up let's take a look at another kind of AI interface that I think has become pretty common, which is: you put in a prompt and you get some kind of output, or an action happens.

First up we've got Polymet. Polymet: an AI product designer at your service. Okay, very interesting. So you can design and iterate faster with AI, get production-ready code, and ship faster. All right, let's see, my first project. Sure, let's start there. Okay, so we get the prompt box, which is the core element of a lot of these prompt-to-output interfaces that we see. It says explain what you want to build, and similarly to one we looked at before, it gives us a bunch of pre-built prompts that are ready to go. And it seems to take multimodal input: we see little icons for the microphone and even an image, so it looks like we might be able to upload a sketch of an interface, for example, and it will turn it into the actual thing. Let's try an intermediate one. This says: create a dashboard for a treasury management software with a floating, glassmorphic, collapsible sidebar (very specific) and a super dark orange gradient in the background. All right, that sounds on brand, so let's click our tiny little button here. At least it's orange.

At least it's orange. And okay, so we've got some animations that are trying to keep us entertained and engaged. It says at the bottom, "assembling pixels with tweezers," so there's some humor in the messages that are popping up. But are we going to be waiting 10 seconds or 10 minutes here? It's a little unclear.

a little unclear yeah they kind of

appear and disappear pretty quickly I

almost um for kind of more technical

audience um would love to see kind of

like a log to actually see kind of

what's going on under the hood and then

also kind of Step by Step just kind of

keeping it on the screen um so you kind

of see the progress of the machine uh uh

in the background yeah it's tough to

tell you never know um you know cuz

because this is one of the main

challenges with uh this new kind of

interface is it takes a long time to

generate from a prompt a very complex

output like an editable web page that

can then be safe to code or um a big

graphic or a video um you know a lot of

times you can go prompt to video and so

how do you keep the user engaged if it's

a short enough window where you can just

wait for the output or if it's going to

be you know three minutes five minutes

10 minutes longer than that you know

maybe you need a way to just tell them

to come back and that you'll email them

or message them or something when it's

ready and maybe some good prior art um

to borrow from there is today's uh kind

of uh uh meta search engines for flights

and they also take a while and they

already show you kind of in a low

resolution some early results and you

can always start to interact with the

filters ask and more results come in yep
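The persistent, step-by-step log being asked for maps naturally onto a streamed sequence of status events. A sketch using server-sent events; the `/api/jobs` endpoint and event shape are hypothetical:

```typescript
// Sketch of the step-by-step progress log: stream status events from the
// generation job and append them, instead of flashing jokey messages.
// The SSE endpoint and event shape are made up for illustration.
interface StepEvent {
  step: string;      // e.g. "Generating layout", "Writing component code"
  progress: number;  // 0..1, so the UI can hint at 10 seconds vs 10 minutes
}

function watchJob(jobId: string, log: HTMLElement): void {
  const events = new EventSource(`/api/jobs/${jobId}/events`);
  events.onmessage = (msg) => {
    const event = JSON.parse(msg.data) as StepEvent;
    const line = document.createElement("div");
    line.textContent = `${Math.round(event.progress * 100)}% - ${event.step}`;
    log.appendChild(line); // keep every step on screen, not just the latest
    if (event.progress >= 1) events.close();
  };
}
```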

Yep. Okay, great. Wow, this is pretty cool. It looks like we've got some glass. I'd never heard "glassmorphic" before, but the glass kind of UI, I would call it that.

Yeah. So the prompt was basically completely free-form; there were examples for it, but I might not be familiar with some of these design terms, glassmorphic or skeuomorphic or flat design or gradient or whatever. So maybe having an interface here that gives me selection and ideas, almost like pills, design terms I can drag in like Lego bricks, versus needing to know these terms and just type them or learn them from the examples. There's an opportunity to build a richer prompt builder.

Yep. I think part of the power of these open-ended prompts is that they can be anything; they can accept any input, like glassmorphic, and it knows what to do, it figured it out. It's like: good news, it can be anything. Bad news, it can be anything.

Exactly. And I think the worry as a user with these interfaces is that you're going to tell it something like glassmorphic, it's not going to know what that is, you're going to wait two minutes for it to generate, and then it's not going to be the thing you expected and you have to start all over again.

So that would also be a really interesting design challenge: for the output generated, what are the things the machine actually respected from your prompt, and where did it ignore or struggle? Giving that feedback back to the human in the prompt, maybe with little squiggly lines or with color, showing what it indexed on from the prompt and executed on, and where it maybe failed. If there could be that feedback loop, it could help the human refine their prompt and learn how to interact with the machine, and help the AI figure out what it did well and should keep doing, and what it didn't do well and should get more data to improve.

I think what's cool here is, okay, this seems pretty interactive. It looks like, oh wow, there are even hovers here. So how do we now give feedback on this design to improve it? It looks like there's an edit button, so I guess I can click into these. I wonder if I could give it another prompt.

It looked like when you clicked on one of the sub-elements, you could prompt on a module basis.

Yeah, let's see. Okay, so I'm editing. What if I just took this and said, okay: make the sidebar blue. Run revision. Okay, now we're waiting again. Hopefully this will be faster.

Hopefully it's an incremental change, where we can submit only the delta and not do the whole thing over again in a single shot. Not just for the wait, but also for resources, and to preserve the existing design that we did like and didn't want to turn blue. So let's see how it deals with diffs.
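The "submit only the delta" idea could look like sending the selected element's ID plus the revision prompt, rather than regenerating the whole page. A sketch; the endpoint and shapes are hypothetical, not Polymet's actual API:

```typescript
// Sketch of the incremental-revision idea: send only the selected element
// and the new instruction, so the model patches one module instead of
// regenerating the whole design. Endpoint and shapes are hypothetical.
interface RevisionRequest {
  elementId: string;   // e.g. the sidebar module the user clicked
  instruction: string; // e.g. "make the sidebar blue"
}

async function reviseElement(req: RevisionRequest): Promise<string> {
  const res = await fetch("/api/revise", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  // The server returns replacement markup for just that element; everything
  // else on the page is untouched, which preserves the parts the user
  // already liked and is much faster than a full regeneration.
  const { html } = (await res.json()) as { html: string };
  return html;
}
```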

And for consistency too, because especially when you're prompting to create graphics, one of the challenges is that if you want to change one element, add a hat on a person say, it's hard to keep the rest of it consistent. That's a common challenge now, so if they're able to do this, I think that speaks pretty highly.

Yeah. So for any interface designer or technical team, figuring out the challenge of how to add sub-prompts, or how to make changes only iteratively, that's really the frontier.

It looks like it did it. Did the rest change?

No, I believe the rest is the same here.

Awesome. Well, this is very cool, Polymet, great work.

Another common pattern we're seeing with some of these new AI interfaces is adaptive AI interfaces. Why don't you tell the audience a little bit about what that means?

Yeah. With the adaptive interfaces we see emerging, based on the content of, for example, an email or a document, the interface then dynamically changes, which typically wasn't the case with static software. So here the input is the actual content, and the output of the AI, the LLM, is then the UI to interact back with that content. A really interesting space to emerge, for looking at the design challenges of that.

Yeah, it's interesting to think about things like Microsoft Word, right, where the thing everybody is so familiar with is a billion buttons in the top row, because they're never sure which one you might need, because they don't know the context of how you're editing. With AI, now we don't need to show all the buttons; we can just show you the buttons that are relevant.

The challenges, of course, are around predictability: people love to have their billion buttons in the exact place. So let's have a look at how this is solved and what the challenges are.

Okay, so first up we've got Zuni. Zuni is building a smarter email app for founders, where you can stay on top of your inbox with a little help from Zuni. They've got a demo here, so we're going to go ahead and watch it and then comment.

"What if you could respond to emails as fast as you make decisions? With Zuni, now you can. Zuni makes you faster and more focused by letting you decide, not draft, your responses. Here in Zuni, my tray shows me three important emails it knows I need to act on. Let's start with the one at the top. Looks like I missed a call and may want to reschedule. On the right-hand side I can see responses that Zuni thinks might apply. I'll hit Y to confirm a time, let's say 1 p.m. tomorrow, and hit enter to generate my email. Looks great. Let's send it and move on, staying in control of my inbox and ahead of my day."

Okay. Well, one of the interesting things here, I think, is this interface on the side where it's pulling up the user's email and suggesting specific responses based on the content of that email. It's almost changing what the reaction buttons are.

Exactly. And it raises the question: what's the right level of abstraction for replying to an email in the future? This still makes me process each email, just maybe a little faster, but I still have to go do it. If you go to a higher level of abstraction, can an email inbox of the future just autopilot that for me? If we go all the way down the ladder of abstraction, we'd talk more about AI interfaces that are more autocomplete. This sits right in the middle of the levels of abstraction: we process email, there is no draft yet, but there are predefined prompts, adaptive per email, that I can select. I think a really interesting tweak on that would be: what if the best-guess draft for that email were already sitting in the inbox for me, and then, at a higher level than text editing, I could prompt it to change depending on what I, as the human in the loop, actually feel the answer should be.

Yep. And I think it's interesting that they made these very high level, and then when you click on each one, if it needs more information... Like, if you click dismiss, it'll just go away, no other context is needed. The first one here said confirm a call time, and the UI actually knew that we needed to input what call time to suggest. So it knows the next step based on which response you want to give, and it shows you a custom UI for how to respond, not a list of five text boxes of which maybe one is the one you should fill out.

Mhm. And the little detail here is that because we're already hands-on-keyboard, basically in email mode, processing the inbox and typing anyway, being able to access all these adaptive options by keyboard shortcut, with a single letter, is really on point.

Yeah, you had a great point earlier: if buttons are moving, people value consistency, remembering that button is there and I go here to click it. And that's actually interesting here, where the buttons and the responses are technically changing for every single email, but the keys you're pressing are not. So you can keep your hand right there and know what to expect each time.

Now, one interaction design challenge with these hotkeys is that they don't have a modifier key the way a keyboard shortcut does. Command-C for copy, for example, has a modifier key, but these are just hotkeys: I press Y and it just does it. But what if I think my cursor is focused and inserting text, and I want to reply "yes"? Then my first Y keystroke submits a button, right? So there's always this challenge of being very clear about when an input element is focused and you're typing, versus when typing on the keyboard will just do stuff in your UI.

Yep. Adaptive UIs of the future, there you go.
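The classic fix for this "typing versus hotkey" ambiguity is to ignore single-letter hotkeys whenever a text input has focus. A minimal sketch; the action handlers are hypothetical stand-ins for Zuni's actual behavior:

```typescript
// Sketch of the single-letter hotkey guard: ignore hotkeys whenever the
// user is actually typing into a field, so "y" inserts text instead of
// firing the Yes action.
function isTypingContext(target: EventTarget | null): boolean {
  if (!(target instanceof HTMLElement)) return false;
  return (
    target instanceof HTMLInputElement ||
    target instanceof HTMLTextAreaElement ||
    target.isContentEditable
  );
}

document.addEventListener("keydown", (e) => {
  if (isTypingContext(e.target)) return; // let the keystroke insert text
  if (e.metaKey || e.ctrlKey || e.altKey) return; // leave shortcuts alone
  switch (e.key) {
    case "y":
      confirmSuggestedReply(); // hypothetical handler for the "Yes" action
      break;
    case "d":
      dismissEmail(); // hypothetical handler for "Dismiss"
      break;
  }
});

declare function confirmSuggestedReply(): void;
declare function dismissEmail(): void;
```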

Awesome, thank you Zuni. Very cool. We've seen some voice examples; let's take a look at a video example now. This is Argil, argil.ai, and they have a product that's almost like an AI video studio to create production-quality videos. This actually has a deepfake AI version of me. Incredible. So what we can do here is type our own custom script right into this box and make it say whatever we want, create different blocks where we can change camera angles for each of these different settings, and we can change the body language too. So for this one, where I say "here I am pointing to myself," I have selected the "point to myself" example here.

You have to select it manually?

Yeah, I selected it manually. You could imagine in the future they would auto-detect it, right?

Right, that would be really interesting. It's cool here that as you hover over these, you can see the different samples of what will happen.

Yeah, you can almost imagine how you could highlight certain parts of the script and then, from a dropdown there, choose "suggest," but then also the standard parts of this library, like "point to myself," "I'm crushing it," or "I'm crushed." So there's a lot of interplay with the text interface to the left.

Yep. All right, so let's hit play and watch it.

"Hi, I'm Aaron, and today we're going to make a new design review video all about AI interfaces. Here I am pointing to myself, for this point."

Cool. So the voice was very good; that sounded like me, and I typed a bunch of text to get that.

And what was the training data to create your deepfake?

Yeah, so they just need a few minutes of video of me, or whoever, talking, and then they can process it automatically in their models to create the deepfake.

And you were saying something completely different than this.

That's incredible. So if this is all AI-generated, let's try to change your name here. Let's say "I'm Raphael, and today..." So, processing.

"Hi, I'm Raphael, and today we're going to make a new design review video all about AI interfaces. Here I am pointing to myself, for this point."

Okay, and the other thing you notice here is that the video is very blurry, with this overlay on it. Why is that? It's because the easier, or faster, part of generating this is actually creating the voice; the hard part is that it takes many minutes to process and generate the video with the right lip movement to match the text you've entered. So rather than showing you lips moving out of sync with what you've put in, they first show you a blurry version with the audio so you can get a sense of what it's going to be like. Then you click generate here, and it says 12 minutes right here, which is how long it's gonna take.

So they're trading off, basically, fidelity for immediacy, and putting the human back in the loop. Because if it was just a generate button, we would wait for 12 minutes, figure out that something is not quite right, and then give the machine a new prompt and wait until it comes back. So this is a really clever trick to create this iterative human-machine collaboration interface, and I think blurring the video is a great design approach to do that.
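The fidelity-for-immediacy trade could be modeled as a fast preview pass plus a queued full render. A sketch; the API shapes are hypothetical, not Argil's actual endpoints:

```typescript
// Sketch of the fidelity-for-immediacy trade: a fast pass returns accurate
// audio plus a blurred proxy video for script iteration; the expensive
// lip-synced render is queued only once the user commits. APIs hypothetical.
interface Preview {
  audioUrl: string;        // accurate voice, ready in seconds
  blurredVideoUrl: string; // placeholder visuals, honest about fidelity
}

async function previewScript(script: string): Promise<Preview> {
  const res = await fetch("/api/preview", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ script }),
  });
  return (await res.json()) as Preview;
}

async function queueFullRender(
  script: string,
): Promise<{ jobId: string; etaMinutes: number }> {
  // Only pay the ~12-minute render cost after the human has iterated on
  // the script against the cheap preview.
  const res = await fetch("/api/render", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ script }),
  });
  return (await res.json()) as { jobId: string; etaMinutes: number };
}
```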

Here's a version that I've generated.

"Hello, my name is Aaron, and today I'll be sharing incredible news straight from the YC partners. We're moving YC to four batches per year. This is one of the biggest changes YC has made since it launched in 2005. Here's the thinking behind it. Previously, if you..."

So there you go. You can see it looks like me, it has the different expressions, the camera cuts, the lips match the voice that was added. Pretty incredible.

Yeah. I think trading off fidelity for latency, really putting the human in the loop to iterate on the script very quickly while the full-on generation of the video happens later, that's one thing. The other thing is the opportunity to potentially bring some of the selection UI closer to the input itself, the text. I think there's a lot of room there to grow.

Yeah, very cool. Awesome, thank you Argil.

Awesome. Well, these were all incredible examples. What are some of our takeaways from the interfaces we've seen today?

Yeah. Well, I think when we first started to get this LLM technology, everything was a chat box and people just prompting it. Now, within just a few short months, or one to two years, we see this explosion of AI interfaces and AI components that are really built AI-natively, totally different modalities for interacting with this new technology, with the LLMs, and really just endless opportunity for iteration and for building a new world of software.

Yeah. I think you said it really well at the beginning: these are all verbs. We're creating videos, we have agents going out executing tasks, and so much of it is how you keep the user in the loop and in control while AI does its magic. And we've seen some pretty amazing interfaces for getting that level of control, making sure it's doing the right thing, that leads to incredible output that would have taken days or years.

It almost feels like back in 2010 or so, when touch devices really came on the market and everything had to be reinvented touch-first. We're at one of those moments again, where all of software, all the components that we took for granted, are being reimagined and reshaped by the builders and startups and designers out there right now. The future is going to be incredible.

Yeah. Well, thank you to everybody who submitted. These are pretty groundbreaking UIs that I think are going to continue to evolve over the next decade, and I'm really excited to see where they end up. So thank you, Raphael, for joining me for this episode, and we'll see you on another Design Review.

Thank you so much.

[Music]
