Doing this manually means opening After Effects, tracking down an up-to-date UI template, and keyframing every single text bubble as it pops in. Animating a "typing..." indicator character by character is miserable work.
I got tired of doing it, so I built GetMimic.lol to automate the process.
You pick a platform UI (it currently supports about 35, including iOS Messages, WhatsApp, Twitter, and ChatGPT), type out your script, and it renders an MP4 of the conversation playing out.