Artificial Reality

By Sherveen Mashayekhi

24 Apr

Building AI muscle with voice notes

I figure I'll start doing daily, or at least semi-regular, voice notes about the AI muscle that I'm building. I think of that as what I do every single day: using AI almost constantly throughout the workday, trying out all the latest tools, and soon launching an actual content and course brand called AI Muscle.

But just to kick that journey off: yesterday was a further exploration of workflows and optimizations around using LLMs, like ChatGPT or Gemini, to come up with great prompts and briefs to then use in image generation models and create really, really useful artifacts.

So as an example, I'll go to ChatGPT nowadays and say: hey, here's the sort of flowchart I want to build, the style I want it to be in, the fancy ways I want it to look; can you create a creative brief? And whether I take it to 4o's image generation model or bring it to Midjourney, I tell it to adjust the brief based on which image generation model I plan to use. It's a really quick way to get something really accurate, then go to the image generation model and get something really spectacular that would otherwise have taken me iterations, or taken me time to write the proper prompt. And then I can just chunk that great brief or prompt into multiple image generation models.
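
If you wanted to script that brief-generation step instead of doing it in the ChatGPT UI, here's a minimal sketch using the OpenAI Node SDK. The model name, system prompt, and the buildImageBrief helper are my illustration of the idea, not an exact recipe.

```ts
// Minimal sketch: turn a rough idea into a prompt tuned for a specific
// image model. Assumes the official `openai` Node SDK and an
// OPENAI_API_KEY in the environment; model name and system prompt are
// illustrative.
import OpenAI from "openai";

const client = new OpenAI();

type ImageModel = "gpt-4o" | "midjourney";

async function buildImageBrief(idea: string, target: ImageModel): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content:
          `You write creative briefs for image generation. Target model: ${target}. ` +
          `For midjourney, return one compact prompt line with parameter flags; ` +
          `for gpt-4o, return a detailed natural-language brief.`,
      },
      { role: "user", content: idea },
    ],
  });
  return response.choices[0].message.content ?? "";
}

// One idea in, a model-specific brief out, ready to paste into either tool.
const idea = "A flowchart of my content pipeline, drawn as a 1950s subway map";
buildImageBrief(idea, "midjourney").then(console.log);
```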

There's some stuff I'm doing on the video side too now, related to some YouTube content. So today starts from this notion of using the LLM, then figuring out how to set up browser shortcuts and keyboard shortcuts that make it really easy to go from the LLM to the image gen models, and then potentially to these video models, and to make really great use of them. Especially as someone who's oversubscribed to way too many of these tools, I always want to be trying all of them, always testing their boundaries. So that's the journey today.
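
One way to wire up a shortcut like that is a Chrome extension command. Here's a minimal sketch, assuming a Manifest V3 extension; the key binding and the Midjourney URL are illustrative choices, not my actual setup.

```ts
// Background service worker for a Manifest V3 extension with a
// "commands" entry in manifest.json like:
//   "commands": { "open-midjourney": {
//       "suggested_key": { "default": "Alt+M" },
//       "description": "Jump from the LLM tab to Midjourney" } }
// The binding and URL are illustrative.

chrome.commands.onCommand.addListener((command) => {
  if (command === "open-midjourney") {
    // Open the Midjourney create page; from here a content script could
    // read the clipboard and drop the copied brief into the prompt box.
    chrome.tabs.create({ url: "https://www.midjourney.com/imagine" });
  }
});
```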

So one of the things I've done, as an example, is actually have a Midjourney extension that I built with Windsurf. I used the AI to code up a Chrome extension that lets me automatically insert different parameters that I use really, really often in Midjourney. This can be things like whether or not I use my personalization code, or the chaos parameter, which says how different the four images should be from each other when you generate four at a time. I might set that to 0 sometimes, 50 other times, or 100 other times. So I have a button I can just press, rather than having to open a submenu or type something out.
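
For a sense of what that looks like, here's a minimal content-script sketch of the same idea. The prompt-box selector is an assumption, since Midjourney's markup changes; --p (personalization) and --chaos are real Midjourney prompt parameters.

```ts
// Content script sketch for midjourney.com: a floating bar of buttons
// that append frequently used parameters to the prompt box. The
// selector for the prompt box is an assumption; adjust to the live DOM.

const PRESETS: { label: string; flag: string }[] = [
  { label: "P", flag: "--p" },         // toggle personalization
  { label: "C0", flag: "--chaos 0" },  // four very similar images
  { label: "C50", flag: "--chaos 50" },
  { label: "C100", flag: "--chaos 100" },
];

function appendFlag(flag: string): void {
  const box = document.querySelector<HTMLTextAreaElement>("textarea");
  if (!box) return;
  box.value = `${box.value.trim()} ${flag}`.trim();
  // Fire an input event so Midjourney's own listeners see the change.
  box.dispatchEvent(new Event("input", { bubbles: true }));
}

const bar = document.createElement("div");
bar.style.cssText =
  "position:fixed;bottom:12px;right:12px;z-index:9999;display:flex;gap:4px;";
for (const { label, flag } of PRESETS) {
  const btn = document.createElement("button");
  btn.textContent = label;
  btn.onclick = () => appendFlag(flag);
  bar.appendChild(btn);
}
document.body.appendChild(bar);
```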

So now today, I'm trying to figure out how to build further on that extension concept, but for ChatGPT and then for those image generation models, to get, again, a lot of very quick workflow optimization out of the way.

And then last night was an exploration of Gemini 2.5 Pro's deep research, which a lot of people on X have been saying is as good as OpenAI's deep research, or better. I don't think so. I think it's interesting in some ways. Just like the version that wasn't based on 2.5 Pro, it sometimes does a lot more reading of the web than I think OpenAI's deep research product does right now. But from an analysis perspective, an insights perspective, all the edge of an LLM doing synthesis, I still actually think deep research in ChatGPT is far better.

So that's the note for today. And again, I've been doing this sort of very deep exploration every single day, as a mega superpower user, since GPT-3.5 hit us. So this will be a way to record it for myself, but I'll probably also press the share button on it moving forward.
