Hello there 👋. I wanted to write about my experience building an end to end product prototype using AI and Replit in 2 days, and winning the Craft Ventures hackathon.
I am passionate about democratising design AI for everyday people. I have been tinkering with a few ideas to this end on the side for a while now, and decided to participate in the Craft Ventures Hackathon last week. I got great feedback on this prototype. I ended up winning the hackathon in the non-technical track. I started off with zero followers on Twitter, but thanks to Amjad’s shoutout, the news spread. I was super thrilled to see the Twitter community respond with so much love.
I got a lot of questions about how I built it on Replit whilst having AI write 100% of the code. I wanted to share a few top of mind takeaways.
Here is a TLDR-
Build technical intuition. Your role in using AI to code is that of a verifier. You cannot verify something you do not understand.
If your goal is rapid product development, Replit has the best ROI on build velocity for the time investment.
Develop a clear vision for what your north star is. Break them down into atomic units of verifiable end states. Prompt AI to help you develop each unit, and continue the dialogue until you have reached that end state. Assemble these units together for a fully functional product.
Before I dive into each of those points, let me give you a snapshot of what I built and why. At the hackathon, I built a product called DocuTok. It takes a Word/Google document as an input and transforms it into a multimodal AI generated content feed. The content feed is generated from the POV of a personality chosen by the user. I had worked on a product called Designer at Microsoft, starting within PowerPoint and spinning it off into an independent application. Death by PowerPoint was a phenomenon I first learned about when I was working there.
I saw first hand how an average user struggled to create engaging content with tools like PowerPoint and Word. They typically have an excellent vision in their head, but what gets translated into pixels ends up looking ugly, boring and fails to engage the audience. The tools are to be blamed. They have not evolved to meet emerging forms of storytelling and communication. The gazillion steps you have to climb to translate your ideas into reality makes it inaccessible to everyday people; a teacher at a school teaching GenZ kids, a PM at a large company, a two person manufacturing consulting firm etc. I wanted to fix that with DocuTok. It is very much a prototype at this point, but I wanted to anchor my hacking to a clear customer need. Here is a tweet version of what is happening.
Here is the demo video I submitted for the hackathon. (The video says DocTok; and then after recording the video I realized people might think it has something to do with doctors. I changed it to DocuTok the next day.) 🥲 .
Build technical intuition. Your role in using AI to code is that of a verifier. You cannot verify something you do not understand.
Huh? What does technical intuition mean? I thought AI can write all the code? Ok, let’s back up a little bit.
Let me walk you through my background. I am a product manager, currently @ Waymo. In the past, I worked @ Microsoft and Snap building assistive design products. I have an undergrad degree in CS and immediately went to grad school for Engineering Management. I have been a product manager since. I’ve a good understanding of programming concepts, system design and strong technical intuition. But, I’ve not coded in years, since undergrad. I have tinkered on the side now and then. Nothing complex, but it involves a ton of reusing of publicly available code, drowning youtube tutorials and still having an exception staring at me. I cannot code an application on my own, all alone today.
Being technical is a spectrum. Those who are absolutely non-technical will reject me from their camp for knowing too much, programmers who code for a living will reject me for knowing too little. I am somewhere in between. I feel like Mrs. Figg from Harry Potter; a squib that understands the magical world but can’t cast a spell. With LLMs, I have an elder wand that helps me escape death by runtime errors. (If you do not understand Harry Potter references, please close this tab and go learn. Kthanksbai.)
In a nutshell, invest in honing your technical acumen in system design (this will come with practice, as you continue to build). Learn how web applications should be built; What frameworks should you use? How can you string results from different APIs together? How should a function be constructed, what are the right arguments to pass? What are the failure modes? You get the gist. Worry not, you can ask AI to help with building that understanding. I usually ask GPT and Ghostwriter to add a comment to each line explaining what they are doing. Then, if you find weird things - call it out and have it make changes for you.
Think of it this way - you are the CEO of a company and you are not going to be 100% skilled in every job that needs to be done. Coding is one of them for which you are recruiting someone smarter than you (the AI). You have to be able to articulate what you want of them, how they need to deliver it to you, and whether the output is acceptable or not. You cannot make that call if you have zero clue about what that end state is going to be, and an understanding of the means to achieve that end.
Use Replit. It has the best ROI on build velocity for the time investment.
Replit is a browser based IDE that allows you to code, run and deploy programs in various languages from your browser, without the need for any local installation. I have tried coding locally. I wasted a lot of time simply trying to get all the right packages installed, and still swimming in errors with no end in sight. This was my constant state of mind, then.
Replit has instant set up - no running around in circles trying to set up the right environment and installing packages. Import the right libraries, and hit “Run”, the packages are drop-shipped to you immediately. I like how the product is elegantly designed. Secrets for env variables, ghostwriter to help code, integrated webviews and console to test your code. I like that you can view other repls, fork it to play around if you are inspired. Alternatively, I have a ton of private repls because I didn’t think what I was building was good enough to share yet. The hackathon has been a great confidence boost and a lesson for me to share more =). You can also boost your repl, allocating more computational resources. They have deployments, databases (haven’t tried this, I am using firebase), git and everything in between. I guess they are like the everything store for software development. Replit has its gaps and it has died on me many times. But, I have been using Replit+AI for a year now for a few projects, it is way more accessible than the alternatives.
Break your application into atomic units with clear end states that can be validated.
You could use AI to find inspiration, and build on the fly. But if you have an idea already in your head, develop a clear vision for what your north star is. Break them down into atomic units of verifiable end states. Prompt AI to help you develop each unit, and continue the dialogue until you have reached that end state. Assemble these units together for a fully functional product.
Eg) Write a function in python to take text as an input, and get a summary from GPT4. This is simple enough for you to iterate back and forth on. The end states to verify are:- Is the function taking text as input? Is the API connection working? If not, ask chatgpt to add error handling to interpret API callback errors. If we get an output, does it meet the expectations? If not, do we have to fine tune the prompt?
In this case, the text input is hardcoded in the python code. So we just have to verify if the E2E API connection is working as intended, and the output is satisfactory. The code generated here does not use GPT4, so we will have to tweak that.
As a next step, we can prompt it to tweak the code to get text input from the user directly.
When we have a successful E2E connection + a way to get user input in python and display the result in the console, add more layers like an HTML page to take the text input, and display the text summary in a web page. This involves changes to what you pass as input to text summarization function in the backend, and text input handling module in the HTML pages. You will need to write prompts to make these edits. You could add all of the requirements in one single prompt. The issue is the code that it generates may have errors, and you might exceed token limit if you want to paste the long code snippet + your questions back and forth for debugging. Splitting into smaller units makes it easy for you to manage. I also paste any errors I get into chatgpt and help it debug for me.
With every atomic unit that works the way you want, DO NOT FORGET TO GIT COMMIT. Replit has git, and it works okay. You’re probably doing this already, but I kept forgetting. You definitely don’t want to copy paste a bunch of stuff from chatgpt/ghostwriter, lose track of the changes you made, shift away from your success state without a path to bounce back to a version that works.
These are some of my top of mind thoughts. If you have any questions/comments/feedback let me know. Happy to answer. If you are working in this space, please come say hi. I would love to chat.
Fin. ^_^
Note: This post was published in a different hosting platform. I recently migrated to substack, and lost comments from the article. :(
Super Priya. Lots of great points here. LearnCan is building something similar in education - multi modal based on student challenge levels and learning style (auditory, visual, kinesthetic etc).
Would love to see a video of you breaking down your approach using Replit using a worked example of a product.