GPTs: Chat is all you need to build AI apps

GPTs: Chat is all you need to build AI apps

·

18 min read

Last week, Open AI (the company behind ChatGPT) announced a new capability called GPTs (we will sometimes refer to these as apps as well). This allows anyone to build extremely powerful AI empowered conversational experiences, in minutes, with no programming or AI knowledge needed. This article describes everything you need to build your own GPTs.

Background

This is particularly exciting to me, because it realizes a vision that has been dear to my heart, but evasive, for a long time - allowing anybody to create custom software experiences they want, without requiring months of complex developer work and thousands of dollars. GPTs are the realization of that vision, and much more.

No code app platforms

The vision for meon - a no code web app platform I built around 2014 (no longer active):

No code apps have proliferated in recent years. I asked ChatGPT to summarize popular no-code platforms today (it searched the web) and it gave me this:

There are many no-code platforms today, each catering to different needs and skill levels. For complete beginners, platforms like Softr and Glide provide an accessible entry point. Those seeking a balance between power and user-friendliness might opt for Bubble or Draftbit. For advanced users who require detailed control over data and infrastructure, Backendless and Mendix offer more sophisticated capabilities. Business-oriented platforms such as Microsoft Power Apps and Appian excel in integrating business processes and AI into app development. Meanwhile, multi-platform development is facilitated by solutions like FlutterFlow and OutSystems, making them ideal for creating apps across different devices and operating systems.

November 30 2022: ChatGPT

The world changed irreversibly on November 30th 2022 when OpenAI released ChatGPT. This made it possible for people to just talk to an entity that understood what they wanted, and did its best to give them what they needed. Over the last year, the whole world grappled to come to terms with this. People got more productive in their personal and work lives using this new digital entity that knew almost everything about everything and was kind and was always cheerfully eager to please every kind of user ask. OpenAI itself continued to improve this to give us an enhanced GPT-4 (which cost a small subscription upgrade to use), an ability to build programs that talked to this entity, tooks steps to make it safer. AI model competitors scrambled to create worthy alternatives, getting close in some areas and not so close in others. Some others invested in and partnered with OpenAI to better integrate with these new capabilities.

To learn more about this, see my recent posts How to pick a Chat Model (for users) and How to pick a Chat Model API (for developers).

Businesses were all affected by this event as well, and adapted to realign to this new world order, venturing into entirely new experiences or extending their products and services to integrate with these capabilities.

November 6 2023: GPTs

Less than a year later, on November 6th 2023, as everyone struggled to align with this disrupting change, the world changed irreversibly yet again, when OpenAI revealed something they call GPTs in their OpenAI Dev Day event keynote (video). This made it possible for anyone who knew to chat to use a tool from OpenAI called GPT builder to create their own custom flavors of ChatGPT (called just GPTs) for anything they might fancy, optionally uploading documents or specifying web links to use as knowledge for the GPT, and for more advanced use cases, to optionally allow the GPT to connect to external sources of knowledge. This made it possible for many if not most narrow experiences to be built as GPTs, by anyone who knew how to talk to a chat model. The dream of spinning up a specialized agent that can fulfil a specific goal, in minutes, has become reality.

We will talk all about creating and using GPTs in a minute, but OpenAI also announced several other things, notably a feature called Assistants, for developers to build new experiences or integrate their existing products or services with one or more such GPT like instances. They also announced several improvements that users of ChatGPT needed in an enhanced new version, now available to everyone who has the subscription (ChatGPT Plus). We will talk about Assistants and related developer features in a subsequent post.

Creating GPTs

If you have never used ChatGPT, please go here and try it out. You may also read my previous post How to pick a Chat Model for useful tips on how to use it.

Note: if you want to build your own GPTs, use others' GPTs, or use the latest powerful model for ChatGPT which is far superior to the free ones, you would need the ChatGPT Plus subscription that you can get by upgrading (for a fee) from within the ChatGPT app.

You can build several kinds of GPTs. I will walk you through creating a GPT called DriveSmart that will help students to prepare for a US state driving test. This, like most other GPTs that we could build, would otherwise have taken several days or more to build before, and even longer to deploy out into the world, requiring developers with knowledge of coding and sophisticated hosting systems. Creating a GPT like this will just take minutes.

We will start with a simple first cut version of this app just by chatting, and then see how we can further enhance it.

GPT Builder Create

Start by signing in to ChatGPT (best from a desktop or laptop browser, like Chrome or Safari or Edge from a PC or Mac, but might work with slight differences from a mobile device browser as well).

You can get into the GPT Builder by clicking on "Explore" that will show up in the top left area.

On the page that comes up, you can see the GPTs you have created or added to your account, under My GPTs, as well as the different GPTs from OpenAI (and others in the future). Under My GPTs, there is a plus button to create a GPT. Click it.

This launches the GPT Builder. You will see two panes - the left pane lets you create your GPT, while the right pane shows a preview of the GPT that you created. The left pane has two tabs - Create and Configure. The Create tab lets you build your GPT using an entirely conversational experience. The Configure tab lets you configure the GPT. For this first cut, we will just use the Create tab.

We are building a US driving test prep GPT. Simply express what you want clearly, in words:

I want to build an app to help students in the United States learn and practice questions for their theoretical driving test.

The GPT Builder then comes back with a suggested name for the app. You can keep it or change it by talking more back and forth, or specifying a name you would like.

Once that is settled, GPT Builder will use its image creation capabilities to generate a profile picture based on the theme of your GPT. You can again either change it by telling it what you want, or you can upload your own image, or you can accept it. It will then ask further questions about the app. Just answer them according to your preferences, until you have everything you need. Use your own language, but here is a particular instruction I gave:

Your goal is to make the student, whatever their initial level, be able to finish by being fully prepared to ace the test. Use your knowledge of the test syllabus (or find the most recent curriculum using web search by asking the student for the state or curriculum link). Ask the student questions initially to assess the student's current level as well as whether they want to learn or just practice questions. Give them practice questions till they are confident, then give them a full practice test, and tell them where they did right or wrong. Help them in areas they did poorly, and repeat till they ace the test.

GPT Builder asked me if there is anything to emphasize or avoid. Since my app may have a lot of teen drivers, I asked it to avoid presenting too disturbing or gruesome facts:

Many of the students will be teens, so try to refrain from distressing information presented in disturbing ways - for example, gruesome details of mortalities due to distracted driving. Instead present the information in a less disturbing way.

Then GPT Builder asked me how to handle situations where it needs clarity - to make a guess, or to ask follow up questions. Remember to give as much information as possible to explain what to do and also why. It will make the model perform better. I responded with the following:

Always ask follow up questions if something is unclear. Making potentially wrong guesses can cause the conversation to go down a path that can be frustrating.

Then, it asked me what tone to use - formal, friendly, humorous, etc. Here's what I said:

Keep a friendly tone, add occasional humor, but use a precise language when explaining things. Also, guide the student more when they are struggling, but challenge them with harder questions when they are doing well.

At this point, GPT Builder said it was all ready, and I could try out the preview and see how it goes - I tried it out, it looked good (see the right side preview session below):

Once you are ready, you can use the green Save button on top right to save the GPT you just built. You have these options:

  • save it only for yourself - choose this if you want to use this GPT only for your own personal use.

  • save it for anyone who has the link - choose this if you want to share this with others (currently, it looks like the others you share it with need to have a Plus account to use it as well, but this might change in the future).

  • make it public - this will get it into the OpenAI store when it launches, and anyone can find and use your GPT.

Make your selection and click Confirm.

This will then publish your GPT, and take you to ChatGPT where your GPT will be open in a new session.

Note how it also automatically added some examples for the user to get started on the right - these are called Conversation starters.

Tip: At this time, the preview session you had in the GPT Builder is lost. If you want to save it, copy the relevant text for yourself before saving or publishing the GPT from the GPT Builder.

Notice the new GPT DriveSmart appears in the list of GPTs on the left - you can click on it any time to start a new session with that GPT:

If you want to share the link to the GPT with someone else, use the dropdown on the top of the GPT chat panel and copy the link to your clipboard, and share it with others.

And that's it! You have a fully functioning GPT created with no code, and just having a conversation. You can build a whole bunch of apps with just this approach in a few minutes, something that would otherwise have taken a lot more time and effort to build before.

GPT Builder Config

When you created DriveSmart, everything you said to the GPT Builder from the conversation in the Create tab was captured in the Config tab of GPT Builder as the configuration for your GPT. We will now understand the elements of a GPT's config by looking at the config of the DriveSmart app. Most aspects of your GPT can be specified or updated from either the Create or Config tab dynamically, and the config will update to reflect it.

To edit a GPT you created, you can open the GPT in the main ChatGPT window, and select "Edit GPT" to take you back to GPT Builder for that GPT.

Click on the Config tab in the left pane. These are the fields you see there:

The profile picture, the name, the description and the conversation starters show up in every new chat window with your GPT, as indicated by the red arrows above.

You can click on the profile picture to change it, either with your own, or with another one created by Dall-E (for finer control, switch back to the Create tab and converse with GPT Builder to refine the profile picture).

You can change the name, description and edit or remove or add conversation starters and they will change in the chat window as well.

Instructions are the instructions that define your GPT that GPT Builder generated based on your conversation in Create mode. You can review these in the config, and if needed change them directly here. Or you can change them by giving further instructions in the create tab.

Tip: Check your instructions from the config tab to see if they match what you want each time before publishing your app. If you give a lot of complex details in your conversation with GPT Builder, it may lose some key earlier aspects of the app that you asked for.

Knowledge refers to some knowledge you can give your app by uploading some files. For example, in the DriveSmart app, let us say I want to update the app to only support driving tests for Washington state driving test. From the config tab, I can click on the Upload files button, and select the Washington State driver guide pdf. This file is now used as knowledge for the app. I may want to change the instructions, name and description, maybe even profile picture to reflect this. Here is another app I built that is specifically tuned for driving prep in WA state. This app uses only content from the given guide to help students prepare for the driving test.

Note: Please always be careful about the documents you download and upload. Check that you have the right to download the document from wherever you find them. Also note that once uploaded, OpenAI might see the documents you upload under certain circumstances. There are many complex nuances to this, and you should consult a legal professional for advice.

Note: GPT builder refers to any documents in its scope as knowledge. This includes the GPT level knowledge documents that you upload for your GPT in the config tab, which is available to . It also includes user specific documents that the user may upload while interacting with the GPT. The GPT chat window will often show a status like "Checking knowledge" - this could refer to the global or user specific knowledge

Capabilities are different, well, capabilities that your GPT can choose to include and make available when your users interact with it. Currently there are three choices (you can pick none, some or all of them):

  • Web Browsing is the ability for the GPT to search the web when it needs some additional information when interacting with your users. ChatGPT (and custom GPTs you build) use a model that is trained uptil some fixed date in the past (currently, April 2023). If your app needs information that is more current than that, you would need to turn this on. For example, if you build a movie recommendation app that needs to find what is running in local theaters or streaming on Netflix, you should turn this on.

  • DALL-E Image Generation is the ability for the user of your app to have images generated during the interaction. For example, if your GPT is a blog creating app, and wants users to be able to generate cover images for the blog, you would enable this capability.

  • Code Interpreter is a capability that allows your GPT to run code - for example, if you have a data analysis app where your user can upload some files and have it analyze the data and even generate charts and graphs based on it, you would turn this on. Note: You do not have to see or write any code for this - all that happens behind the scenes for you - it is just the capability of GPT to run some code and generate some results.

Your GPT will decide when it needs to use these capabilities and choose them automatically.

Tip: Previously ChatGPT Plus users had an option to pick these capabilities at the beginning of every conversation from a dropdown on top. With the recent update, this has been removed. However, they can get some of this by using some of the GPTs made available in the Explore tab:

  • Dall-E focuses on Image generation

  • Data Analysis focuses on the Code Interpreter capability (running python code, analyzing data, generating visualizations)

  • ChatGPT Classic - this disables all capabilities, notably the ability to disable web search, which many users missed with the new update.

Actions in the config tab are a powerful and more advanced feature, which allows your GPT to call out to other programs (via interfaces called APIs) from within a GPT session. For example:

  • a weather forecasting GPT app can call out to a weather app to get weather for a city that the user asks for.

  • you can build GPTs that integrate with several apps that you might use like Google Drive, Gmail, Slack, Notion and many more via API aggregators like Zapier - see Zapier actions. See here for instructions on how to build GPTs with Zapier actions.

Your GPT can both fetch data from external apps and send out data and perform tasks within those external apps.

Note: actions are a powerful feature that enables you to build much more powerful GPTs without technically needing to write a single line of code, but this nevertheless requires some degree of technical sophistication to use properly. I will create a separate post about that for advanced technical users in the future. For now, just know that this is an option to explore.

Here is a link to the DriveSmart app that I just created. If you have ChatGPT Plus, you should be able to use this to try it out.

Tip: If you update an app from the Create tab, it may overwrite some things in your config. For example, after some changes in instruction in the Create tab, the GPT builder decided to update the profile picture - I liked the original one, but could not get back to it, no matter how much I tried instructing it on the Create tab. It also regenerates conversation starters at times. To avoid such things from happening, when you start a GPT Builder create or update session, you can instruct it not to ever change the things you want unless you specifically ask for it. For example: "Do not change the profile picture, name, description or conversation starters unless I specifically tell you to. Ok?" . And continue only once you have done that. This prevented GPT Builder from automatically changing those things.

What can you build?

In this new world of GPTs, we can think of levels of apps (from simplest to more advanced) that can be built:

  1. Apps that use ChatGPT's own capabilities and knowledge: several apps, like the original DriveSmart app above, can be built using just the knowledge within GPT's "brain". Pick an area of interest to you and find use cases that would benefit from having AI apps like these. There are also several options based on the capabilities available in the GPT builder. For example, if you are into food blogging, you can build an app that helps you create blog content using, perhaps the web searching capability to go out and fetch new restaurant menus and recipes, etc. You could also create a separate app simply to build food images for your blog, by enabling just the Dall-E Image Generation capability. You may also want to analyze some local restaurant data and share the insights, for which you may have an app with the Code Interpreter capability enabled. Stop now, and think of your own interests, and come up with ideas that you could build with just the simple GPT Builder

  2. Apps that use app global knowledge: These are apps that could use global knowledge that you, the builder uploads while building your GPT.

  3. Apps that use user uploaded knowledge: These are apps where user can upload their own documents as knowledge. These are a bit more complex because you have to provide additional instructions to ensure that only the right documents are uploaded, that your GPT informs your user how and what to upload, and ensure that the GPT understands how to use that knowledge

  4. Apps that use available actions: We can also build GPTs that interact with a variety of services (as of today, Zapier provides connections to over 6000 external apps). For example, you might build an app that searches your email platform for mails related to your online expenses, and provide insights on financial planning based on that.

    Warning: Developers only beyond this point

  5. Apps that use custom actions: You could build your own web service that can then be called as an action from within your GPT builder. For example, if you have a database containing some data which you want to access from your GPT, you could expose a web service that provides certain interactions with the database, and have your GPT call it over the web.

  6. Assistants based products and services: Assistants open up a huge number of possibilities for developers to build custom apps that can programmatically call one or more custom GPT like entities called assistants. Without getting into too much detail, an assistant is something you can create effectively equivalent to creating a GPT, but you can create it from the OpenAI playground or even programmatically. Then, your app (not a GPT, but your product or service that wants to use GPT like experiences) can programmatically call these assistants that you created. This does not require a ChatGPT subscription, and notably the pricing model is based on usage.

I will talk in much mored detail about action based GPTs and Assistants based programs in future posts for more advanced technical users and developers.

Why not just use ChatGPT?

ChatGPT is an AI model that has a lot of clutter in its "brain". Imagine a person who has read a lot and you ask them something broad, they may get confused and answer from some context that you did not mean. If you ask them specifically in a context, they will provide a much better response. ChatGPT is like that - it has a lot of knowledge (much more than any human) and so it will help to provide it instructions on "which part of its brain to use" in a given context. GPTs provide a way to narrow down the focus of the experience for a specific use case. Your users could still use ChatGPT to achieve some of these, but they would need to provide the same set of instructions each time to achieve the same use case. GPTs provide a way for users to directly jump into getting their task done.

Summary

OpenAI's new GPTs (aka apps) feature allows anyone to build powerful AI-powered conversational experiences without programming or AI knowledge. This article walked through the process of creating a GPT called DriveSmart, a driving test preparation app, using GPT Builder and its various features. GPTs can be built using ChatGPT's capabilities, global knowledge, user-uploaded knowledge, available actions, custom actions, and Assistants-based products and services. GPTs are more focused and context-specific compared to ChatGPT, providing a more tailored experience for users.