GPT-4: Generative Pre-trained Transformer 4
I”Generative Pre-trained Transformer”, known as GPT They are a series of language processing models created by the American Artificial Intelligence laboratory OpenAI.
GPT models grow and learn thanks to artificial intelligence, which is fed with various data, texts and numbers, drawing on a large database of information.
GPT-4 It is the successor of GPT-3.5, the processing model behind the popular ChatGPT tool, with which it shares very similar characteristics, but with some big news. The main one is the significant improvement in GPT-4's ability to understand natural language and generate responses. This is mainly due to the larger size of the GPT-4 model and the large training dataset used.
GPT-4 is therefore able to provide more accurate and relevant answers than its younger brother, and is also more versatile in managing a wide range of domains. It also has improvements in understanding the context - since it accepts much longer texts than ChatGPT (going from a “memory” of 6 pages to 52) - thus arriving at an ability to generate more articulated and complex contents and find information in very large documents.
The latest innovation is the multimodal functionality, which means that it is able to understand information from different modes, such as images.
This ability to process information from different sources allows GPT-4 to interact more effectively and naturally with users and paves the way for many new applications. OpenAI states that “GPT-4 can solve difficult problems with greater precision, thanks to its broader general knowledge and problem solving skills.”
The architecture of GPT-4 is similar to that of its predecessors: it's a transformer, probably just a decoder, and it does next-token-prediction.
If one of the limitations of GPT chat was the length of the input (4,000 tokens), GPT-4 has a context length of 8,192 tokens (there is also a version with limited access of 32,768 tokens).
GPT-4 and multimodality
An interesting use case on the topic of multimodality is the search in documents where there are images. Let's imagine a bot inside a technical documentation that includes instructions in the form of images of procedures or interfaces. With a multimodal approach, images can be managed together with text, making it possible to respond contextually on both modes.
How to access GPT-4
The GPT-4 API is now accessible to all paying customers using the API and promises unprecedented capabilities when it comes to language models, offering a unique opportunity for developers.
Here are the main updates announced by OpenAI:
- General availability of the GPT-4 API.
The extraordinary power of the GPT-4 model has stimulated innovation in the creation of innovative products in different sectors. Access to the API is now open to all developers who are already using the API and who are in good standing with payments. By the end of the month, the API will also be made available to new developers. Subsequently, the speed limits will be gradually increased based on the available computing capacity.
- Expansion of the GPT-3.5 Turbo, DALL-E and Whisper APIs.
Based on the stability and availability of these models for production-scale use, OpenAI is making the GPT-3.5 Turbo, DALL-E and Whisper APIs generally available. In addition, the company is working to securely enable the fine-tuning function for GPT-4 and GPT-3.5 Turbo and expects that this functionality will be available later this year.
- Retiring old models in the completion API.
In order to optimize computing capabilities, the oldest models in the Completion API will be phased out by the beginning of 2024. These will be replaced by new versions, such as ada-002, babbage-002, curie-002, davinci-002.
- Focus on the chat API.
We have moved on from the completion API and chat API. Currently, the chat completion API represents 97% of GPT API usage. This change in approach offers better results, greater flexibility and specificity in activities and interactions. In addition, it helps to reduce the risk of attacks such as the prompt injection process, as the content provided by the user can be structurally separated from the instructions.
- Code Interpreter available to everyone.
This is the “coding ready” version of GPT-4, equipped with three new features. The AI can read files uploaded by the user directly into the browser, with a maximum size of 100 MB. In addition, it can allow the download of files and offers the possibility of running your own Python code, creating “run-time” AI models for small datasets. This system works excellently even for those who are not familiar or have specific technical skills.
Popular use cases
Duolingo allows premium customers to have language conversations with GPT-4. The project is called Role Play and Explain my Answer. Duolingo's GPT-4 course is designed to teach students how to have natural conversations on a wide range of specialized topics. Duolingo introduced these new features in Spanish and French, with the intention of extending them to other languages and adding more functions in the future.
Government of Iceland
The Icelandic government also uses GPT-4. The Icelandic government is working together with tech companies and OpenAI's GPT-4 to maintain and use the country's native language. Now, 40 volunteers supervised by Vilhjálmur Þorsteinsson (CEO of the language technology company Miðeind ehf) are training GPT-4 with learning reinforced by human feedback (RLHF).
GPT-4 learns from corrections and consequently improves its future responses. Attempts to refine a GPT-3 model with 300,000 questions in Icelandic had failed before the RLHF, due to the time-consuming and data-intensive process.
Morgan Stanley, a financial services company, employs an internal GPT-4-enabled chatbot that can sift through Morgan Stanley's huge PDF format to find solutions to consultants' problems. With the GPT-3 and now GPT-4 functions, the company has begun to study the best way to use its intellectual capital. Morgan Stanley has an internal library of unique content, called intellectual capital, that has been used to train the chatbot using the GPT-4. Around 200 employees use the system regularly, and their suggestions help to improve it even more. The company is evaluating an additional OpenAI technology that has the potential to improve the insights of advisors' notes and facilitate subsequent conversations with customers.
Be My Eyes
The Danish company Be My Eyes uses a GPT-4 'virtual volunteer' within its software to help the visually impaired and the blind in their daily activities.
Like the rest of the financial industry, Stripe's support team used GPT-3 to improve the quality of customer service. It now uses the GPT-4 functions to analyze websites and understand how companies use the platform, so that it can adapt the assistance to their needs. It can work as a virtual assistant for developers, understanding their requests, analyzing technical material, summarizing solutions and providing summaries of websites. Using GPT-4, Stripe can monitor community forums like Discord for signs of criminal activity and remove them as quickly as possible.
Khan Academy, a company that provides online educational resources, has started using GPT-4 features to power an artificially intelligent assistant called Khanmigo. In 2022, they started testing the functionality of the GPT-4; in 2023, the Khanmigo pilot program will be available to a select few. Those interested in participating in the program can put themselves on the waiting list.
Initial assessments suggest that GPT-4 could help students learn specific computer programming topics, while gaining a broader appreciation for the relevance of their study. Additionally, Khan Academy is experimenting with different ways in which teachers could use the new features of GPT-4 in the curriculum development process.
Indigo.ai and GPT-4
indigo.ai has a modular platform already prepared for generative models such as GPT-4. The tests of the model have already started and the first results clearly demonstrate how the chatbots developed with the help of this new technology can offer a more natural communication, precise and contextualized between companies and people.
The advantages can be summarized in:
- Improving natural language comprehension, reducing the risk of misunderstanding and improving the user experience. indigo.ai has always focused heavily on quality in understanding requests and minimizing user frustration.
- More answers Accurate and pertinent: thanks to the greater consistency and contextualization offered by GPT-4, i Chatbot of indigo.ai can provide more precise and useful answers to users. The indigo.ai platform is designed to “control” these generative models, and the more powerful they are, the more there is a need to channel them securely.
- IDs and not only answers: thanks to the greater “memory”, you can leave the model free to go and find the best answer in free text documents, such as word or pdf. This allows you to adjust through the platform the level of control you want to have over the answers given, ranging from handwritten and approved texts to the pure free and creative generation.
- Multimodal interactions: with the ability to process and search for information from different modes, like visual ones, indigo.ai chatbots will be able to offer more engaging and complete interactions.