"Generative Pre-trained Transformers," or GPTs, are a series of language models developed by the U.S. artificial intelligence laboratory OpenAI.
GPT models are trained on large volumes of data, primarily text, from which they learn the statistical patterns of language that they draw on when generating responses.
GPT-4 is the successor to GPT-3.5, the model behind the popular ChatGPT tool. The two share much of their design, but GPT-4 brings several major improvements. The main one is a significant gain in natural language understanding and response generation, due mainly to GPT-4's larger model size and larger training dataset.
GPT-4 is thus able to provide more accurate and relevant responses than its predecessor, and it is more versatile in handling a wide range of domains. It also improves on context understanding: since it accepts much longer inputs than ChatGPT (going from a "memory" of roughly 6 pages to 52), it can generate more articulate and complex content and unearth information in very large documents.
The latest innovation is multimodal functionality: GPT-4 can understand inputs in different modalities, such as images, in addition to text.
This ability to process information from different sources allows GPT-4 to interact more effectively and naturally with users and paves the way for many new applications. OpenAI states that "GPT-4 can solve difficult problems with greater accuracy due to its broader general knowledge and problem solving capabilities."
The architecture of GPT-4 is similar to that of its predecessors: it is a transformer, probably decoder-only, that performs next-token prediction.
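To make the next-token-prediction objective concrete, here is a minimal sketch. The "model" below is a toy bigram frequency table rather than a transformer, but the generation loop has the same shape GPT models use at inference time: at each step the model scores every candidate next token given the context, one token is chosen (here, greedily), appended, and the loop repeats.

```python
from collections import defaultdict

def train_bigram(corpus_tokens):
    # Count how often each token follows each other token.
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_new_tokens=5):
    out = [start]
    for _ in range(max_new_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break
        # Greedy decoding: append the highest-scoring next token.
        out.append(max(followers, key=followers.get))
    return out

corpus = "the cat sat on the mat".split()
model = train_bigram(corpus)
print(generate(model, "the"))  # → ['the', 'cat', 'sat', 'on', 'the', 'cat']
```

A real GPT replaces the frequency table with a neural network that conditions on the whole context window, and usually samples from the predicted distribution instead of always taking the top token.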
If one of the limitations of ChatGPT was the input length (4,000 tokens), GPT-4 has a context length of 8,192 tokens (there is also a limited-access version with 32,768 tokens).
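In practice, applications must keep the prompt plus the expected reply under the context length. The sketch below illustrates the bookkeeping with a naive whitespace "tokenizer" standing in for the real BPE tokenizer (e.g. OpenAI's tiktoken library); real token counts differ, so the budget should be treated as approximate, with headroom left for the model's answer.

```python
CONTEXT_LENGTH = 8192   # GPT-4 base model context window, in tokens
REPLY_BUDGET = 1024     # tokens reserved for the generated answer

def rough_token_count(text: str) -> int:
    # Crude stand-in: one whitespace-separated word ~ one token.
    return len(text.split())

def truncate_to_budget(text: str, budget: int) -> str:
    # Keep only the first `budget` "tokens" of the text.
    words = text.split()
    return " ".join(words[:budget])

prompt = "word " * 10000  # a document far larger than the window
budget = CONTEXT_LENGTH - REPLY_BUDGET
if rough_token_count(prompt) > budget:
    prompt = truncate_to_budget(prompt, budget)
print(rough_token_count(prompt))  # 7168
```

Production systems typically chunk long documents and process or retrieve the chunks rather than simply truncating, but the budget arithmetic is the same.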
An interesting use case for multimodality is searching documents that contain images. Imagine a bot embedded in technical documentation that includes image-based instructions for procedures or interfaces. With a multimodal approach, images can be handled along with text, making it possible to respond contextually using both modalities.
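As an illustration of that scenario, the sketch below builds a single request mixing a user question with a screenshot from a manual. The payload shape follows OpenAI's chat-completions format for image input at the time of writing; the model name and image URL are placeholders, and no network call is made here.

```python
import json

def build_multimodal_message(question: str, image_url: str) -> dict:
    # One user message carrying both a text part and an image part.
    return {
        "model": "gpt-4-vision-preview",  # placeholder model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_multimodal_message(
    "Which button in this screenshot opens the export dialog?",
    "https://example.com/manual/page-12.png",
)
print(json.dumps(payload, indent=2))
```

The model can then answer with reference to both the question and the image, which is what makes search over image-heavy documentation feasible.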
The GPT-4 API is now accessible to all paying API customers and promises unprecedented capabilities for language models, providing a unique opportunity for developers.
Below are the main use cases communicated by OpenAI:
Duolingo allows premium customers to have language conversations with GPT-4 through two new features, Role Play and Explain My Answer. Duolingo's GPT-4-powered course is designed to teach learners how to have natural conversations on a wide range of specialized topics. Duolingo has introduced these features in Spanish and French, with plans to extend them to other languages and add more capabilities in the future.
The Icelandic government is also using GPT-4. It is working together with tech companies and OpenAI to help preserve and promote the country's native language. Forty volunteers supervised by Vilhjálmur Þorsteinsson (CEO of the language technology company Miðeind ehf) are now training GPT-4 with reinforcement learning from human feedback (RLHF).
GPT-4 learns from the volunteers' corrections and consequently improves its future responses. Before RLHF, attempts to fine-tune a GPT-3 model with 300,000 Icelandic-language questions had failed because the process was too time-consuming and data-intensive.
Morgan Stanley, a financial services firm, employs an internal GPT-4-powered chatbot that can sift through the firm's huge library of PDF documents to find answers to advisors' questions. With GPT-3 and now GPT-4, the company has begun to explore how best to use its intellectual capital: an internal library of unique content, which was used to train the chatbot. About 200 employees use the system regularly, and their suggestions help to further improve it. The firm is evaluating additional OpenAI technology that has the potential to improve insights from advisors' notes and facilitate follow-up conversations with clients.
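Morgan Stanley's actual pipeline is not public, but a common pattern for a "chatbot over an internal document library" is retrieval-augmented generation: find the most relevant passages first, then place them in the model's prompt. The sketch below uses naive word-overlap scoring for relevance; production systems typically use vector embeddings instead, and the library contents here are invented examples.

```python
import re

def tokenize(text: str) -> set:
    # Lowercase word set, ignoring punctuation.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, passage: str) -> int:
    # Relevance = number of shared words (crude stand-in for embeddings).
    return len(tokenize(query) & tokenize(passage))

def top_passage(query: str, passages: list) -> str:
    return max(passages, key=lambda p: score(query, p))

library = [
    "Retirement accounts: contribution limits and rollover rules.",
    "Mortgage products: fixed and adjustable rate options.",
    "Equity research: quarterly outlook for technology stocks.",
]
question = "What are the rollover rules for retirement accounts?"
context = top_passage(question, library)
# The retrieved passage is then prepended to the model's prompt:
prompt = f"Answer using this passage:\n{context}\n\nQuestion: {question}"
print(context)
```

Keeping retrieval separate from generation lets the chatbot ground its answers in the firm's own documents rather than the model's general training data.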
The Danish company Be My Eyes uses a GPT-4 "virtual volunteer" within its software to help the visually impaired and blind in their daily activities.
Like others in the financial industry, Stripe's support team used GPT-3 to improve the quality of service it provides to customers. It now uses GPT-4 to analyze websites and understand how companies use the platform so that it can tailor support to their needs. GPT-4 can act as a virtual assistant to developers, understanding their requests, analyzing technical material, summarizing solutions, and providing summaries of websites. Stripe also uses GPT-4 to monitor community forums such as Discord for signs of criminal activity so that it can be removed as quickly as possible.
Khan Academy, a company that provides online educational resources, has begun using GPT-4 capabilities to power an artificially intelligent assistant called Khanmigo. In 2022, they began testing GPT-4 functionality; in 2023, the Khanmigo pilot program will be available to a select few. Those interested in participating in the program can get on a waiting list.
Initial evaluations suggest that GPT-4 could help students learn specific topics in computer programming while gaining a broader appreciation for the relevance of their study. In addition, Khan Academy is experimenting with different ways in which teachers could use GPT-4's new features in the curriculum development process.
indigo.ai has a modular platform that is already prepared for generative models such as GPT-4. Testing of the model has already begun, and early results clearly demonstrate how chatbots developed with the help of this new technology can provide more natural, accurate, and contextualized communication between companies and people.