Hello and welcome to our weekly roundup!
How was your week? Hopefully, as well as Sam Altman's, because this week, OpenAI gave out an incredible amount of updates, products, and things to discuss. We have so many topics to discuss, so without further ado, let's get started.
This Creators’ AI Edition:
Featured Materials 🎟️
News of the week 🌍
Useful tools ⚒️
Weekly Guides 📕
AI Meme of the Week 🤡
AI Tweet of the Week 🐦
(Bonus) Materials 🎁
Featured Material 🎟️
OpenAI Introduces Canvas
OpenAI has many important updates this week, but perhaps the most important one is Canvas for ChatGPT. This new writing and coding interface opens in a separate window and allows multiple users to work on the same project. Canvas was created with GPT-4o and can be manually selected in the model selector in the beta version.
People use ChatGPT every day for help with writing and code. Although the chat interface is easy to use and works well for many tasks, it’s limited when you want to work on projects that require editing and revisions. Canvas offers a new interface for this kind of work.
To simplify, canvas is just a better way to interact with ChatGPT, no matter your task. The most popular chatbot now works like a test editor. You can directly write text or code, edit it, and use many hotkeys. In addition, canvas allows you to restore previous versions of your work.
Here are writing shortcuts:
Suggest edits: ChatGPT offers inline suggestions and feedback.
Adjust the length: Edits the document length to be shorter or longer.
Change reading level: Adjusts the reading level, from Kindergarten to Graduate School.
Add final polish: Checks for grammar, clarity, and consistency.
Add emojis: Adds relevant emojis for emphasis and color.
And shortcuts for coding:
Review code: ChatGPT provides inline suggestions to improve your code.
Add logs: Inserts print statements to help you debug and understand your code.
Add comments: Adds comments to the code to make it easier to understand.
Fix bugs: Detects and rewrites problematic code to resolve errors.
Port to a language: Translates your code into JavaScript, TypeScript, Python, Java, C++, or PHP.
Canvas for ChatGPT is already available for ChatGPT Plus and Team users worldwide. Enterprise and Edu users will get access next week. The company also plans to make canvas available to all ChatGPT Free users when it comes out of beta. Exactly how long we'll have to wait is still unknown.
Keep your mailbox updated with key knowledge & news from the AI industry
OpenAI’s DevDay 2024
In addition to the Canvas announcement, OpenAI also held its annual DevDay event, where it talked a lot about AI and the new tools that are available now. They are aimed at entrepreneurs and developers to help them build AI products and features more efficiently and cost-effectively.
In total, the company showed four such platforms.
Model Distillation
The company introduced a new way to extend the capabilities of smaller models, such as the GPT-4o mini, by fine-tuning them using the output of larger models (this process is called model distillation). To make it simpler, OpenAI built a model distillation suite within its API platform.
This solution allows developers to create custom datasets using advanced models such as GPT-4o and o1-preview to generate high-quality responses, fine-tune the smaller model to track those responses, and then create and run custom evaluations to measure how the model performs on certain tasks.
OpenAI says it will offer 2M free training tokens per day on GPT-4o mini and 1 million free training tokens per day on GPT-4o through Oct. 31 to help developers start with Distillation.
Prompt Caching
Prompt Caching is one of the major updates introduced at DevDay this year. This feature aims to reduce cost and latency to support developers. Many developers reuse the same context multiple times in multiple API calls when building AI applications, which adds complexity to the process. With prompt caching, developers can reuse frequently occurring prompts without paying full price each time.
The API automatically saves long prefixes for up to two hours. If it detects a new request with the same prefix, it will automatically apply a 50 percent discount to the input cost. This new feature could save developers of AI applications with very narrowly focused use cases a significant amount of money.
Caching is applied to the latest versions of the GPT-4o, GPT-4o mini, o1-preview, and o1-mini, as well as refined versions of these models.
Vision Fine-Tuning
Developers will now be able to customize the GPT-4o using not only text, but also images. The company says this will improve the model's ability to understand and recognize images, enabling applications such as advanced visual search functions, improved object detection for autonomous vehicles or smart cities, and more accurate analysis of medical images.
OpenAI will give away 1M free training tokens daily throughout October to get developers started. Starting in November, fine-tuning GPT-4o with images will cost $25 per million tokens.
Realtime API
The public beta of the Realtime API is a platform for building apps with low-latency, AI-generated voice responses. It allows developers to utilize the recently launched “Advanced Voice Mode” to build speech conversion apps by allowing them to choose from 6 available voices, making the process faster, cheaper, and more responsive.
With the Realtime API, audio is immediately processed by the API without linking multiple applications, making it much faster, cheaper, and more responsive. The API also supports function calls, meaning applications running on it can perform actions like ordering a pizza or making an appointment. Realtime will eventually be updated to handle multimodal experiences, including video.
News Of The Week 🌍
OpenAI Has Closed Biggest Funding Round
DevDay isn't the only event OpenAI distinguished itself with this week. The company also completed its long-awaited funding round, announcing that it raised $6.6B at a company valuation after investing $156B. According to CNBC, the round was led by Thrive Capital, joined by Microsoft, Nvidia, and other investors. To help you understand the ChatGPT developer's growth rate, earlier this year, OpenAI was valued at $80B, up $29B from 2023.
Yes, even for the ever-successful OpenAI, the last week seems to be a particularly good one. However, it was not without some unpleasant moments. One of the co-leads on the company's video generator, Sora, has left for Google. Tim Brooks announced in a post on X that after two years, he is moving to DeepMind, where he would also work on video generators and “world simulators.”
Thus, he joined the list of rare specialists who left Sam Altman's project this year.
Pika Labs Introduces New Video Generation Model
Startup Pika Labs, one of the premier AI developers for video generation, has announced the launch of Pika 1.5, its most advanced model. This platform emphasizes hyper-realism. In demos, the company showed how the AI generates realistic movements of people and creatures, dubbed “Pikaffects,” and offers more dynamic physics and sophisticated camera techniques.
Developers say these features will be useful in various applications, from professional movie production to beautiful social media videos.
So while we wait for the public release of Sora, some developers are already releasing major updates to their models and making deals like the one we saw between Runway and Lions Gate. It's kind of funny.
Google AI Podcasting Tool Goes Viral
One of Google's AI products suddenly has people liking it (honestly, I didn't think it would ever happen anymore). The podcasting tool launched in September as part of NotebookLM has been swirling on social media. The platform, powered by Gemini 1.5, allows people to upload content like links to websites, videos, and PDFs and then generate audio.
The AI creates a show called Deep Dive, in which male and female voices discuss what the user has uploaded. The voices are very realistic: episodes are punctuated with short human-like phrases like “Wow,” “Oh, right,” and “Wait, let me get that right.” The generated presenters even interrupt each other to make their point.
Given that this is Google's significant (albeit accidental) achievement in recent months, it's not worth skipping this product. If you want to figure out how to test NotebookLM, the guys at Wired wrote a great guide.
Sharing is caring! Refer someone who recently started a learning journey in AI. Make them more productive and earn rewards!
Microsoft Unveils Huge Update for Its Copilot
Earlier this week, Microsoft unveiled a major redesign of its Copilot. The AI gained voice and vision capabilities to become a more personalized and helpful assistant. The design has also changed dramatically across all platforms: on mobile devices, the web version, and a dedicated Windows app. As The Verge reports, the UI is now card-based and strongly resembles interactions with Inflection AI (whose employees Microsoft hired earlier this year).
Albeit a little late, we got another solid competitor to Google's Assistant, the updated Siri and the latest version of ChatGPT. To be honest, I don't think Microsoft has much chance in the mobile market, but the desktop version of Copilot may well find its user.
Meta May Train Its AI on Images Made with Ray-Ban Glasses
Sounds scary, but it's actually not all that bad. TechCrunch journalists contacted Meta representatives to find out if the company will be able to use photos taken with Ray-Ban smart glasses to train its models. Strictly speaking, yes and no. Photos simply taken for personal use will not be shared with the company. But if a user wants to analyze them with Meta AI, the image will fall under different rules and become fodder for large language models.
We seem to be getting closer and closer to the cyberpunk world.... Or am I rushing things too fast? Either way, it's good to know these things in advance. And if you have dissemination-sensitive data in your library, be careful how you use it.
Useful Tools ⚒️
Lookie AI – A faster way to absorb knowledge from YouTube
Clones – Chat, learn, and grow with AI companions
Blaze – Create 1-click content in your brand style with AI
PodSnap.AI – Get summaries of podcast episodes as they go live
buzzabout – Audience insights from 1B+ online discussions in 2 mins
If you're an entrepreneur, creator, or freelancer who needs to feel the pulse of your target audience, Buzzabout may be just what you need. The tool uses AI to gather analytics from billions of social media discussions (the platform currently handles TikTok, YouTube, and Reddit). Buzzabout's interface is presented as a chatbot that accepts natural language queries and then generates detailed reports with key metrics, charts, and data.
Share this post with friends, especially those interested in AI stories!
Weekly Guides 📕
How to Build A $10,000 Website Using AI in 5 minutes (No Code Required!)
These AI Meme Videos Are Going Viral (Here's How They're Made)
How to Use AI to Write a Book 📖 in 2024 | Step-by-Step Guide
Invideo AI 2024 Tutorial - Best AI Video Generator
Pictory AI Review & Tutorial - How To Use Pictory For Beginners (2024)
AI Meme Of The Week 🤡
AI Tweet Of The Week
Quite interesting (and unconfirmed!) information from Reuters. If their source is correct, then perhaps Musk wasn't so far from the truth, and OpenAI did indeed set up a “Game of Thrones”. What do you think?
(Bonus) Materials 🏆
AI Doesn’t Know Much About Golf. Or Farming. Or Mortgages. Or … - WSJ
Hacking Generative AI for Fun and Profit - Wired
Generative AI & Licensing: A Special Report - Variety
AI Creativity: Genius or Gimmick?
The Gray Area | Yuval Noah Harari on the AI revolution
Share this edition with your friends!