Welcome to this week’s AI news roundup! In this post, we look at two AI breakthroughs that are shaping the future of video creation and materials science, plus a trio of new research projects from Meta. Pika Labs, founded by two Stanford computer science PhD students, is making AI video creation intuitive and user-friendly. Google DeepMind’s AI tool, GNoME, is transforming materials science by achieving in days what would have taken centuries with traditional methods. Let’s dive in!
Pika Labs: Transforming Video Creation
Pika Labs was founded in April 2023 by two Stanford computer science PhD students who wanted a more intuitive, user-friendly platform for video creation. Inspired by the success of Runway, a generative AI video startup, they launched Pika Labs with a focus on text-to-video, image-to-video, and video-to-video conversion. Within a year of its launch on Discord, Pika Labs has attracted over 500,000 users and $55 million in funding.
Pika 1.0, the latest version of the platform, offers a sleek, minimalist interface that takes its cues from Runway while adding tools without sacrificing simplicity. It also borrows from Midjourney, with an explore mode that shows users a range of artworks for inspiration. The platform retains the core functionality of the Discord version, and a mobile app is planned for the future.
One standout feature of Pika 1.0 is video-to-video transformation, which lets users restyle existing videos using text prompts. The polished execution of this feature sets Pika Labs apart. Pika 1.0 also introduces AI-assisted changes to video aspect ratio, catering to social media formats such as TikTok, Instagram, and YouTube Shorts, and it offers generative fill editing tools that let users replace elements in a scene through simple prompts. These features are especially promising for the entertainment industry. Pika 1.0’s CGI-style animation, with lifelike lighting and convincing facial expressions, invites comparison with renowned studios like Pixar and DreamWorks. The updated version also extends clip lengths beyond three seconds, giving users more flexibility and creative freedom in their video projects.
While Pika Labs is making waves in the AI video landscape, it faces competition following Adobe’s acquisition of Rephrase.ai and Meta’s introduction of Emu Video. Interest in AI-driven video technology is escalating rapidly, as shown by Stability AI’s release of Stable Video Diffusion and Runway’s updates to its own tools. Pika Labs’ innovations are shaping the future of video creation, but let’s now turn to Google DeepMind’s groundbreaking AI tool in the field of materials science.
GNoME: Google DeepMind’s AI Breakthrough
Google DeepMind has recently unveiled a new AI tool called GNoME (Graph Networks for Materials Exploration), which aims to revolutionize materials science. GNoME can accomplish in a matter of days what would have taken traditional crystal research methods an estimated 800 years. By combining graph neural networks with quantum-physics calculations, GNoME has identified over 2.2 million new crystals, including approximately 380,000 stable compounds with potential applications in future technologies.
Traditionally, discovering new inorganic crystals was a slow and painstaking process. GNoME changes the game by not only proposing potential new materials but also predicting their stability. This approach significantly accelerates materials discovery and opens the door to advances that were previously out of reach.
GNoME’s efficiency comes from active learning, a training loop that pairs the AI model with quantum-physics calculations. The model was first trained on publicly available data about known materials. It then proposed new candidate materials, whose stability was checked using density functional theory (DFT), an established quantum-mechanical method, and the verified results were fed back into training. Over repeated rounds, the model’s accuracy in predicting material stability rose from approximately 50% to around 80%. DeepMind also improved the model’s efficiency, raising the discovery rate (the share of proposed candidates that turn out to be stable) from below 10% to over 80%. This improvement reduces the computational resources required for each discovery.
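To make the propose, verify, retrain loop more concrete, here is a minimal toy sketch in Python. It is not DeepMind’s code: the integer "candidates", the hidden stability rule inside `dft_is_stable`, the midpoint-threshold "model", and every function name are hypothetical stand-ins chosen only so the loop fits in a few lines. The helper `evaluations_per_discovery` simply assumes that the average number of expensive checks per stable material is the reciprocal of the discovery rate.

```python
def dft_is_stable(candidate):
    """Stand-in for an expensive DFT check (hypothetical hidden rule)."""
    return candidate >= 70


def fit_threshold(labels):
    """Toy 'model': midpoint between the largest candidate known to be
    unstable and the smallest candidate known to be stable."""
    lo = max(c for c, stable in labels.items() if not stable)
    hi = min(c for c, stable in labels.items() if stable)
    return (lo + hi) / 2


def active_learning(pool, seed_labels, rounds=4, batch=5):
    """Run propose -> verify -> retrain rounds; return hit rates and labels."""
    labels = dict(seed_labels)
    hit_rates = []
    for _ in range(rounds):
        # Retrain the toy model on everything verified so far.
        threshold = fit_threshold(labels)
        # The model proposes unlabeled candidates it predicts to be stable.
        proposed = [c for c in pool if c not in labels and c >= threshold]
        if not proposed:
            break
        # Take an evenly spaced batch across the predicted-stable range.
        step = max(1, len(proposed) // batch)
        picks = proposed[::step][:batch]
        # The "oracle" verifies each pick; results feed the next round.
        results = {c: dft_is_stable(c) for c in picks}
        hit_rates.append(sum(results.values()) / len(picks))
        labels.update(results)
    return hit_rates, labels


def evaluations_per_discovery(hit_rate):
    """Average number of expensive checks needed per stable material."""
    return 1 / hit_rate


if __name__ == "__main__":
    rates, _ = active_learning(range(100), {10: False, 95: True})
    print("hit rate per round:", rates)
    print("checks per discovery at 10%:", evaluations_per_discovery(0.10))
    print("checks per discovery at 80%:", evaluations_per_discovery(0.80))
```

In this toy setup the hit rate climbs from 0.6 in the first round to 1.0 in the fourth as verified results accumulate. Under the reciprocal assumption, moving from a 10% to an 80% discovery rate cuts the expensive checks per stable material from 10 to 1.25, which is the kind of saving the paragraph above describes.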
Some of GNoME’s predictions have already been validated in practice: external labs have independently synthesized 736 of the predicted new crystals. Data on the 380,000 most stable candidates has been made publicly available, giving researchers worldwide a valuable resource for developing new technologies.
While GNoME demonstrates remarkable progress in materials science, the sheer number of theoretical structures identified by systems like it far exceeds current synthesis capacity. This highlights the need for AI that not only predicts new materials but also helps decide which ones are worth synthesizing, bridging the gap between theoretical discovery and practical application.
Meta’s Cutting-Edge AI Research Projects
In addition to Pika Labs and Google DeepMind, Meta has introduced three groundbreaking AI research projects that push the boundaries of technology across diverse fields like video learning, translation, and audio generation.
The first project, Ego-Exo4D, developed in collaboration with 15 global university partners, represents a leap forward in video learning and multimodal perception. This dataset and benchmark combines egocentric (first-person) and exocentric (third-person) views of complex human activities like sports, music, and cooking. The dataset, consisting of over 1,400 hours of video, will be open-sourced in December, serving as a valuable resource for AI research in augmented reality systems, robotic learning, and social networks.
The second project, Seamless Communication, aims to revolutionize AI language translation with four new models. These models, SeamlessExpressive, SeamlessStreaming, SeamlessM4T v2, and a combined Seamless model, focus on preserving speech nuances, reducing translation latency, and enabling smoother voice and text communication across languages. SeamlessExpressive, as showcased in a demo, shows Meta’s strides in making cross-language communication more natural and barrier-free.
The third project, Audiobox, marks Meta’s foray into generative AI for audio. Following the success of its predecessor, Voicebox, Audiobox allows users to generate specific voices and sound effects through voice input and text prompts, empowering creators to produce custom audio content with ease.
Conclusion
In conclusion, Pika Labs and Google DeepMind are reshaping their respective fields with their AI technologies. Pika Labs’ intuitive platform is changing video creation with its comprehensive feature set, while Google DeepMind’s GNoME has accelerated materials science research by achieving in days what would have taken centuries. These advancements open up new possibilities and pave the way for applications across many industries. As AI continues to evolve, we can expect even more remarkable breakthroughs that push the boundaries of what is possible.