Tuesday, April 16, 2024

How Omniverse Created A Stunning Demo At GTC With Toys [2022] ?


Table of Contents

  • Introducing Toy Jensen
  • Real CEO and Digital Kitchen
  • At the speed of the Omniverse

Introducing Toy Jensen

It could only happen in NVIDIA Omniverse, NVIDIA’s virtual world simulation and collaboration platform for 3D workflows.

And it happened during an interview with Toy Jensen, a virtual toy model of NVIDIA CEO Jensen Huang.

One of Toy Jensen’s creators began to ask, “What’s the coolest thing about…?”

Unfazed, the tiny Toy Jensen paused for a moment to consider his answer.

“The most wonderful people are those who are kind to others,” replied Toy Jensen.

Cutting-edge computer graphics, physics simulation, a live CEO, and a supporting cast of AI-powered avatars all came together to deliver the NVIDIA GTC keynote, powered by Omniverse.

Along the way, a little bit of soul got mixed in, too.

The AI-driven exchange (the question-and-answer session with Toy Jensen) offered an unexpected glimpse into the depth of Omniverse’s technology, so it was added as a highlight of the keynote.

“Omniverse is a hub for different research areas to converge, collaborate and work together,” said Kevin Margo, a member of NVIDIA’s creative team who led work on the presentation. “Omniverse facilitates the convergence of all of them,” he said.

Toy Jensen’s ad lib capped a presentation that seamlessly combined a real-life CEO with virtual and real environments, in which Huang showed viewers how NVIDIA’s technology is bringing AI, graphics, and robotics together with humans in real and virtual worlds.

Real CEO and Digital Kitchen

The CEO that the viewer sees is completely real, but the surrounding environment changes according to the story that the CEO tells.

Huang’s keynote, as viewers saw it, seemed to begin in a kitchen, where so many presentations have taken place during the COVID pandemic.

Then Huang’s kitchen, modeled down to the screws holding the cabinets in place, slid out of view as he walked toward the glittering lobby of a virtual recreation of the Endeavor office building (*Translation Note 1).

“One of our goals is to find ways to make the keynote stand out,” says Margo. “We’re always looking for that special moment where we can do something new and fantastic and showcase the latest innovations from NVIDIA.”

It was the start of a visual journey that took Huang from the lobby to Shannon’s, a gathering place within Endeavor, then through a holodeck and a data center, with stops at a real robotics lab and the Endeavor exterior.

Virtual environments such as Huang’s kitchen were created by the team using familiar tools supported by Omniverse, such as Autodesk Maya, 3ds Max and Adobe Substance Painter.

Omniverse connected these tools in real time. This accelerated the work by letting each team member see, as they happened, the changes colleagues made in different tools.

“The acceleration of work [with Omniverse] was very important,” said Margo.

Once live filming began, the blending of virtual and real moved quickly.

A small on-site video production team recorded Huang’s speech in just four days, starting October 30, using a spare conference room at NVIDIA’s Silicon Valley headquarters.

Omniverse enabled the NVIDIA team to project the dynamic virtual environments created by their colleagues onto a screen behind Huang.

As a result, when the scene around Huang changed, so did the light hitting him, helping him blend better into the virtual environment.

Also, as he moved through the scene and the camera moved, the environment around him changed.

“As the camera moves, the perspective and parallax of the world on the video wall changes with the camera,” says Margo.
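The parallax effect Margo describes can be illustrated with a little geometry. The following is only a minimal sketch, not NVIDIA's production method: it assumes a flat video wall lying in the plane z = 0, a camera in front of it (z > 0), and virtual scene points behind it (z < 0). The function name `wall_position` and the coordinate setup are hypothetical, chosen for illustration.

```python
def wall_position(camera, point):
    """Return the (x, y) spot on the wall (plane z = 0) where a virtual
    point must be drawn so it lines up correctly from the camera.

    camera: (x, y, z) with z > 0, in front of the wall (assumed setup)
    point:  (x, y, z) with z <= 0, on or behind the wall (assumed setup)
    """
    cx, cy, cz = camera
    px, py, pz = point
    if cz <= 0 or pz > 0:
        raise ValueError("camera must be in front of the wall, point on or behind it")
    # Intersect the camera-to-point line with the plane z = 0.
    t = cz / (cz - pz)
    return (cx + t * (px - cx), cy + t * (py - cy))


# Move the camera one unit to the right and compare how far each
# virtual point's drawn position shifts on the wall.
near = (0.0, 0.0, -1.0)   # virtual object just behind the wall
far = (0.0, 0.0, -10.0)   # virtual object deep in the scene

for name, p in (("near", near), ("far", far)):
    before = wall_position((0.0, 0.0, 2.0), p)
    after = wall_position((1.0, 0.0, 2.0), p)
    print(name, "shift on wall:", after[0] - before[0])
```

In this simplified model, a point lying exactly on the wall never moves, while deeper points slide along the wall with the camera. That difference in shift is what makes the flat image read as a scene with depth when viewed through the moving camera.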

Huang could also see the projected (virtual) environment around him, which helped him navigate each scene.

(*Translation Note 1) A video of the NVIDIA GTC 2021 keynote is available on YouTube. From 13:00 to 14:11 in that video, which is also inserted at the end of this article, you can watch the virtual kitchen being dismantled and replaced by the Endeavor office building.

At the speed of the Omniverse

All of this accelerated the work of NVIDIA’s production team. Rather than adding elaborate digital sets in post-production, the team captured most of what it needed in camera with each shot.

As a result, the video production team quickly produced a presentation that seamlessly blended a real CEO with virtual and real-world settings.

But Omniverse didn’t just accelerate collaboration between creators working with physical and digital elements under deadline pressure. It also served as the platform that tied together the series of demos introduced in the keynote.

Huang announced Omniverse Avatar, which lets developers use Omniverse to create intelligent, interactive agents that can see, talk, converse on a wide range of subjects, and naturally understand spoken intent.

Omniverse brings together a collection of technologies, from ray tracing to recommender systems, and the keynote showed off a series of stunning demos combining them.

Huang showed how Project Tokkio for Omniverse Avatar connects Metropolis computer vision, Riva speech AI, avatar animation and graphics into a real-time conversational AI robot: the Toy Jensen Omniverse Avatar. The demo was an instant hit.

The conversation between three NVIDIA engineers and the tiny Toy Jensen model showcased more than technical brute force; it was a professional, natural question-and-answer session.

The Q&A between Toy Jensen and the engineers showed how photorealistic modeling of Toy Jensen and his environment (down to the glint on Toy Jensen’s glasses when he moves his head), combined with NVIDIA’s Riva speech synthesis technology powered by the Megatron 530B large language model, can support natural and fluid conversation.

To produce this demo, NVIDIA’s creative team created a digital model in Maya and Substance, and Omniverse did the rest.

“None of it is manual; you just load the animation assets and speak [to the avatar],” Huang said.

Huang also showed a second demo of “Project Tokkio”. It’s a customer service avatar installed in a restaurant kiosk that can see two customers, have a conversation with them, and understand what they’re saying.

Instead of Megatron, however, this avatar relies on a model that integrates the restaurant’s menu, so it can smoothly guide customers through their choices.

The same stack of technologies is useful for human-to-human conversation. Huang showed how Project Maxine can add cutting-edge video and audio capabilities to virtual collaboration and video content creation applications.

The demo showed a woman speaking English on a video call in a noisy cafe; with the background noise removed, her voice was clearly audible. As she spoke, her words were transcribed and translated in real time into French, German and Spanish.

Thanks to Omniverse, an avatar could speak with her same voice and intonation.

These demos are possible because Omniverse integrates advanced speech AI, computer vision, natural language understanding, recommendation engines, facial animation and graphics technology through Omniverse Avatar.

Omniverse Avatar’s speech recognition is based on NVIDIA Riva, a software development kit that recognizes speech in multiple languages. Riva is also used to generate human-like speech responses using its text-to-speech capabilities.

Omniverse Avatar’s natural language understanding is based on the Megatron 530B large language model, which can recognize, understand, and generate human language.

Megatron 530B is a pretrained model that can complete sentences and answer questions covering a wide range of subjects with little or no additional training. It can also summarize long and complex stories, translate them into other languages, and handle many domains without special training.

Omniverse Avatar’s recommendation engine is powered by NVIDIA Merlin, a framework that lets companies build deep learning recommendation systems that can process large amounts of data and make smarter suggestions.

Perception is powered by NVIDIA Metropolis, a computer vision framework for video analytics.

Avatar animation is powered by NVIDIA Video2Face and Audio2Face, AI-driven 2D and 3D facial animation and rendering technologies.

All of these technologies are composed into an application and processed in real time using the NVIDIA Unified Compute Framework.

Packaged as scalable and customizable microservices, these technologies can be securely deployed, managed and orchestrated across multiple locations by NVIDIA Fleet Command.
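To make the idea of chaining these capabilities more concrete, here is a minimal sketch of how one conversational turn might flow through a sequence of stages. It is an illustration only: every name in it (`AvatarTurn`, `recognize_speech`, `generate_reply`, and so on) is hypothetical, the stages are toy stand-ins, and none of it uses or represents the actual Riva, Megatron, Audio2Face or Unified Compute Framework APIs. Perception and recommendation stages are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AvatarTurn:
    """State carried through one conversational turn (hypothetical)."""
    audio_in: str            # stand-in for captured microphone audio
    transcript: str = ""     # filled in by the speech recognition stage
    reply_text: str = ""     # filled in by the language stage
    audio_out: str = ""      # filled in by the speech synthesis stage
    face_frames: int = 0     # filled in by the animation stage


Stage = Callable[[AvatarTurn], AvatarTurn]


def recognize_speech(turn: AvatarTurn) -> AvatarTurn:
    # Toy stand-in for speech recognition: the "audio" is already text.
    turn.transcript = turn.audio_in.strip().lower()
    return turn


def generate_reply(turn: AvatarTurn) -> AvatarTurn:
    # Toy stand-in for a language model: a lookup table of canned replies.
    canned = {"hello": "Hello, nice to meet you."}
    turn.reply_text = canned.get(turn.transcript, "I am not sure yet.")
    return turn


def synthesize_speech(turn: AvatarTurn) -> AvatarTurn:
    # Toy stand-in for text-to-speech: wrap the text in a marker.
    turn.audio_out = f"<speech:{turn.reply_text}>"
    return turn


def animate_face(turn: AvatarTurn) -> AvatarTurn:
    # Toy stand-in for facial animation: one fake frame per spoken word.
    turn.face_frames = len(turn.reply_text.split())
    return turn


PIPELINE: List[Stage] = [
    recognize_speech,
    generate_reply,
    synthesize_speech,
    animate_face,
]


def run_pipeline(turn: AvatarTurn, stages: List[Stage]) -> AvatarTurn:
    """Pass the turn through each stage in order."""
    for stage in stages:
        turn = stage(turn)
    return turn


result = run_pipeline(AvatarTurn(audio_in="  Hello "), PIPELINE)
print(result.reply_text, result.face_frames)
```

Keeping each stage behind the same small interface means one stage can be swapped without touching the others, for example replacing the general language stage with a menu-specific one, as in the restaurant kiosk demo described earlier.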

With these technologies, Huang was able to tell the big story of how NVIDIA Omniverse is transforming multitrillion-dollar industries.

All demos were built on Omniverse. Omniverse brought it all together: a real CEO, real and virtual environments, and a series of demos made within Omniverse.
