OpenUSD Could Enable a Real Metaverse

Too often, the term “metaverse” is just used to mean the kind of virtual reality we’ve been seeing for a decade, but it’s supposed to mean interoperable 3D and VR environments where you can share and integrate components from different sources. Now that Pixar’s Universal Scene Description file format is not just an open source project (OpenUSD) that acts as a de facto specification, but is being developed as a standard through the Linux Foundation by the Alliance for OpenUSD, USD could emerge as a way to make that real.

“OpenUSD is a highly extensible framework for describing, composing, simulating, and collaboratively navigating, and constructing 3D scenes at any scale,” Aaron Luk, who oversees USD engineering at NVIDIA and previously co-developed USD at Pixar, told The New Stack.

“OpenUSD’s unique composition capabilities provide rich and varied ways to aggregate assets into larger assemblies and enable collaborative workflows that accelerate individuals and teams across their multi-app workflows and projects.”

Unlike existing 3D interchange options, USD — which was originally created to enable workflows that moved millions or even billions of individual objects in an animated movie scene between multiple 3D tools — isn’t just a file format for importing and exporting. “You can have streaming or in-memory representations of USD,” AOUSD chairman and Pixar CTO Steve May told us at the launch of the Alliance.

USD assets can have multiple topologies, allowing them to render differently in different conditions. It’s also extensible using schemas, which already cover 3D metadata like BIM data, sensors and other object properties. It handles hierarchies of scene layering and composition, not just the geometry of individual 3D assets — a scene from a Pixar movie could be made up of hundreds or thousands of USD files.
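To make the layering idea concrete, here is a minimal sketch in Python that emits two small text-format USD layers (`.usda`): an asset layer for a chair, and a scene layer that references that asset twice at different positions. The file names, prim names and helper functions are hypothetical, chosen only for illustration. The script writes plain text and does not use the OpenUSD libraries, so treat it as a picture of the file structure rather than a production workflow.

```python
from pathlib import Path
import tempfile

# Asset layer: a reusable chair, authored once (hypothetical example asset).
CHAIR_LAYER = """#usda 1.0
(
    defaultPrim = "Chair"
)

def Xform "Chair"
{
    def Cube "Seat"
    {
        double size = 0.5
    }
}
"""


def placed_reference(name: str, asset_path: str, x: float) -> str:
    """Return a prim that references the asset layer and adds its own position."""
    return (
        f'    def "{name}" (\n'
        f"        references = @{asset_path}@\n"
        "    )\n"
        "    {\n"
        f"        double3 xformOp:translate = ({x}, 0, 0)\n"
        '        uniform token[] xformOpOrder = ["xformOp:translate"]\n'
        "    }\n"
    )


def build_scene(asset_path: str, count: int) -> str:
    """Return a scene layer that places `count` references to one asset."""
    chairs = "\n".join(
        placed_reference(f"Chair_{i}", asset_path, float(i))
        for i in range(1, count + 1)
    )
    return '#usda 1.0\n\ndef Xform "Room"\n{\n' + chairs + "}\n"


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp())
    (out_dir / "chair.usda").write_text(CHAIR_LAYER)
    (out_dir / "room.usda").write_text(build_scene("./chair.usda", 2))
    print(f"Wrote chair.usda and room.usda to {out_dir}")
```

Because the room layer only points at the chair layer, an edit to `chair.usda` shows up in both placed chairs the next time a USD-aware tool composes the scene.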

“That’s where the power of AOUSD really lies, in that ability to aggregate and modify large numbers of assets and then combine them into a complete picture,” May said.

And that’s relevant far beyond entertainment, May maintained.

“Whether it’s immersive 3D content, interactive experiences, new spatial computing platforms, or scientific and industrial applications, OpenUSD will become the fundamental building block on which all 3D content will be created.”
Steve May, AOUSD chairman and Pixar CTO

From Media to Metaverse

OpenUSD is already a de facto standard supported by a wide range of tools, from Adobe Photoshop, Blender, Autodesk Maya, Esri and Adobe’s new line of Substance 3D tools, to AR tools like Adobe Aero, Microsoft’s Mixed Reality Toolkit and NVIDIA Omniverse, as well as the visionOS SDK for Apple Vision Pro (which uses USDZ, a zipped version of USD).
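Since a USDZ package is a zip archive (one whose files are meant to be stored without compression), ordinary zip tooling can look inside it. The sketch below uses only Python's standard `zipfile` module; the `describe_usdz` helper is a hypothetical name, and the script only lists what is in the package rather than validating it against the USDZ rules.

```python
import zipfile


def describe_usdz(path: str) -> list[tuple[str, int, bool]]:
    """List (file name, size in bytes, stored uncompressed?) for each entry."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size, info.compress_type == zipfile.ZIP_STORED)
            for info in archive.infolist()
        ]


if __name__ == "__main__":
    import sys

    # Usage: python describe_usdz.py path/to/asset.usdz
    for name, size, stored in describe_usdz(sys.argv[1]):
        print(f"{name}\t{size} bytes\t{'stored' if stored else 'compressed'}")
```

Run against a package exported from a 3D tool, this would typically show a USD layer alongside the textures it uses.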

But as May explained, “right now, the behavior of OpenUSD is defined by what’s in that open source distribution that we provide, which means that if something in the code changes, effectively the way that works can change.”

With Apple and others investing so much in USD, it’s time to turn it into formally defined data specifications that allow for interoperability between tools and ecosystems that stretch beyond media and entertainment.

The open source nature of OpenUSD (and the influence of Pixar on the software market for illustration and animation) means USD is already widely used in game development as well as the entertainment industry (in visual effects as well as animation). Now, it’s being adopted in architecture, engineering, construction, automotive and manufacturing — anywhere that complex 3D assets and environments are important.

IKEA already uses computer-generated images instead of photographs in its catalog. It recently joined AOUSD to be involved in creating a standard it can use to manage 3D content for furniture that can go from CAD models to manufacturing guides, to marketing that shows an entire 3D scene with multiple items, to an AR app that shows shoppers what those items will look like in their own home, and on to the assembly instructions that come in the package. Lowe’s has also signed up, for similar reasons.

If OpenUSD takes off as the format for that, you can imagine picking furniture and décor from different stores to see together — because you don’t buy everything in the same place. Use a 3D phone scanning app like MagiScan or Luma AI and you can create your own USD assets of physical objects to include in a 3D environment, so you can include what you already own.

“[OpenUSD] enables robust interchange between digital content creation, CAD, and simulation tools with its expanding ecosystem of schemas, covering domains like geometry, shading, lighting and physics,” Luk explained.

That’s why it is now proving ideal for industrial applications, from augmented reality and spatial computing to Industrial IoT (IIoT), factory digital twins and computer vision (think robots and self-driving cars). It’s also relevant for simulations and scientific computing like computational fluid dynamics and finite element analysis, where you work with 3D meshes; an area that increasingly feeds into product design and manufacturing processes.

New USD schemas could include the electrical, physical and mechanical properties of materials and objects. NVIDIA has already contributed UTF-8 support for international characters, geospatial coordinates, metrics assembly, content validation and visualization as a service.

Having a universal data interchange with interoperability across different ecosystems of tools is going to be key to an industrial metaverse that’s more than just a marketing slogan. Building a metaverse of 3D virtual worlds, interactive experiences and 3D industrial environments will require layering and compositing a lot of different elements, like the 3D equivalent of the web, Luk suggested.

“OpenUSD has the potential to be the ‘HTML of the 3D world’ – an open, unifying technology that doesn’t require the obsolescence of other formats, but rather can contain and integrate them as necessary.”
Aaron Luk, who oversees USD engineering at NVIDIA

If you’re an architect, you could use it to build a model of how the sun lights a landscape, with accurate shadows to help you forecast energy needs for heating and cooling — and do it once, then use that in multiple designs rather than creating the path of the sun to visualize a sports stadium, and then doing it all over again for an apartment block or a new road layout. Or even easier, you could get an OpenUSD plugin that does that, across the multiple applications you use.

BMW is using NVIDIA’s OpenUSD-based Omniverse platform to animate processes in the virtual factories it uses to plan manufacturing facilities before it builds them, laying out factory workflows, checking for potential collisions, or experimenting with the best place to put an industrial robot by combining 3D data from systems used to design buildings, vehicles, equipment and logistics with the simulation of processes and human workers. The BMW Factoryverse lets teams walk through the virtual factory and make changes to different layers without interfering with each other’s work.

Generative AI Goes 3D

With tools like Cesium, you can combine your own 3D assets and data from digital twins with real-world 3D geospatial data from multiple sources: for example, you could take an architectural model built in Autodesk and place it on Google’s Photorealistic 3D Tiles, or lay out an entire interactive city built in Esri ArcGIS on Bing Maps imagery.

If you want a smaller-scale scene for your virtual environment, you could shell out for a 360-degree camera — or you could call a generative AI service to build you a photorealistic panorama from a prompt: Adobe’s Firefly generative models will be available as APIs in NVIDIA Omniverse, and they let designers start with an image or a sketch. “We are also working on 2D-to-3D USD generative AI models,” Luk noted.

Or you could add in AI-animated characters using Wonder Studio or generate facial animations and gestures from audio files with Omniverse Audio2Face, again using OpenUSD to bring them into the environment.

“OpenUSD is the portal for 3D workflows to access generative AIs,” Luk said. “Software vendors and tool builders with OpenUSD-connected applications can create their own proprietary large language models to act and operate across their portfolio of tools — vastly streamlining and enhancing the user experience.”

In fact, NVIDIA has built a foundation model of its own, ChatUSD, which can parse USD scenes and generate USD scripts, and which software developers can fine-tune with their own data.

That lets developers treat Omniverse as an OpenUSD portal for tools from multiple software providers that they might not be familiar with: “A proprietary LLM or RAG chatbot becomes a co-pilot to the user and orchestrates various ChatUSD or 2D-to-3D-based agents in each of the software vendors’ tools to perform actions. From the user’s perspective, they prompt an arbitrary UI in a viewport that can visualize USD data to complete a task, and tasks are coordinated and performed automatically without having to complete them manually in separate tools.”

Not only does OpenUSD provide a common framework for creating, describing and sharing 3D scenes and assets: it could also be a way to stitch together a whole workflow by combining different tools and AI services.

Making that work will rely on OpenUSD being a comprehensive and robust standard.

Evolving a Standard

Making sure the schemas that make OpenUSD extensible work across a wide range of industries and ecosystems also means making sure the data models that underlie schemas for all this extensibility are specified consistently. That’s a big part of the work the AOUSD will handle, along with creating a full, normative specification for USD, covering Foundational Data Types, Foundational Data Models, Core File Formats, Composition Engine, and Stage Population.

“This is critical work that will be foundational for specifications in areas such as materials, physics, and solid modeling in the future,” Luk told us: the first draft of the specification is due in 2024, with final drafts expected to be ratified before the end of 2025.

The Alliance was formed in August 2023 by Apple, Pixar, Adobe, Autodesk and NVIDIA: it recently announced that Meta has now joined too, along with Cesium, Chaos, Epic Games, Foundry, Hexagon, OTOY, SideFX, Spatial and Unity, as well as IKEA and Lowe’s. It’s also now collaborating with the Khronos Group, which will reassure developers who are currently faced with handling two 3D asset standards that are starting to overlap.

Khronos’ existing glTF format is already an open source ISO standard with very efficient 3D object representation and it covers scenes as well as 3D assets, but it’s best thought of as “JPEGs for 3D” rather than the high resolution, ultra-realistic 3D scenes USD excels at. While OpenUSD focuses on interoperability and authoring workflows, glTF’s strength is as a publishing format for real-time display. Plus, scenes in 3D environments like the metaverse will likely include other objects and formats, like audio, video and other media, so the two groups are working together in the Metaverse Standards Forum’s 3D Asset Interoperability Working Group to make sense of how the two standards will work together.

There are already a variety of tools that promise to convert between glTF and USD but that doesn’t always preserve all the details, so the group will work on defining “scene elements such as objects, geometry, materials, lights, physics, behaviors in a form that allows straightforward and lossless conversion” between glTF and USD, especially for interactive environments as well as complex static scenes.
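A toy example shows why naive conversion loses detail. The Python sketch below reads the `nodes` array of a glTF JSON document and emits one USD `Xform` prim per node in text (`.usda`) form, carrying over only the name and translation. Everything else on the node is reported as dropped, and the node hierarchy is flattened. The function names are hypothetical and this is not how any particular converter works; it only illustrates the kind of mapping the working group would have to define for every scene element.

```python
import json
import re


def to_prim_name(name: str, index: int) -> str:
    """USD prim names must be identifiers; glTF names are free-form strings."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned:
        return f"node_{index}"
    if cleaned[0].isdigit():
        return f"_{cleaned}"
    return cleaned


def gltf_nodes_to_usda(gltf_text: str) -> tuple[str, list[str]]:
    """Return (usda text, list of node properties this sketch could not carry over)."""
    gltf = json.loads(gltf_text)
    # glTF scenes are Y-up and measured in meters, so record that in the layer metadata.
    lines = ["#usda 1.0", "(", "    metersPerUnit = 1", '    upAxis = "Y"', ")", ""]
    dropped = []
    for index, node in enumerate(gltf.get("nodes", [])):
        prim = to_prim_name(node.get("name", ""), index)
        tx, ty, tz = node.get("translation", [0, 0, 0])
        lines += [
            f'def Xform "{prim}"',
            "{",
            f"    double3 xformOp:translate = ({tx}, {ty}, {tz})",
            '    uniform token[] xformOpOrder = ["xformOp:translate"]',
            "}",
            "",
        ]
        # Meshes, rotations, children and so on are ignored here, which is the lossy part.
        dropped += [f"{prim}.{key}" for key in node if key not in ("name", "translation")]
    return "\n".join(lines), dropped


if __name__ == "__main__":
    sample = '{"nodes": [{"name": "Desk Lamp", "translation": [1, 0, 2], "mesh": 0}]}'
    usda, lost = gltf_nodes_to_usda(sample)
    print(usda)
    print("Not converted:", lost)
```

Even in this tiny case the node's original name has to be altered and its mesh is left behind, which is the sort of gap a lossless mapping would need to close.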

AOUSD is starting out under the governance of the Linux Foundation as part of the Joint Development Foundation (JDF), which May described as having expertise in incubating early-stage technologies into standards, and as complementary to the Academy Software Foundation, which already has a large working group on OpenUSD and focuses on supporting the use of open source software in the film industry. Picking the JDF is a sign of the broader opportunities for USD, and you can expect to see the specification transition to a larger standards body like ISO or ECMA in the long run, May told us. “The Alliance comprises its own standards body as an incubation point to develop it to the point where we can then move on to common standards bodies.”

Delivering a consumer metaverse will depend on navigating the complexity of competing economic interests between organizations that fiercely defend their expensive IP and content as much as on any technical interoperability. But for the industrial metaverse, there are much stronger incentives for cooperation and collaboration. OpenUSD is starting to gather the right momentum for delivering technology that could actually deserve to use the name.
