Using Parallel Maya

Overview

This guide describes the Maya features for accelerating playback and manipulation of animated scenes. It covers key concepts, shares best practices/usage tips, and lists known limitations that we will aim to address in subsequent versions of Maya.

This guide will be of interest to riggers, TDs, and plug-in authors wishing to take advantage of speed enhancements in Maya.

If you would like an overview of related topics prior to reading this document, check out Supercharged Animation Performance in Maya 2016.

Key Concepts

Starting from Maya 2016, Maya accelerates existing scenes by taking better advantage of your hardware. Unlike previous versions of Maya, which were limited to parallelizing individual nodes, Maya now includes a mechanism for scene-level graph analysis and parallelization. For example, if your scene contains different characters that are not constrained to one another, Maya recognizes this and evaluates each character at the same time.

Similarly, if your scene has a single complex character, it may be possible to evaluate sub-sections of the rig simultaneously. As you can imagine, the amount of parallelism depends on how your scene has been constructed. We will get back to this later. For now, let’s focus on understanding key Maya evaluation concepts.

At the heart of Maya’s new evaluation architecture is an Evaluation Manager (EM), responsible for creating a parallel-friendly description of your scene, called the Evaluation Graph (EG). The EM schedules EG nodes across available compute resources.

Prior to evaluating your scene, the EM checks if a valid EG exists. The EG is a simplified version of the Dependency Graph (DG), consisting of DG nodes and connections. A connection in the EG represents a node-level dependency: the destination node needs data from the source node in order to evaluate. A valid EG may not exist for various reasons. For example, you may have loaded a new scene and no EG may have been built yet, or you may have changed your scene, invalidating a prior EG.

Maya uses the DG’s dirty propagation mechanism to build the EG. Dirty propagation is the process of walking through the DG, from animation curves to renderable objects, and marking the attributes on DG nodes as needing to be re-evaluated (i.e., dirty). Unlike previous versions of Maya that propagated dirty on every frame, Maya now disables dirty propagation once the EG is built, and reuses the existing EG until it becomes invalid.

With dirty propagation disabled, computing your scene at a given frame involves walking the EG, scheduling, and evaluating EG nodes. Because the EG encodes node-level dependencies, when evaluating a given EG node, you know that all inputs coming from the nodes it depends on have already been calculated. This further enables pipelining of some operations. Specifically, when we find EG nodes without dependents, we can initiate additional processing (e.g., rendering) since we are guaranteed that no downstream nodes will require computed results.
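The walk described above can be illustrated with a small, Maya-independent sketch. It is not how the EM is implemented; it only shows the general idea of releasing a node for evaluation once every node it depends on has finished. The node names and the evaluation_waves helper are hypothetical.

```python
from collections import defaultdict

def evaluation_waves(nodes, connections):
    """Group nodes into waves; nodes in the same wave do not depend on each other."""
    pending = {node: 0 for node in nodes}   # count of unevaluated upstream nodes
    downstream = defaultdict(list)
    for source, destination in connections:
        pending[destination] += 1
        downstream[source].append(destination)

    # Nodes with no upstream dependencies (e.g., animation curves) go first.
    wave = sorted(node for node, count in pending.items() if count == 0)
    waves = []
    while wave:
        waves.append(wave)
        next_wave = []
        for node in wave:
            for destination in downstream[node]:
                pending[destination] -= 1
                if pending[destination] == 0:   # all inputs are now computed
                    next_wave.append(destination)
        wave = sorted(next_wave)
    return waves

# Hypothetical scene: two characters that are not constrained to one another.
nodes = ["curveA", "ctrlA", "meshA", "curveB", "ctrlB", "meshB"]
connections = [("curveA", "ctrlA"), ("ctrlA", "meshA"),
               ("curveB", "ctrlB"), ("ctrlB", "meshB")]
waves = evaluation_waves(nodes, connections)
print(waves)
# [['curveA', 'curveB'], ['ctrlA', 'ctrlB'], ['meshA', 'meshB']]
```

Each wave contains one node from each character, which is why unconstrained characters can be evaluated side by side.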

Tip. If your scene contains expression nodes that use the getAttr command, the DG will be missing explicit dependencies, which will result in an incomplete EG. In addition to impacting correctness, expression nodes will also reduce the amount of parallelism in your scenes (see Scheduling Types for details).

Depending on how you have built your scene, the EG may contain circular node-level dependencies. If this is the case, the EM creates node clusters. At scene evaluation time, nodes in clusters are evaluated serially before continuing with other parallel parts of the EG. Multiple clusters may be evaluated at the same time. As with previous versions of Maya, you should avoid building scenes with attribute-level cycles as this is unsupported, and leads to unspecified behavior.
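To make the notion of a cluster concrete, the sketch below groups nodes that participate in a circular dependency using Tarjan's strongly connected components algorithm. This is a generic graph technique chosen for illustration; the guide does not state which algorithm the EM uses. The rig node names are hypothetical.

```python
def find_clusters(nodes, connections):
    """Return sorted groups of nodes that form node-level dependency cycles."""
    graph = {node: [] for node in nodes}
    for source, destination in connections:
        graph[source].append(destination)

    index = {}      # discovery order of each visited node
    lowlink = {}    # smallest discovery index reachable from the node
    stack, on_stack = [], set()
    clusters = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                lowlink[node] = min(lowlink[node], lowlink[nxt])
            elif nxt in on_stack:
                lowlink[node] = min(lowlink[node], index[nxt])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:   # a lone node is not treated as a cluster here
                clusters.append(sorted(component))

    for node in nodes:
        if node not in index:
            visit(node)
    return clusters

# Hypothetical rig: hip, ikHandle and foot form a cycle; head only reads from hip.
nodes = ["hip", "ikHandle", "foot", "head"]
connections = [("hip", "ikHandle"), ("ikHandle", "foot"),
               ("foot", "hip"), ("hip", "head")]
clusters = find_clusters(nodes, connections)
print(clusters)  # [['foot', 'hip', 'ikHandle']]
```

In this example the three cyclic nodes would be evaluated serially as one unit, while head remains free to be scheduled independently once hip is done.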

By default, the EM schedules node evaluation on available CPU resources. However, the EM also provides the ability to override evaluation for sub-sections of the EG, targeting computation to specific runtimes and/or hardware. One example of this is the GPU override feature included in Maya, which uses your graphics card’s graphics processing unit (GPU) to accelerate deformations.

When manipulating your rig, you may notice that performance improves once you have added at least 2 different keys on a controller. By default, only animated nodes are included in the EG. This limit helps keep the EG compact, making it fast to build, schedule, and evaluate. Hence, if you are manipulating a controller that has not yet been keyed, Maya relies on legacy DG evaluation. When 2 or more different keys are added, the EG rebuilds to include the newly-keyed nodes, permitting Parallel evaluation via the EM.

Tip. You can use the controller command to identify objects that will be used as controllers (and therefore animation sources) in your scene. If the Include controllers in evaluation graph option is set (see Windows > Settings/Preferences > Preferences, then Settings > Animation), the objects marked as controllers will automatically be added to the evaluation graph even if they are not animated yet. This will prevent the EG from being rebuilt when these objects are animated and will allow Parallel evaluation for manipulation even if they have not been keyed yet.

Supported Evaluation Modes

Maya starts in Parallel evaluation mode by default. This new evaluation mode replaces the legacy DG-based evaluation. Maya supports 3 evaluation modes:

Mode      What does it do?
DG        Uses the legacy Dependency Graph-based evaluation of your scene. This was the default evaluation mode prior to Maya 2016.
Serial    Evaluation Manager Serial mode. Uses the EG but limits scheduling to a single core. Serial mode is a troubleshooting mode to pinpoint the source of evaluation errors.
Parallel  Evaluation Manager Parallel mode. Uses the EG and schedules evaluation across all available cores. This mode is the new Maya 2016 default.

When using either Serial or Parallel EM modes, you can also activate GPU Override to accelerate deformations on your GPU. You must be in Viewport 2.0 to use this feature (see Custom Evaluators).

To switch between different modes, go to the Preferences window (Windows > Settings/Preferences > Preferences > Animation). You can also use the evaluationManager MEL/Python command; see documentation for supported options.
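As a rough illustration of switching modes from a script, the helper below wraps the evaluationManager command. Only the "off" value appears elsewhere in this guide; the "serial" and "parallel" strings and the query form are assumptions to verify against the command documentation. The set_evaluation_mode name and its cmds parameter are hypothetical conveniences so the function can also be exercised outside Maya.

```python
VALID_MODES = ("off", "serial", "parallel")   # "off" corresponds to DG evaluation

def set_evaluation_mode(mode, cmds=None):
    """Switch the evaluation mode and return whatever Maya reports afterwards."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s" % (VALID_MODES,))
    if cmds is None:
        import maya.cmds as cmds   # only importable inside a Maya session
    cmds.evaluationManager(mode=mode)
    # Assumption: querying the mode flag returns the currently active mode.
    return cmds.evaluationManager(query=True, mode=True)
```

Inside Maya, calling set_evaluation_mode("serial") before a troubleshooting session and set_evaluation_mode("parallel") afterwards would be the typical usage.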

To see the evaluation options that apply to your scene, turn on the Heads Up Display Evaluation options (Display > Heads Up Display > Evaluation).

First Make it Right Then Make it Fast

Before focusing on understanding how to make your scene fast in Maya using Parallel evaluation, it is important to ensure that evaluation in DG and EM modes generates the same results.

If you observe evaluation errors when you start Maya (that is, what you see in the viewport differs from previous versions of Maya), determine the source of these errors. Errors may be due to an incorrect EG, threading-related problems, or other issues. In the sections that follow we will review 2 important concepts related to errors: Evaluation Graph Correctness and Thread Safety.

Evaluation Graph Correctness

In the event that you see evaluation errors, first try to test your scene in Serial evaluation mode (see Supported Evaluation Modes). Serial evaluation mode uses the EM to build an EG of your scene, but limits evaluation to a single core to eliminate threading as the possible source of differences. Note that since Serial evaluation mode is provided for debugging, it has not been optimized for speed and scenes may run slower in Serial than in DG evaluation mode. This is expected.

If transitioning to Serial evaluation eliminates evaluation errors, this indicates that the errors in your scene are likely due to a threading-related problem. However, if errors persist even after transitioning to Serial evaluation, this indicates that the EM is building an incorrect EG for your scene. There are a few possible reasons for this:

Custom Plug-ins. If your scene uses custom plug-ins that rely on the MPxNode::setDependentsDirty function to manage attribute dirtying, this may be the source of problems. Plug-in authors sometimes use MPxNode::setDependentsDirty to avoid expensive calculations every time MPxNode::compute is called. With this approach, results from previous evaluations are typically cached and MPxNode::setDependentsDirty is used to trigger re-computation.

Since the EM relies on dirty propagation to create the EG, any custom plug-in logic that alters dependencies may interfere with the construction of a correct EG. Furthermore, since the EM evaluation does not propagate dirty messages, any custom caching or computation in MPxNode::setDependentsDirty is not called while the EM is evaluating.

If you suspect that your evaluation errors are related to custom plug-ins, temporarily remove the associated nodes from your scene and validate that both DG and Serial evaluation modes generate the same result. Once you have made sure this is the case, you will need to revisit the plug-in logic. The API Extensions section covers Maya 2016 SDK changes that will help you adapt plug-ins to Parallel evaluation.

Another debugging option is to use scheduling type overrides to force your custom nodes to be scheduled more conservatively, that is, in a way that is safer but allows less parallelism. This approach can enable the use of Parallel evaluation even if some of the nodes are not thread-safe. Scheduling types are described in more detail in the Thread Safety section.

Errors in Autodesk Nodes. Although we have done our best to ensure that all out-of-the-box Autodesk Maya nodes correctly express dependencies, sometimes a scene uses nodes in an unexpected manner. If this is the case, we ask that you make us aware of scenes where you encounter problems. We will do our best to address problems as quickly as possible.

Thread Safety

Prior to Maya 2016, evaluation was single-threaded and developers did not need to worry about making their code thread-safe. At each frame, they were guaranteed that evaluation would proceed serially and computation would finish for one node prior to moving on to another. This approach allowed for the caching of intermediate results in global memory and using external libraries without considering their ability to work correctly when called simultaneously from multiple threads.

These guarantees no longer apply for Parallel Maya. Developers now working in Maya must update plug-ins to ensure correct behavior during multi-core evaluation.

Two things to consider when updating plug-ins:

Here’s a concrete example for a simple node network consisting of 4 nodes:

In this graph, evaluation first calculates outputs for Node1 in serial (i.e., Node1.A, Node1.B, Node1.C), followed by parallel evaluation of Nodes 2, 3, and 4 (that is, Read Node1.A to use in Node2, Read Node1.B to use in Node3, etc.).
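The ordering described for this 4-node network can be sketched in plain Python with a thread pool. The node functions and their arithmetic are hypothetical stand-ins for compute methods. Pure-Python threads will not show a real speed-up here; the sketch only illustrates the ordering guarantee, where Node1 finishes before Nodes 2, 3, and 4 read its outputs concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def node1(time):
    # Node1 computes all three of its outputs in one serial evaluation.
    return {"A": time + 1.0, "B": time * 2.0, "C": time - 1.0}

def node2(a):
    return a * 10.0   # reads Node1.A

def node3(b):
    return b * 10.0   # reads Node1.B

def node4(c):
    return c * 10.0   # reads Node1.C

def evaluate_frame(time):
    outputs = node1(time)   # must complete before any downstream node starts
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "Node2": pool.submit(node2, outputs["A"]),
            "Node3": pool.submit(node3, outputs["B"]),
            "Node4": pool.submit(node4, outputs["C"]),
        }
        # Collect results once every downstream node has finished.
        return {name: future.result() for name, future in futures.items()}

result = evaluate_frame(2.0)
print(result)  # {'Node2': 30.0, 'Node3': 40.0, 'Node4': 10.0}
```

Because the three downstream functions only read values that Node1 has already produced and share no other state, running them at the same time is safe. A node that cached results in a global variable would not offer that guarantee.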

Since we know that making legacy code thread-safe requires time, we have added new scheduling types to instruct the EM how to schedule nodes. Scheduling types provide a straightforward migration path, so you do not need to pass up parallelizing opportunities for some parts of your scenes just because a few nodes still need work.

There are 4 scheduling types:

Scheduling Type   What are you telling the scheduler?
Parallel          Asserts that the node and all third-party libraries used by the node are thread-safe. The scheduler may evaluate any instances of this node at the same time as instances of other nodes without restriction.
Serial            Asserts it is safe to run this node with instances of other nodes. However, all nodes with this scheduling type should be executed sequentially within the same evaluation chain.
Globally Serial   Asserts it is safe to run this node with instances of other nodes but only a single instance of this node should be run at a time. Use this type if the node relies on static state, which could lead to unpredictable results if multiple node instances are simultaneously evaluated. The same restriction may apply if third-party libraries store state.
Untrusted         Asserts this node is not thread-safe and that no other nodes should be evaluated while an instance of this node is evaluated. Untrusted nodes are deferred as much as possible (i.e. until there is nothing left to evaluate that does not depend on them), which can introduce costly synchronization.

By default, nodes are scheduled as Serial, which provides a middle ground between performance and stability/safety. In some cases, this is too permissive and nodes must be downgraded to Globally Serial or Untrusted. In other cases, some nodes can be promoted to Parallel. As you can imagine, the more parallelism supported by nodes in your graph, the higher level of concurrency you are likely to obtain.

When testing your plug-ins with parallel Maya, a simple strategy is to schedule nodes with the most restrictive scheduling type (i.e., Untrusted), and then validate that the evaluation produces correct results. Raise individual nodes to the next scheduling level, and repeat the experiment.

You can also alter scheduling behavior dynamically at runtime. For example, Maya currently defaults to scheduling expression nodes as Untrusted, since it is unclear ahead of time what actions an expression will perform. However, if Maya detects an expression node that is limited to arithmetic and has outputs that are purely a function of inputs, it can safely promote scheduling of that expression to Globally Serial. Expressions cannot be scheduled as Parallel because the Maya command interpreter is not thread-safe: it must store state in order to provide useful logging and error reporting.

There are two ways to alter the scheduling level of your nodes:

MEL/Python Commands. Use the evaluationManager command to change the scheduling type of nodes at runtime. Below, we illustrate how you can change the scheduling of scene transform nodes:

Scheduling Type   Command
Parallel          evaluationManager -nodeTypeParallel on "transform";
Serial            evaluationManager -nodeTypeSerialize on "transform";
Globally Serial   evaluationManager -nodeTypeGloballySerialize on "transform";
Untrusted         evaluationManager -nodeTypeUntrusted on "transform";

C++/Python API methods. You can also set the scheduling type of your nodes at compile time by overriding the MPxNode::schedulingType function. The override should return one of the enumerated values specified by MPxNode::SchedulingType. See the Maya 2016 MPxNode class reference for more details.
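The testing strategy described earlier (start at Untrusted, validate, then raise the node one level at a time) can be laid out as a short plan. The sketch below only builds the MEL command strings from the flags listed in the table above; it does not run them. The node type name myCustomNode and the scheduling_test_plan helper are hypothetical.

```python
# Flags from the command table, ordered from most to least restrictive.
SCHEDULING_FLAGS = [
    ("Untrusted", "-nodeTypeUntrusted"),
    ("Globally Serial", "-nodeTypeGloballySerialize"),
    ("Serial", "-nodeTypeSerialize"),
    ("Parallel", "-nodeTypeParallel"),
]

def scheduling_test_plan(node_type):
    """Return (label, MEL command) pairs to try in order, validating after each step."""
    return [(label, 'evaluationManager %s on "%s";' % (flag, node_type))
            for label, flag in SCHEDULING_FLAGS]

plan = scheduling_test_plan("myCustomNode")
for label, command in plan:
    print("%-16s %s" % (label, command))
```

After each command, play back the scene and compare against DG results; stop at the last level that still evaluates correctly.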

Safe Mode

On rare occasions you may notice that during manipulation or playback, Maya switches from Parallel to Serial evaluation. This is due to Safe Mode, which is an attempt to trap errors that lead to instabilities such as crashes. If Maya detects that multiple threads are attempting to access a single node instance at the same time, the evaluation is forced to Serial execution to prevent problems.

While Safe Mode catches many problems, it cannot catch them all. Therefore, we have also developed a special Analysis Mode that performs a more thorough and costly check of your scene. Analysis Mode is designed for riggers and TDs who wish to troubleshoot evaluation problems when creating new rigs. Avoid using Analysis Mode during animation since it will slow down your scene. See Analysis Mode for details.

Tip. If Safe Mode forces your scene into Serial mode, the EM may not produce the expected results when manipulating. In such cases you can either disable the EM:

evaluationManager -mode "off";

or disable EM-accelerated manipulation:

evaluationManager -man 0;

Custom Evaluators

Once the EG has been created, Maya can target the evaluation of node sub-graphs to custom evaluators. In this section, we will review how we have used custom evaluators to accelerate deformations and catch evaluation errors on specific scenes. Currently you cannot author new custom evaluators, but in the future, we may extend OpenMaya to support such extensions.

Tip. Use the evaluator command to query the available/active evaluators or modify currently active evaluators.

import maya.cmds as cmds

# Returns a list of all currently available evaluators.
cmds.evaluator( query=True )
# Result: [u'dynamics',
#          u'ikSystem',
#          u'disabling',
#          u'deformer',
#          u'transformFlattening',
#          u'reference',
#          u'pruneRoots'] #

# Returns a list of all currently enabled evaluators.
cmds.evaluator( query=True, enable=True )
# Result: [u'dynamics',
#          u'ikSystem',
#          u'deformer',
#          u'transformFlattening',
#          u'reference',
#          u'pruneRoots'] #
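Comparing the two query results tells you which evaluators are registered but currently switched off. The helper below is a small sketch that works on plain lists, so it runs outside Maya; the disabled_evaluators name is hypothetical, and the lists are copied from the results shown above.

```python
def disabled_evaluators(available, enabled):
    """Return evaluators that are available but not enabled, in listing order."""
    enabled_set = set(enabled)
    return [name for name in available if name not in enabled_set]

# Lists copied from the two query results shown above.
available = ["dynamics", "ikSystem", "disabling", "deformer",
             "transformFlattening", "reference", "pruneRoots"]
enabled = ["dynamics", "ikSystem", "deformer",
           "transformFlattening", "reference", "pruneRoots"]

print(disabled_evaluators(available, enabled))  # ['disabling']
```

Inside Maya, the two lists would come directly from cmds.evaluator(query=True) and cmds.evaluator(query=True, enable=True).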

GPU Override

Maya contains a custom deformer evaluator that targets mesh deformations on the GPU using OpenCL to accelerate deformations in Viewport 2.0. The massively parallel nature of modern GPUs makes them ideal for tackling problems such as deformations, which must perform the same operations on streams of data such as mesh vertices and normals. We have included GPU implementations for 6 of the most commonly-used deformers in animated scenes: skinCluster, blendShape, cluster, tweak, groupParts, and softMod.

Unlike Maya’s previous deformer stack that performed deformations on the CPU and subsequently sent deformed geometry to the graphics card for rendering, the GPU override sends undeformed geometry to the graphics card, performs deformations in OpenCL and hands off the data to Viewport 2.0 for rendering without read-back overhead. We have observed substantial speed improvements from this approach in scenes with dense geometry.

Even if your scene uses only supported deformers, GPU override may not be enabled due to unsupported node features. For example, with the exception of softMod, deformers must currently apply to all vertices; there is no support for incomplete group components. Additional deformer-specific limitations are listed below:

Deformer Limitation(s)
skinCluster The following attribute values will be ignored:
- bindMethod
- bindPose
- bindVolume
- dropOff
- heatmapFalloff
- influenceColor
- lockWeights
- maintainMaxInfluences
- maxInfluences
- nurbsSamples
- paintTrans
- smoothness
- weightDistribution
blendShape The following attribute values will be ignored:
- baseOrigin
- icon
- normalizationId
- origin
- parallelBlender
- supportNegativeWeights
- targetOrigin
- topologyCheck
cluster n/a
tweak Only relative mode is supported. relativeTweak must be set to 1.
groupParts n/a
softMod Only volume falloff is supported when distance cache is disabled
Falloff must occur on all axes
Partial resolution must be disabled

A few other reasons that can prevent GPU override from accelerating your scene:

You can also add support for new custom/proprietary deformers using new API extensions (refer to Custom GPU Deformers for details).

If you have enabled GPU Override and the HUD reports Enabled (0 k), this indicates that no deformations are happening on the GPU. There could be a number of reasons for this, such as those mentioned above.

To troubleshoot factors limiting use of GPU override for your particular scene, use the deformerEvaluator command. Supported options include:

Command                      What does it do?
deformerEvaluator;           Prints the chain or a reason it is not supported for each selected node.
deformerEvaluator -chains;   Prints all active deformation chains.
deformerEvaluator -meshes;   Prints a chain for each mesh or a reason if it is not supported.

Dynamics Evaluator

Parallel evaluation in Maya 2016 only had limited support for animated dynamics. Although scenes with Bullet rigid bodies and Bifrost fluids evaluated correctly, legacy dynamics nodes (particles, fluids) and Nucleus nodes (nCloth, nHair, nParticles) disabled the Evaluation Manager, and reverted to DG-based evaluation on playback or manipulation.

Legacy dynamics disabled the EM because, in order to generate repeatable and stable results, they relied on evaluation rules that violated DG evaluation best practices. While these deviations from the safe path were accepted by the DG, once legacy dynamics were evaluated in Parallel, they created problems. The dynamics evaluator was originally created to detect these deviant node types and disable the EM, resorting to DG-based evaluation.

Since Maya 2016, the dynamics evaluator has been improved so that it can handle more complex dynamics setups. Now, it not only detects unsupported nodes and disables Parallel evaluation when it finds them, it also manages the tricky computation rules necessary for proper evaluation. This is one of the ways custom evaluators can be used to change Maya’s default evaluation behavior.

Note. Legacy dynamics nodes (particles, fluids) are still not supported. If the dynamics evaluator finds unsupported node types in the EG, it still disables parallel evaluation and resorts to DG-based evaluation.

By default, the following node types are blacklisted. If the dynamics evaluator finds them, it will disable the EM:

and any type derived from these.

In order for the dynamics evaluator to manage evaluation of dynamics, use the following commands:

evaluator -name dynamics -c "disablingNodes=unsupported";
evaluator -name dynamics -c "handledNodes=dynamics";
evaluator -name dynamics -c "action=evaluate";

The disablingNodes flag specifies the set of nodes that will force the dynamics evaluator to disable the EM, in this case, the nodes it does not support.

The handledNodes flag specifies the set of nodes that are going to be captured by the dynamics evaluator and scheduled in clusters that it will manage, in this case, any node type associated with dynamics.

The action flag specifies how the dynamics evaluator will handle its nodes; in this case, it will perform the required evaluation tasks.

In this configuration, the node types that cause EM to be disabled are:

and any type derived from these.

In order to return to the default configuration, use the following commands:

evaluator -name dynamics -c "disablingNodes=legacy2016";
evaluator -name dynamics -c "handledNodes=none";
evaluator -name dynamics -c "action=none";

Tip. To get a list of nodes that will make the dynamics evaluator disable the EM in its present configuration, use the following command:

evaluator -name "dynamics" -valueName "disabledNodes" -query;

You can configure the dynamics evaluator to ignore unsupported nodes. If you want to try Parallel evaluation on a scene where it is disabled because of the presence of unsupported node types, use the following commands:

evaluator -name dynamics -c "disablingNodes=none";
evaluator -name dynamics -c "handledNodes=dynamics";
evaluator -name dynamics -c "action=evaluate";

Note: Using the dynamics evaluator on unsupported nodes may cause evaluation problems and/or application crashes; this is unsupported behavior. Proceed with caution.

Tip. If you want the dynamics evaluator to skip evaluation of all dynamics nodes in the scene, use the following commands:

evaluator -name dynamics -c "disablingNodes=unsupported";
evaluator -name dynamics -c "handledNodes=dynamics";
evaluator -name dynamics -c "action=freeze";

This can be useful to quickly disable dynamics when the simulation has a big impact on animation performance.
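The four dynamics evaluator configurations shown in this section differ only in the values passed for disablingNodes, handledNodes, and action. The sketch below collects them in one table and builds the corresponding MEL commands as strings. The preset names (default, evaluate, ignore_unsupported, freeze) and the dynamics_commands helper are hypothetical; the flag values are the ones used in the commands above.

```python
# preset name -> (disablingNodes, handledNodes, action); preset names are illustrative.
DYNAMICS_PRESETS = {
    "default":            ("legacy2016",  "none",     "none"),
    "evaluate":           ("unsupported", "dynamics", "evaluate"),
    "ignore_unsupported": ("none",        "dynamics", "evaluate"),   # unsupported behavior
    "freeze":             ("unsupported", "dynamics", "freeze"),
}

def dynamics_commands(preset):
    """Return the three MEL commands that configure the dynamics evaluator."""
    disabling, handled, action = DYNAMICS_PRESETS[preset]
    return [
        'evaluator -name dynamics -c "disablingNodes=%s";' % disabling,
        'evaluator -name dynamics -c "handledNodes=%s";' % handled,
        'evaluator -name dynamics -c "action=%s";' % action,
    ]

for command in dynamics_commands("freeze"):
    print(command)
```

Keeping the presets in one place makes it easier to switch a scene back to the default configuration after an experiment.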

Dynamics simulation with the Evaluation Manager (and therefore the dynamics evaluator) can have slightly different results from DG-based evaluation. Dynamics simulation results often depend on evaluation order, and with DG-based evaluation, the order depends on the sequence in which the data is pulled. For instance, the order in which the renderer draws items in the scene may differ from the order in which a script gets simulation results.

With EM-based evaluation, the EM determines the evaluation order and, although it might differ from DG-based evaluation, it is consistent regardless of whether evaluation occurs in the context of the scene being rendered in a Maya viewport or is baked with Maya Batch.

While this is particularly relevant for dynamics simulation, it also applies to any nodes that have order-dependent evaluation. Order-dependent evaluation is best avoided because it often leads to unreliable results, but where it exists, the EM stabilizes the order regardless of the context from which evaluation is triggered. This order can sometimes generate results that differ slightly from legacy DG-based evaluation.

Reference Evaluator

When a reference is unloaded it leaves several nodes in the scene representing reference edits to preserve. Though these nodes may inherit animation from upstream nodes, they do not contribute to what’s rendered and can be safely ignored during evaluation. The reference evaluator ensures all such nodes are not evaluated.

Other Evaluators

In addition to the GPU override and dynamics evaluators, additional evaluators exist for specialized tasks:

Evaluator             What does it do?
ikSystem              Automatically disables the EM when a multi-chain solver is present in the EG. For regular IK chains it will perform any lazy update prior to parallel execution.
disabling             Automatically disables the EM if user-specified nodes are present in the EG. This evaluator is used for troubleshooting purposes. It allows Maya to keep working stably until issues with problem nodes can be addressed.
transformFlattening   Consolidates deep transform hierarchies containing animated parents and static children, leading to faster evaluation. Consolidation takes a snapshot of the relative parent/child transformations, allowing concurrent evaluation of downstream nodes.
pruneRoots            We found that scenes with several thousand paramCurves become bogged down because of scheduling overhead from resulting EG nodes and lose potential gains from increased parallelism. To handle this situation, special clusters are created to group paramCurves into a small number of evaluation tasks, thus reducing overhead.

Custom evaluator names are subject to change as we introduce new evaluators and expand these functionalities.

Evaluator Conflicts

Sometimes, multiple evaluators will want to “claim responsibility” for the same node(s). This can result in conflict, negatively impacting performance. To avoid these conflicts, each evaluator is associated with a priority upon registration and nodes are assigned to the evaluator with the highest priority. Internal evaluators have been ordered to prioritize correctness and stability over speed.
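The priority rule can be sketched as follows. The priority numbers and the claimed nodes are made up for illustration, since this guide does not list the actual priorities of Maya's internal evaluators. The assign_nodes helper is hypothetical.

```python
def assign_nodes(claims, priorities):
    """Assign each claimed node to the highest-priority evaluator that wants it.

    claims:     evaluator name -> set of node names it wants to handle
    priorities: evaluator name -> int, where a higher value wins
    """
    owner = {}
    by_priority = sorted(claims, key=lambda name: priorities[name], reverse=True)
    for evaluator in by_priority:
        for node in claims[evaluator]:
            owner.setdefault(node, evaluator)   # keep the earlier, higher-priority claim
    return owner

# Hypothetical priorities and claims: both evaluators want nCloth1.
priorities = {"dynamics": 100, "deformer": 50}
claims = {
    "deformer": {"skinCluster1", "nCloth1"},
    "dynamics": {"nCloth1", "nucleus1"},
}
owner = assign_nodes(claims, priorities)
print(sorted(owner.items()))
# [('nCloth1', 'dynamics'), ('nucleus1', 'dynamics'), ('skinCluster1', 'deformer')]
```

Resolving ownership once, at assignment time, means no node is ever handled by two evaluators during evaluation.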

API Extensions

We have added a few API extensions and tools that make the most of the evaluation capabilities to aid your pipeline. This section reviews API extensions for Parallel Evaluation, Custom GPU Deformers, and Profiling Plug-ins.

Parallel Evaluation

If your plug-in plays by the DG rules, you probably will not need many changes to make the plug-in work in Parallel mode. Porting your plug-in so it works in Parallel may be as simple as recompiling it against the latest version of OpenMaya!

If the EM generates different results than DG-based evaluation, make sure that your plug-in:

If your plug-in relies on custom dependency management, you need to use new API extensions to ensure correct results. As described earlier, the EG is built using the legacy dirty-propagation mechanism. Therefore, optimizations used to limit dirty propagation during DG evaluation, such as those found in MPxNode::setDependentsDirty, may introduce errors in the EG. Use MEvaluationManager::graphConstructionActive() to detect if this is occurring.

There are new virtual methods you will want to consider implementing:

Other recommended best practices include:

Custom GPU Deformers

To make GPU Override work on scenes containing custom deformers, Maya 2016 provides new API classes that allow the creation of fast OpenCL deformer back-ends.

Though you will still need to have a CPU implementation for the times when it is not possible to target deformations on the GPU (see GPU Override), you can augment this with an alternate deformer implementation inheriting from MPxGPUDeformer. This applies to your own nodes as well as to standard Maya nodes.

The GPU implementation will need to:

When you have done this, do not forget to load your plug-in at startup. Two working devkit examples (offsetNode and identityNode) have been provided to get you started.

Tip. To get a sense for the maximum speed increase you can expect by providing a GPU backend for a specific deformer, tell Maya to treat specific nodes as passthrough. Here’s an example applied to polySoftEdge:

   GPUBuiltInDeformerControl
       -name polySoftEdge
       -inputAttribute inputPolymesh 
       -outputAttribute output
       -passthrough;

Although results will be incorrect, this test can confirm if it is worth investing time implementing an OpenCL version of your node.

Profiling Plug-ins

To visualize how long custom plug-ins are taking in the new profiling tools (see Profiling Your Scene), you will need to instrument your code. Maya provides C++, Python, and MEL interfaces for you to do this. Refer to the Profiling using MEL or Python or the API technical docs for more details.

Profiling Your Scene

In the past, it could be challenging to understand where Maya was spending time. To take the guesswork out of performance diagnosis, Maya includes a new integrated profiler that lets you see exactly how long different tasks are taking.

You can open the Profiler by selecting:

Once the Profiler window is visible:

  1. Load your scene and start playback.
  2. Click Start in the Profiler to record information in the pre-allocated record buffer.
  3. Wait until the record buffer becomes full or click Stop in the Profiler to stop recording. The Profiler shows a graph demonstrating the processing time for your animation.
  4. Try recording the scene in DG, Serial, Parallel, and GPU Override modes.

Tip. By default, the Profiler allocates a 20 MB buffer to store results. The record buffer can be expanded via the UI or using the profiler -b value; command, where value is the desired size in MB. This may be needed for more complex scenes.

The Profiler includes information for all instrumented code, including playback, manipulation, authoring tasks, and UI/Qt events. When profiling your scene, make sure to capture several frames of data to ensure gathered results are representative of scene bottlenecks.

The Profiler supports several views depending on the task you wish to perform. The default Category View, shown below, classifies events by type (e.g., dirty, VP1, VP2, Evaluation, etc.). The Thread and CPU views show how function chains are subdivided amongst available compute resources. Currently the Profiler does not support visualization of GPU-based activity.

Now that you have a general sense of what the Profiler tool does, let’s discuss key phases involved in computing results for your scene and how these are displayed. By understanding why scenes are slow, you can target scene optimizations.

Every time Maya updates a frame, it must compute and draw the elements in your scene. Hence, computation can be split into two main categories:

  1. Evaluation (i.e., doing the math that determines the most up-to-date values for scene elements)
  2. Rendering (i.e., doing the work that draws your scene in the viewport).

When the main bottleneck in your scene is evaluation, we say the scene is evaluation-bound. When the main bottleneck in your scene is rendering, we say the scene is render-bound.

Evaluation-Bound Performance

There are several different problems that may lead to evaluation-bound performance.

Lock Contention. When many threads try to access a shared resource you may experience Lock Contention, due to lock management overhead. One clue that this may be happening is that evaluation takes roughly the same duration regardless of which evaluation mode you use. This occurs since threads cannot proceed until other threads are finished using the shared resource.

Here the Profiler shows many separate identical tasks that start at nearly the same time on different threads, each finishing at different times. This type of profile offers a clue that there might be some shared resource that many threads need to access simultaneously.

Below is another image showing a similar problem.

In this case, since several threads were executing Python code, they all had to wait for the Global Interpreter Lock (GIL) to become available. Bottlenecks and performance losses caused by contention issues may be more noticeable when there is a high concurrency level, such as when your computer has many cores.

If you encounter contention issues, try to fix the code in question. For the above example, changing node scheduling converted the above profile to the following one, providing a nice performance gain. For this reason, Python plug-ins are scheduled as Globally Serial by default. As a result, they will be scheduled one after the other and will not block multiple threads waiting for the GIL to become available.
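The effect of a shared lock can be reproduced outside Maya with a minimal Python sketch. Here `shared_lock` stands in for any resource that only one thread may use at a time (such as the GIL); the function and variable names are illustrative and are not part of Maya. However many threads are started, at most one of them is ever inside the locked region, so the locked work is effectively serial.

```python
import threading


def run_tasks(num_threads, iterations):
    """Run identical tasks that all need the same lock.

    Returns (maximum number of threads inside the locked region at once,
    total number of completed iterations).
    """
    shared_lock = threading.Lock()  # stands in for a shared resource such as the GIL
    stats_lock = threading.Lock()   # protects the counters below
    state = {"inside": 0, "max_inside": 0, "done": 0}

    def task():
        for _ in range(iterations):
            with shared_lock:
                with stats_lock:
                    state["inside"] += 1
                    state["max_inside"] = max(state["max_inside"], state["inside"])
                # Work that needs the shared resource would go here.
                with stats_lock:
                    state["inside"] -= 1
                    state["done"] += 1

    threads = [threading.Thread(target=task) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["max_inside"], state["done"]


max_inside, done = run_tasks(num_threads=4, iterations=50)
print(max_inside, done)
```

Because the locked region never runs concurrently, adding threads adds waiting rather than throughput, which is why scheduling such work one task after another can be the better choice.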

Clusters. As mentioned earlier, if the EG contains node-level circular dependencies, those nodes will be grouped into a cluster which represents a single unit of work to be scheduled serially. Although multiple clusters may be evaluated at the same time, large clusters limit the amount of work that can be performed simultaneously. Clusters can be identified in the Profiler as bars with the opaqueTaskEvaluation label, shown below.

If your scene contains clusters, analyze your rig’s structure to understand why circularities exist. Ideally, you should strive to remove coupling between parts of your rig, so rig sections (e.g., head, body, etc.) can be evaluated independently.
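To reason about where circularities come from, it can help to model the rig as a small dependency graph and look for groups of mutually dependent nodes. The sketch below uses Tarjan's strongly connected components algorithm on a plain dictionary; it illustrates the concept only and is not how the EM builds clusters internally. The node names and the `find_cycle_clusters` helper are hypothetical.

```python
def find_cycle_clusters(graph):
    """Return sorted groups of nodes that depend on each other in a cycle.

    graph maps each node name to the list of nodes it feeds into.
    Only groups with more than one node are reported.
    """
    index = {}
    lowlink = {}
    stack = []
    on_stack = set()
    clusters = []
    counter = [0]

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a strongly connected component.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:
                clusters.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return clusters


# Hypothetical rig: chest and spine feed each other, forming a cycle.
rig = {
    "hip": ["spine"],
    "spine": ["chest"],
    "chest": ["spine", "head"],
    "head": [],
    "arm": [],
}
print(find_cycle_clusters(rig))
```

In this example only the chest and spine would need to be treated as one unit of work; the head and arm remain free to evaluate independently.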

Tip. When troubleshooting scene performance issues, you can temporarily disable costly nodes using the per-node frozen attribute. This removes specific nodes from the EG. Although the result you see will change, it is a simple way to check that you have found the bottleneck for your scene.

Render-Bound Performance

The following is an illustration of a sample result from the Maya Profiler, zoomed to a single frame measured from a large scene with many animated meshes. Because of the number of objects, different materials, and the amount of geometry, this scene is very costly to render.

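The attached profile has four main areas, each described below: Evaluation (area A), GPUOverridePostEval (area B), Vp2BuildRenderLists (area C), and Vp2Draw3dBeautyPass (area D).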

In this scene, a substantial number of meshes are being evaluated with GPU Override and some profiler blocks appear differently from what they would otherwise.

Evaluation. Area A depicts the time spent computing the state of the Maya scene. In this case, the scene is moderately well-parallelized. The blocks in shades of orange and green represent the software evaluation of DG nodes. The blocks in yellow are the tasks that initiate mesh evaluation via GPU Override. Mesh evaluation on the GPU starts with these yellow blocks and continues concurrently with the other work on the CPU.

An example of a parallel bottleneck in the scene evaluation appears in the gap in the center of the evaluation section. The large group of GPU Override blocks on the right depend on a single portion of the scene and must wait until that is complete.

Area A2 (above area A) depicts blue task blocks that show the work that VP2 does in parallel to the scene evaluation. In this scene, most of the mesh work is handled by GPU Override, so this area is mostly empty. When evaluating software meshes, this section shows the preparation of geometry buffers for rendering.

GPUOverridePostEval. Area B is where GPU Override finalizes some of its work. The amount of time spent in this block varies with different GPU and driver combinations. At some point there will be a wait for the GPU to complete its evaluation if it is heavily loaded. This time may appear here or it may show as additional time spent in the Vp2BuildRenderLists section.

Vp2BuildRenderLists. Area C. Once the scene has been evaluated, VP2 builds the list of objects to render. Time in this section is typically proportional to the number of objects in the scene.

Vp2PrepareToUpdate. Area C2, very small in this profile. VP2 maintains an internal copy of the world and uses it to determine what to draw in the viewport. When it is time to render the scene, we must ensure that the objects in the VP2 database have been modified to reflect changes in the Maya scene. For example, objects may have become visible or hidden, their position or their topology may have changed, and so on. This is done by Vp2PrepareToUpdate.

Vp2PrepareToUpdate is slow when there are shape topology, material, or object visibility changes. In this example, Vp2PrepareToUpdate is almost invisible since the scene objects require little extra processing.

Vp2ParallelEvaluationTask is another profiler block that can appear in this area. If time is spent here, then some object evaluation has been deferred from the main evaluation section of the Evaluation Manager (area A) to be evaluated later. Evaluation in this section uses traditional DG evaluation.

Common cases for which Vp2BuildRenderLists or Vp2PrepareToUpdate can be slow during Parallel Evaluation are:

Vp2Draw3dBeautyPass. Area D. Once all data has been prepared, it is time to render the scene. This is where the actual OpenGL or DirectX rendering occurs. This area is broken into subsections depending on viewport effects such as depth peeling, transparency mode, and screen space anti-aliasing.

Vp2Draw3dBeautyPass can be slow if your scene:

Other Considerations. Although the key phases described above apply to all scenes, your scene may have different performance characteristics.

For static scenes with limited animation, or for non-deforming animated objects, consolidation is used to improve performance. Consolidation groups objects that share the same material. This reduces time spent in both Vp2BuildRenderLists and Vp2Draw3dBeautyPass, since there are fewer objects to render.
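The idea behind grouping by material can be pictured with a short Python sketch. This is only a conceptual model with made-up object and material names; it does not reflect how VP2 implements consolidation, and it ignores the eligibility rules (static or non-deforming objects) mentioned above.

```python
from collections import defaultdict


def consolidate_by_material(objects):
    """Group (object_name, material_name) pairs into one batch per material."""
    batches = defaultdict(list)
    for name, material in objects:
        batches[material].append(name)
    return dict(batches)


# Hypothetical scene: five objects that share two materials.
scene_objects = [
    ("wall_01", "brick"),
    ("wall_02", "brick"),
    ("floor", "concrete"),
    ("wall_03", "brick"),
    ("curb", "concrete"),
]
batches = consolidate_by_material(scene_objects)
print(len(scene_objects), "objects ->", len(batches), "batches")
```

In this model, five separate objects collapse into two batches, which is the kind of reduction that shortens list building and drawing.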

Troubleshooting Your Scene

Analysis Mode

The purpose of Analysis Mode is to perform a more rigorous inspection of your scene to catch evaluation errors. Since Analysis Mode introduces overhead, use it only during debugging; animators should not enable Analysis Mode during their day-to-day work. Note that Analysis Mode is not thread-safe, so it is limited to Serial evaluation; you cannot use Analysis Mode in Parallel evaluation.

The key function of Analysis Mode is to:

Tip. To activate Analysis Mode, use the dbtrace -k evalMgrGraphValid; MEL command.

Once active, error detection occurs after each evaluation. Missing dependencies are saved to a file in your machine’s temporary folder (e.g., %TEMP%\_MayaEvaluationGraphValidation.txt on Windows). The temporary directory on your platform can be determined using the internalVar -utd; MEL command.

To disable Analysis Mode, type: dbtrace -k evalMgrGraphValid -off;

Let’s assume that your scene contains the following three nodes. Because of the dependencies, the evaluation manager must compute the state of nodes B and C prior to calculating the state of A.

Now let’s assume Analysis Mode returns the following report:

Detected missing dependencies on frame 56
{
    A.output <-x- B
    A.output <-x- C [cluster]
}
Detected missing dependencies on frame 57
{
    A.output <-x- B
    A.output <-x- C [cluster]
}

The <-x- symbol indicates the direction of the missing dependency. The [cluster] term indicates that the node is inside a cycle cluster, which means that any node in the cycle could be responsible for the attribute access outside of evaluation order.

In the above example, B accesses the output attribute of A, which is incorrect because the EM expects to evaluate B before A. These types of dependencies do not appear in the Evaluation Graph and could cause a crash when running an evaluation in Parallel mode.
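For long reports, it may be convenient to turn the text into structured records. The following Python sketch parses the report format shown above. The regular expressions are inferred from that single sample, so the real file may contain variations this code does not handle; `parse_report` and the dictionary keys are hypothetical names chosen for illustration.

```python
import re

# Patterns inferred from the sample report; the real format may vary.
FRAME_RE = re.compile(r"^Detected missing dependencies on frame (-?\d+(?:\.\d+)?)$")
DEP_RE = re.compile(r"^([^.\s]+)\.(\S+)\s+<-x-\s+(\S+)(\s+\[cluster\])?$")


def parse_report(text):
    """Return one record per missing dependency found in the report text."""
    entries = []
    frame = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        frame_match = FRAME_RE.match(line)
        if frame_match:
            frame = float(frame_match.group(1))
            continue
        dep_match = DEP_RE.match(line)
        if dep_match and frame is not None:
            node, attribute, accessor, cluster = dep_match.groups()
            entries.append({
                "frame": frame,
                "node": node,                 # node whose attribute was read
                "attribute": attribute,
                "accessed_by": accessor,      # node that read it out of order
                "in_cluster": cluster is not None,
            })
    return entries


sample_report = """Detected missing dependencies on frame 56
{
    A.output <-x- B
    A.output <-x- C [cluster]
}
Detected missing dependencies on frame 57
{
    A.output <-x- B
    A.output <-x- C [cluster]
}
"""

for entry in parse_report(sample_report):
    print(entry)
```

Grouping the resulting records by `accessed_by` is one way to see which nodes are responsible for the most out-of-order reads.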

There are multiple reasons that missing dependencies occur, and how you handle them depends on the cause of the problem. If Analysis Mode discovers errors in your scene from bad dependencies due to:

Graph Execution Order

There are two primary methods of displaying the graph execution order.

The simplest is to use the ‘compute’ trace object to acquire a recording of the computation order. This can only be used in Serial mode, as explained earlier. The goal of compute trace is to compare DG and EM evaluation results and discover any evaluation differences related to a different ordering or missing execution between these two modes.

Keep in mind that there will be many differences between runs, since the EM executes the graph from the roots forward, whereas the DG pulls values from the leaves. For example, in the simple graph shown earlier, the EM guarantees that B and C will be evaluated before A, but provides no information about the relative ordering of B and C. In the DG, however, A pulls on the inputs from B and C in a consistent order dictated by the implementation of node A. The EM could show either "B, C, A" or "C, B, A" as the evaluation order, and although both might be valid, the user must decide whether they are equivalent. This ordering information can be even more useful when debugging issues in cycle computation, since a pull evaluation occurs in both modes, which makes the ordering more consistent.
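The set of orderings the EM is allowed to produce can be listed with a small Python sketch that enumerates every order consistent with the dependencies. It models the three-node example as a plain dictionary and is not tied to any Maya API; `all_evaluation_orders` is a hypothetical helper, and the brute-force enumeration is only practical for very small graphs.

```python
def all_evaluation_orders(upstream):
    """List every order in which each node follows all of its upstream nodes.

    upstream maps a node name to the set of nodes that must be evaluated first.
    """
    nodes = sorted(upstream)
    orders = []

    def extend(order, done):
        if len(order) == len(nodes):
            orders.append(tuple(order))
            return
        for node in nodes:
            # A node is ready once all of its upstream nodes are done.
            if node not in done and upstream[node] <= done:
                extend(order + [node], done | {node})

    extend([], frozenset())
    return orders


# The simple graph from the text: A depends on both B and C.
graph = {"A": {"B", "C"}, "B": set(), "C": set()}
orders = all_evaluation_orders(graph)
print(orders)
```

When comparing a compute trace against this kind of list, an observed order that does not appear in it points at a genuine ordering problem rather than a harmless reordering of independent nodes.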

The EM Shelf

BonusTools includes a shelf aimed specifically at working with the EM. It contains features to query and analyze your scene and to toggle various modes on and off. See the accompanying shelf documentation for a complete list of shelf features.

Known Limitations

This section lists known limitations for the new evaluation system.

Revisions

2016 Extension 2

2016