Other types of parallelism
 
 
 

Vectorization

Vector SSE instructions allow multiple identical operations to be performed in parallel on adjacent data values. On processors that support SSE2, four single-precision floating-point operations or two double-precision operations can be executed at the same time. Maya requires system support for SSE2, so plug-in writers may assume that SSE2 is always available.

SSE2 code can be written directly in assembler, written using compiler intrinsics, or written in C/C++ and left to the compiler to translate into vector instructions. When the compiler generates the vector SSE instructions, the process is known as autovectorization. The Intel and gcc compilers support autovectorization, but VC++ currently does not. It is much easier to write high-level code and have the compiler generate the vector instructions. However, autovectorizers can be finicky, and small changes to the source can cause vectorization to disappear. Code written in C/C++ that relies on the autovectorizer should therefore be clearly flagged, so that developers do not modify it and unwittingly disable the vectorization.
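The difference between the intrinsics route and the autovectorization route can be sketched as follows. This is a minimal illustration and not code taken from any Maya plug-in; the function names scaleAndOffsetScalar and scaleAndOffsetSSE are hypothetical. The first function is a plain loop that an autovectorizing compiler may or may not convert to SSE instructions. The second performs the same computation explicitly, four floats at a time, using intrinsics from the standard emmintrin.h header.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE float intrinsics)

// Plain C++ loop. An autovectorizing compiler may turn this into SSE
// instructions, but that depends on the compiler and its flags.
// NOTE: flag loops like this one so later edits do not silently
// disable vectorization.
void scaleAndOffsetScalar(const float* in, float* out, std::size_t count,
                          float scale, float offset)
{
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = in[i] * scale + offset;
    }
}

// Same computation written with intrinsics: four floats per iteration.
void scaleAndOffsetSSE(const float* in, float* out, std::size_t count,
                       float scale, float offset)
{
    assert(in != nullptr && out != nullptr);
    const __m128 vScale  = _mm_set1_ps(scale);   // broadcast scale to 4 lanes
    const __m128 vOffset = _mm_set1_ps(offset);  // broadcast offset to 4 lanes

    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        __m128 v = _mm_loadu_ps(in + i);         // unaligned load of 4 floats
        v = _mm_add_ps(_mm_mul_ps(v, vScale), vOffset);
        _mm_storeu_ps(out + i, v);               // unaligned store of 4 floats
    }
    // Scalar tail for any elements left over when count is not a multiple of 4.
    for (; i < count; ++i) {
        out[i] = in[i] * scale + offset;
    }
}
```

The unaligned load and store variants are used here so the sketch works with any pointer; aligned variants can be faster but require 16-byte aligned data.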

The supplied plug-in sseDeformer shows a simple example of code that the Intel compiler can autovectorize into SSE2 instructions. The example shows an approximately 3x speedup when run on a large polygonal mesh. Note that there can be significant overhead in getting the data into the correct format, which can sometimes negate any performance benefit. Also note that traditional threading of this code is unlikely to be beneficial, because the threading overhead might outweigh the savings. Vectorization can be a good alternative in such cases. In the ideal case, both vectorization and threading would be applied to obtain the maximum possible speedup.
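The data-format overhead mentioned above typically comes from copying points into a layout that vectorizes well and copying the results back. The sketch below illustrates the idea under stated assumptions: Point3 is a hypothetical stand-in for a mesh point (it is not a Maya type), and PointsSoA, toSoA, fromSoA, and translateY are illustrative names. It is not the approach used by sseDeformer, only an example of the kind of conversion involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a mesh point; not a Maya API type.
struct Point3 { float x, y, z; };

// "Structure of arrays" layout: each coordinate is stored contiguously,
// which suits vector instructions that operate on adjacent values.
struct PointsSoA { std::vector<float> x, y, z; };

// Conversion cost: one full pass over the data before any work is done.
PointsSoA toSoA(const std::vector<Point3>& pts)
{
    PointsSoA soa;
    soa.x.reserve(pts.size());
    soa.y.reserve(pts.size());
    soa.z.reserve(pts.size());
    for (const Point3& p : pts) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
    }
    return soa;
}

// A second full pass to copy the results back into the original layout.
void fromSoA(const PointsSoA& soa, std::vector<Point3>& pts)
{
    assert(soa.x.size() == soa.y.size() && soa.y.size() == soa.z.size());
    pts.resize(soa.x.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        pts[i] = Point3{soa.x[i], soa.y[i], soa.z[i]};
    }
}

// The actual work: a simple loop over contiguous floats that is a good
// candidate for autovectorization.
void translateY(PointsSoA& soa, float dy)
{
    float* y = soa.y.data();
    const std::size_t n = soa.y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += dy;
    }
}
```

If the vectorized loop does very little work per point, the two conversion passes can cost more than the loop saves, which is the situation the note above warns about.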

Autoparallelism

Some compilers offer flags that cause them to attempt to parallelize code automatically. These are rarely useful: once code is complex enough to be significantly time-consuming, it becomes too difficult for the compiler to determine by static analysis whether it is thread-safe.
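As a rough illustration of why static analysis struggles, compare the two loops below. The function names squareAll and scatterAdd are hypothetical, and the comments describe general behavior rather than what any specific compiler does. In the first loop each iteration writes a different element, so the iterations are independent as long as the input and output arrays do not overlap. In the second, whether two iterations write the same element depends on the contents of index, which are only known at run time.

```cpp
#include <cassert>
#include <cstddef>

// Each iteration writes a distinct out[i]. If the compiler can establish
// that in and out do not overlap, the iterations are independent.
void squareAll(const float* in, float* out, std::size_t n)
{
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * in[i];
    }
}

// Two iterations may update the same sums[...] element if index contains
// duplicates. That cannot be determined from the source alone, so running
// iterations concurrently without synchronization could be a data race.
void scatterAdd(const int* index, const float* value, float* sums,
                std::size_t n)
{
    assert(index != nullptr && value != nullptr && sums != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        sums[index[i]] += value[i];
    }
}
```

Real plug-in code usually adds function calls, shared state, and pointer indirection on top of patterns like the second loop, which makes the analysis harder still.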