For GCC versions 4.5 through 4.9, a new function instrumentation is available via the compiler's plug-in interface. It greatly improves measurement performance and also provides compile-time instrumentation filtering using the same filter file format as the run-time filtering. On some systems, the GCC plug-in development package must be installed to provide the necessary header files.
Features and improvements:
Support for pthread_exit and pthread_cancel was added.
Added support for task migration in the profiling system.
Added support for Intel Xeon Phi systems (native mode only)
Added new user instrumentation macros (e.g., SCOREP_USER_REGION_BY_NAME_BEGIN( name, type ) and SCOREP_USER_REGION_BY_NAME_END( name )). These macros annotate user regions by name, so the caller does not need to manage a region handle.
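As an illustration of the by-name macros, the following minimal sketch wraps a loop in a named user region. The two macro names are taken from the entry above; the header path scorep/SCOREP_User.h, the region type SCOREP_USER_REGION_TYPE_COMMON, and the SCOREP_USER_ENABLE guard are assumptions about the installed user API and should be checked against the local Score-P documentation. The function sum_below and the region name "sum_below_loop" are hypothetical. The fallback definitions only exist so that the sketch also compiles without Score-P.

```c
#include <stdio.h>

#ifdef SCOREP_USER_ENABLE
/* Assumed header location of the Score-P user API. */
#include <scorep/SCOREP_User.h>
#else
/* Fallback for builds without Score-P: the macros expand to nothing. */
#define SCOREP_USER_REGION_BY_NAME_BEGIN( name, type )
#define SCOREP_USER_REGION_BY_NAME_END( name )
#define SCOREP_USER_REGION_TYPE_COMMON 0
#endif

/* Hypothetical workload: returns the sum of the integers 0 .. n-1. */
long
sum_below( long n )
{
    long total = 0;

    /* The region is identified by its name only; no handle is declared. */
    SCOREP_USER_REGION_BY_NAME_BEGIN( "sum_below_loop", SCOREP_USER_REGION_TYPE_COMMON )
    for ( long i = 0; i < n; ++i )
    {
        total += i;
    }
    /* The same name closes the region again. */
    SCOREP_USER_REGION_BY_NAME_END( "sum_below_loop" )

    return total;
}
```

The begin and end calls are matched through the name string, so both must use exactly the same name. Without user instrumentation enabled, the function behaves like the plain loop.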
User tools and API improvements and changes:
Due to the added task migration support, the default for the invocation of OPARI2 in the instrumenter was changed. Until now, the instrumenter let OPARI2 make all tasks tied and print a warning if an untied task was encountered. The new default is that untied tasks are left untied and no warning is printed.
The task-related data storage mechanism was changed. The profiling backend no longer uses a hash table to associate a task id with a data structure, but gets a pointer from the task management in the measurement core. Thus, the environment variable SCOREP_PROFILING_TASK_TABLE_SIZE, which specified the size of the hash table, was removed.
Added the environment variable SCOREP_PROFILING_TASK_EXCHANGE_NUM to specify how often the profiling system returns reallocated memory objects that have migrated to another thread.
Support for cobi was removed.
SCOREP_User_RegionBegin / SCOREP_User_RegionInit accept NULL as parameter value for lastFileName and lastFileHandle. This simplifies the calls to these functions when used directly without the provided macros.
scorep-score got a new option: -m displays mangled region names. Furthermore, the filter evaluation in scorep-score can now also use mangled names.
In some cases, not all regions are exited at measurement finalization time. Fixed.
Using PGI compiler instrumentation in conjunction with tasks could lead to wrong region handles in region exits. Fixed.
Fixed building of the MPI wrapper if the compiler issues unrelated warnings at configure time.
The SCOREP_USER_METRIC_UINT64 macro used signed values. Fixed.
Added a conflict in the instrumenter between --thread=pthread and --mutex=pthread.
Fixed errors with libmpigf during linking of the instrumented application.
Fixed wrong acquisition order in pthread_cond_timedwait by modifying the nesting level (analogous to pthread_cond_wait).
Fixed the recording of internal CUDA driver calls.
Fixed a potential deadlock in the CUDA adapter for multithreaded CUDA.
Fortran OpenMP applications instrumented with OPARI2 and preprocessing report wrong file names ending in '.input.F' for POMP2 regions.