Telemetry Package


Quick-Start Guide #

The telemetry package provides a full telemetry suite for analyzing simulation setup
runs. The suite uses third-party logging tools and system information files (see ‘Logging
Tools’) to produce 6 logfiles. Those logfiles are then passed to the telemetry_plots
module to generate 6 plots (see ‘Python Module telemetry_plots’).

DISCLAIMER:
The telemetry data collected by this suite does not represent perfectly accurate profiling data!

The logging utilities used do not record data in perfectly equal timesteps. The variations
introduced are not reflected in the time axis of the telemetry plots. However, these variations
are smaller than the resolution of the plots, so no visible information is lost.

start_telemetry_run.sh Script #

The start_telemetry_run.sh script combines the execution of a simulation setup
with the recording of the logging data and saves the generated plots to the folder of
the logfiles.

Call this script directly from the command line:

$ ./start_telemetry_run.sh

More information can be found in the section ‘Bash Script start_telemetry_run.sh’.

telemetry_plots Module #

The telemetry_plots module plots existing telemetry data collected by the
start_telemetry_run.sh script or by equivalent calls to the logging tools used by that
script. The start_telemetry_run.sh script itself uses this module.

This module can be called directly from the command line:

$ ./telemetry_plots.py \
--files sys_mem_xxxxxxxxxxxxxxxxxxx.dat sys_util_xxxxxxxxxxxxxxxxxxx.dat \
gpu_xxxxxxxxxxxxxxxxxxx.dat gpu_mem_xxxxxxxxxxxxxxxxxxx.dat \
gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat gpu_long_xxxxxxxxxxxxxxxxxxx.dat \
ENDTIME_xxxxxxxxxxxxxxxxxxx.dat STARTTIME_xxxxxxxxxxxxxxxxxxx.dat \
--pname python3_mv_gpu

Additionally, this module provides the class Plots for integrating the plotting functionality
into custom code. It can be imported directly from the telemetry package:

from telemetry import Plots

More information can be found in the section ‘Python Module telemetry_plots’.

User Guide #

The telemetry package is located inside the root folder of the pyglasma3d_numba
project and consists of the following files:

pyglasma3d_numba/
└─ telemetry/
    ├─ __init__.py                [internal file]
    ├─ README.md                  [readme]
    ├─ start_telemetry_run.sh     [script for main usage]
    └─ telemetry_plots.py         [main module]

The main intended use case is to record telemetry data for simulation runs using
the start_telemetry_run.sh script. This script requires the HOME environment
variable to point to the path where the pyglasma3d_numba project root folder is
located. The generated logfiles are saved to a subfolder of HOME as well:
$HOME/pyglasma3d_numba_stats/.

Arguments passed to this script are forwarded to the simulation setup configured to run
(by default the mv.py setup). For easy recognition of the processes being monitored,
the simulation setup’s process name is set to “python3_mv_gpu” (see the section
‘User Guide’ for this script). The start_telemetry_run.sh script uses the
telemetry_plots module internally to generate the telemetry plots.

The most common errors during execution are explained in the section ‘User Guide’ for
errors regarding the start_telemetry_run.sh script and in the section ‘Errors’ for
errors regarding the telemetry_plots module.

The logged data contains:

  • CPU utilization for each core and average over all cores
  • System memory usage
  • GPU utilization
  • GPU memory utilization and usage

In addition to system telemetry data, 5 logfiles containing information about the
simulation run are saved. These include:

  • Messages to stdout (the simulation log)
  • Configuration parameters for the simulation setup
  • Starttime, endtime and duration measurements

After a successful run of the start_telemetry_run.sh script, the resulting folder tree
with all logfiles and plots looks like this:

$HOME/pyglasma3d_numba_stats/2020-02-05_20-29-31/
     ├── CMD_ARGS_2020-02-05_20-29-31.dat
     ├── DURATION_2020-02-05_20-29-31.dat
     ├── ENDTIME_2020-02-05_20-29-31.dat
     ├── gpu_2020-02-05_20-29-31.dat
     ├── gpu_long_2020-02-05_20-29-31.dat
     ├── gpu_mem_2020-02-05_20-29-31.dat
     ├── gpu_mem_long_2020-02-05_20-29-31.dat
     ├── PARSED_gpu_mem_long_2020-02-05_20-29-31.dat
     ├── SIMLOG_2020-02-05_20-29-31.dat
     ├── STARTTIME_2020-02-05_20-29-31.dat
     ├── sys_mem_2020-02-05_20-29-31.dat
     ├── sys_util_2020-02-05_20-29-31.dat
     ├── PARSED_sys_util_2020-02-05_20-29-31.dat
     └── plots/
         ├── PLOT_gpu_2020-02-05_20-29-31.png
         ├── PLOT_gpu_long_2020-02-05_20-29-31.png
         ├── PLOT_gpu_mem_2020-02-05_20-29-31.png
         ├── PLOT_gpu_mem_long_2020-02-05_20-29-31.png
         ├── PLOT_sys_mem_2020-02-05_20-29-31.png
         └── PLOT_sys_util_2020-02-05_20-29-31.png
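To check a finished run against the tree above, the expected file names can be derived from the run timestamp. The following is only a sketch: the helper names `expected_files` and `missing_files` are hypothetical and not part of the telemetry package, and the prefixes are copied from the listing above.

```python
from pathlib import Path

# Prefixes copied from the folder tree above.
LOG_PREFIXES = [
    "CMD_ARGS", "DURATION", "ENDTIME", "STARTTIME", "SIMLOG",
    "gpu", "gpu_long", "gpu_mem", "gpu_mem_long", "PARSED_gpu_mem_long",
    "sys_mem", "sys_util", "PARSED_sys_util",
]
PLOT_PREFIXES = ["gpu", "gpu_long", "gpu_mem", "gpu_mem_long", "sys_mem", "sys_util"]


def expected_files(stamp):
    """Return the relative paths a complete run folder should contain."""
    logs = [f"{prefix}_{stamp}.dat" for prefix in LOG_PREFIXES]
    plots = [f"plots/PLOT_{prefix}_{stamp}.png" for prefix in PLOT_PREFIXES]
    return logs + plots


def missing_files(run_dir, stamp):
    """Return the expected paths that do not exist below run_dir."""
    root = Path(run_dir)
    return [name for name in expected_files(stamp) if not (root / name).exists()]
```

An empty list from `missing_files` would indicate that all 13 logfiles and 6 plots are present.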

Please refer to the section ‘Logfiles’ for detailed information about the logfiles and to the
section ‘Sample Plots’ for detailed information about the plots.

Requirements #

The contents of this package are based on Linux kernel 4.19.12. They have also been tested
with the latest kernel version at the time of writing: 5.5.4.

This telemetry package requires:

  • Python version 3.8.1 or higher

Additionally, the following external Python modules are required:

Package       Version ≥
matplotlib    3.1.2
numpy         1.18.1

Additionally, the following system tools and drivers are required:

Package                                             Version
sysstat (providing mpstat)                          11.4.3
proprietary Nvidia driver (providing nvidia-smi)    440.40

Logfiles #

The following 13 files are the output of the start_telemetry_run.sh script. They are
either generated by the logging tools used, or are parsed versions of those logfiles generated
by the telemetry_plots module. The placeholder ‘x’ will be replaced with the timestamp
recorded at launch of the start_telemetry_run.sh script and is the output of the
command:

$ date +%Y-%m-%d_%H-%M-%S
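For scripts that post-process the logfiles, the same timestamp layout can be produced and parsed in Python. This is a minimal sketch; the names `RUN_STAMP_FORMAT`, `make_run_stamp` and `parse_run_stamp` are illustrative and do not exist in the telemetry package.

```python
from datetime import datetime

# Same directives as the date(1) call above.
RUN_STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def make_run_stamp(moment):
    """Format a datetime the way the run timestamp is written."""
    return moment.strftime(RUN_STAMP_FORMAT)


def parse_run_stamp(stamp):
    """Turn a run timestamp (e.g. taken from a file name) back into a datetime."""
    return datetime.strptime(stamp, RUN_STAMP_FORMAT)
```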

CMD_ARGS_xxxxxxxxxxxxxxxxxxx.dat #

--fastmath 0 --energy 420000.0

This file contains the command line arguments that were supplied to the
start_telemetry_run.sh script and passed on to the simulation setup.

DURATION_xxxxxxxxxxxxxxxxxxx.dat #

real 120.00
usr 89.00
sys 42.42

This file contains the duration of the simulation run in seconds, where real is the
elapsed real time, usr is the elapsed user CPU time and sys is the elapsed
system CPU time, as reported by the GNU time(1) command:

time -f "real %e\nusr %U\nsys %S" 
[...]
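Reading the three values back is straightforward. The sketch below assumes the file holds exactly the `label value` layout shown in the sample; `parse_duration` is a hypothetical helper name.

```python
def parse_duration(text):
    """Map each label (real, usr, sys) to its value in seconds."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            values[parts[0]] = float(parts[1])
    return values


sample = "real 120.00\nusr 89.00\nsys 42.42\n"
durations = parse_duration(sample)
```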

ENDTIME_xxxxxxxxxxxxxxxxxxx.dat #

2020-02-05_20-31-33.219706012

This file contains the ending timestamp of the simulation run and is the output of the date(1)
command:

date +%Y-%m-%d_%H-%M-%S.%N

gpu_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# GPU_id, gpuUtil, us, util_percent
0, gpuUtil , 1580930970350480, 7
0, gpuUtil , 1580930970518438, 7
0, gpuUtil , 1580930970686393, 7
0, gpuUtil , 1580930970854018, 7
0, gpuUtil , 1580930971021954, 7
0, gpuUtil , 1580930971189840, 7
0, gpuUtil , 1580930971357525, 22
0, gpuUtil , 1580930971525519, 7
0, gpuUtil , 1580930971693287, 8
0, gpuUtil , 1580930971860853, 9
...

This file contains GPU utilization data as produced by the corresponding nvidia-smi call.
The first column is the GPU ID. The second column is the metric label (gpuUtil). The third
column is a timestamp in microseconds counted from ‘epoch’. The last column is the
utilization in percent. The data is recorded in ~ 166 millisecond intervals.
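The rows above can be read with a few lines of Python. This sketch is not the parser used by telemetry_plots; the function name `parse_util_log` is hypothetical, and the timestamps are interpreted as UTC microseconds since the Unix epoch. Judging by its header, gpu_mem_xxxxxxxxxxxxxxxxxxx.dat uses the same four-column layout.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_util_log(lines):
    """Return (gpu_id, utc_timestamp, percent) tuples from the sample rows."""
    samples = []
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 4 or not fields[2].isdigit():
            continue  # skips comment lines, blank lines and the "..." marker
        gpu_id, _metric, micros, percent = fields
        # Integer microseconds avoid float rounding of the 16-digit timestamp.
        stamp = EPOCH + timedelta(microseconds=int(micros))
        samples.append((int(gpu_id), stamp, int(percent)))
    return samples


rows = [
    "# GPU_id, gpuUtil, us, util_percent",
    "0, gpuUtil , 1580930970350480, 7",
    "0, gpuUtil , 1580930970518438, 7",
]
samples = parse_util_log(rows)
# Spacing between the first two samples in milliseconds.
gap_ms = (samples[1][1] - samples[0][1]) / timedelta(milliseconds=1)
```

Computing the gaps between consecutive samples this way makes the uneven sampling mentioned in the disclaimer visible.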

gpu_long_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# YYYY/MM/DD HH:MM:SS.mmm, GPU_util, MEM_util, MEM_used, MEM_total
2020/02/05 20:29:31.195, 8, 0, 2011, 12064
2020/02/05 20:29:31.695, 8, 0, 2011, 12064
2020/02/05 20:29:32.195, 8, 0, 2011, 12064
2020/02/05 20:29:32.695, 8, 0, 2011, 12064
2020/02/05 20:29:33.195, 8, 0, 2011, 12064
2020/02/05 20:29:33.695, 8, 0, 2011, 12064
2020/02/05 20:29:34.196, 8, 0, 2011, 12064
2020/02/05 20:29:34.696, 8, 0, 2011, 12064
2020/02/05 20:29:35.196, 8, 0, 2011, 12064
2020/02/05 20:29:35.696, 8, 0, 2011, 12064
...

This file contains GPU and GPU memory utilization data as produced by the corresponding
nvidia-smi call. The first two columns compose the timestamp. The third column is
the GPU utilization in percent. The fourth column is the GPU memory utilization in percent.
The fifth column shows the used GPU memory in MB. The sixth column is the maximum
available GPU memory in MB. The data is recorded in ~ 500 millisecond intervals.

gpu_mem_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# GPU_id, memUtil, us, util_percent
0, memUtil , 1580930970350528, 0
0, memUtil , 1580930970518476, 0
0, memUtil , 1580930970686422, 0
0, memUtil , 1580930970854038, 20
0, memUtil , 1580930971021963, 25
0, memUtil , 1580930971189840, 20
0, memUtil , 1580930971357516, 10
0, memUtil , 1580930971525500, 0
0, memUtil , 1580930971693259, 0
0, memUtil , 1580930971860815, 0
...

This file contains GPU memory utilization data as produced by the corresponding
nvidia-smi call. The first column is the GPU ID. The third column is a timestamp
in microseconds counted from ‘epoch’. The last column is the utilization in percent.
The data is recorded in ~ 166 millisecond intervals.

gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# YYYY/MM/DD HH:MM:SS.mmm, P_name, MEM_used
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
...

This file contains GPU memory usage data as produced by the corresponding
nvidia-smi call. The first two columns compose the timestamp. The third
column is the name of the compute process, and the fourth column is that process's
memory usage in MB. This file is filtered for the compute process of the running
simulation setup: python3_mv_gpu. The resulting file is
PARSED_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat. If more than one process carries
the name used by the current simulation run, both of these logfiles and the plot
based on them become unusable.
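The filtering step can be pictured as follows. This is a sketch under the assumption that rows are matched on an exact process name; how telemetry_plots performs the filtering internally is not shown here, and `filter_process` is a hypothetical name.

```python
def filter_process(lines, pname="python3_mv_gpu"):
    """Keep only the rows whose process name equals pname exactly."""
    kept = []
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        # Expected layout: timestamp, process name, used memory.
        if len(fields) == 3 and fields[1] == pname:
            kept.append(line.rstrip("\n"))
    return kept


rows = [
    "2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921",
    "2020/02/05 20:29:31.197, python3_mv_gpu, 539",
    "2020/02/05 20:29:31.197, python, 539",
]
parsed = filter_process(rows)
```

With an exact match, a second process that also carries the monitored name would still pass the filter, which is why such a situation leaves the data unusable.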

PARSED_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# YYYY/MM/DD HH:MM:SS.mmm, P_name, MEM_used
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.698, python3_mv_gpu, 539
2020/02/05 20:29:31.698, python3_mv_gpu, 539
2020/02/05 20:29:32.198, python3_mv_gpu, 539
2020/02/05 20:29:32.198, python3_mv_gpu, 539
2020/02/05 20:29:32.698, python3_mv_gpu, 539
2020/02/05 20:29:32.698, python3_mv_gpu, 539
2020/02/05 20:29:33.198, python3_mv_gpu, 539
2020/02/05 20:29:33.198, python3_mv_gpu, 539
...

This file contains GPU memory usage data for the simulation process and is the result
of filtering the gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat file.

SIMLOG_xxxxxxxxxxxxxxxxxxx.dat #

[...]
Using CUDA
FASTMATH is set to: True

##########################################################

[t=0.0]: Initializing left nucleus.
Initialized left nucleus in: 9.182
[t=0.0]: Initializing right nucleus.
Initialized right nucleus in: 5.45

CUDA free memory: 4.47 GB of 7.93 GB.
Current memory usage: 1.92 GB.

1.0 Complete cycle in: 1.172
2.0 Complete cycle in: 0.005
...

This file contains a dump of the simulation log. That log consists of any output the simulation
setup writes to stdout.

STARTTIME_xxxxxxxxxxxxxxxxxxx.dat #

2020-02-05_20-29-33.194286049

This file contains the starting timestamp of the simulation run and is the output of the date(1)
command:

date +%Y-%m-%d_%H-%M-%S.%N
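The `%N` directive yields nine fractional digits, while Python's `%f` accepts at most six. One way to read the STARTTIME and ENDTIME files is therefore to truncate to microseconds, as in this sketch (`parse_ns_timestamp` is a hypothetical helper, not part of the package):

```python
from datetime import datetime


def parse_ns_timestamp(stamp):
    """Parse 'YYYY-mm-dd_HH-MM-SS.NNNNNNNNN', truncating to microseconds."""
    base, _, fraction = stamp.strip().partition(".")
    micros = (fraction + "000000")[:6]  # pad or cut to exactly six digits
    return datetime.strptime(f"{base}.{micros}", "%Y-%m-%d_%H-%M-%S.%f")


start = parse_ns_timestamp("2020-02-05_20-29-33.194286049")
end = parse_ns_timestamp("2020-02-05_20-31-33.219706012")
elapsed = (end - start).total_seconds()  # wall-clock seconds between the two stamps
```

For the two sample values in this section the difference is about 120.03 seconds, which agrees with the `real` entry of the DURATION sample.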

sys_mem_xxxxxxxxxxxxxxxxxxx.dat #

2020/02/05 20:29:32	MemTotal:       32881424 kB	MemFree:         6064676 kB	Buffers:         3803468 kB	Cached:         15242080 kB	Shmem:             94556 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:33	MemTotal:       32881424 kB	MemFree:         6064676 kB	Buffers:         3803468 kB	Cached:         15242092 kB	Shmem:             94560 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:34	MemTotal:       32881424 kB	MemFree:         6064416 kB	Buffers:         3803468 kB	Cached:         15242100 kB	Shmem:             94564 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:35	MemTotal:       32881424 kB	MemFree:         6064164 kB	Buffers:         3803468 kB	Cached:         15242104 kB	Shmem:             94564 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:36	MemTotal:       32881424 kB	MemFree:         6064164 kB	Buffers:         3803472 kB	Cached:         15242104 kB	Shmem:             94564 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:37	MemTotal:       32881424 kB	MemFree:         6064148 kB	Buffers:         3803472 kB	Cached:         15242108 kB	Shmem:             94564 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:38	MemTotal:       32881424 kB	MemFree:         6064164 kB	Buffers:         3803472 kB	Cached:         15242112 kB	Shmem:             94564 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:39	MemTotal:       32881424 kB	MemFree:         6064164 kB	Buffers:         3803472 kB	Cached:         15242120 kB	Shmem:             94568 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:40	MemTotal:       32881424 kB	MemFree:         6064164 kB	Buffers:         3803472 kB	Cached:         15242128 kB	Shmem:             94572 kB	SReclaimable:    3958080 kB	
2020/02/05 20:29:41	MemTotal:       32881424 kB	MemFree:         6063912 kB	Buffers:         3803476 kB	Cached:         15242132 kB	Shmem:             94572 kB	SReclaimable:    3958080 kB	
...

This file contains system memory usage data as produced by polling /proc/meminfo in
~ 1 second intervals. The first two columns compose the timestamp. All following columns
are values that are used to calculate the amount of used system memory. The calculation
is the same as done by the htop tool:
$\text{MemUsed} = \text{MemTotal} - \text{MemFree} - \text{Buffers} - \text{Cached} - \text{SReclaimable} + \text{Shmem}$
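Applied to one logged line, the calculation looks like this. The sketch assumes every value is followed by the unit `kB`, as in the sample; `mem_used_kb` is a hypothetical helper name.

```python
import re

# Matches "Name:   12345 kB" pairs; the HH:MM:SS timestamp does not match
# because its colons are not followed by whitespace.
FIELD = re.compile(r"(\w+):\s+(\d+)\s+kB")


def mem_used_kb(line):
    """Compute the used system memory in kB from one sys_mem log line."""
    v = {name: int(value) for name, value in FIELD.findall(line)}
    return (v["MemTotal"] - v["MemFree"] - v["Buffers"]
            - v["Cached"] - v["SReclaimable"] + v["Shmem"])


line = ("2020/02/05 20:29:32\tMemTotal:       32881424 kB\t"
        "MemFree:         6064676 kB\tBuffers:         3803468 kB\t"
        "Cached:         15242080 kB\tShmem:             94556 kB\t"
        "SReclaimable:    3958080 kB\t")
used = mem_used_kb(line)  # 3907676 kB for the first sample line
```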

sys_util_xxxxxxxxxxxxxxxxxxx.dat #

Linux 4.19.0-0.bpo.1-amd64 (server) 	2020-02-05 	_x86_64_	(12 CPU)

20:29:31     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
20:29:32     all   95.42    0.00    4.58    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       0   95.96    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       1   91.92    0.00    8.08    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       2   94.00    0.00    6.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       3   95.00    0.00    5.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       4   97.98    0.00    2.02    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       5   98.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       6   95.96    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       7   97.00    0.00    3.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       8   96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       9   98.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32      10   95.00    0.00    5.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32      11   93.07    0.00    6.93    0.00    0.00    0.00    0.00    0.00    0.00    0.00

20:29:32     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
20:29:33     all   95.75    0.00    4.25    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       0   96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       1   96.04    0.00    3.96    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       2   98.00    0.00    2.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       3   96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       4   96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:33       5   96.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
...

This file contains CPU utilization data as produced by the call to mpstat(1). The
first line is a summary of the basic system properties. It is followed by blocks of
utilization data in ~ 1 second intervals, separated by blank lines. Each block consists
of the following lines:

  • A header line
  • A summary line
  • One line for each CPU core

Not all columns are used for the plot. The relevant columns are the timestamp
(first column), the CPU ID (second column), the user CPU utilization in percent (third column)
and the idle percentage (last column). For each recorded data block, those values will be
extracted and written to the file PARSED_sys_util_xxxxxxxxxxxxxxxxxxx.dat.
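The extraction described above can be sketched as follows. This is not the code used by telemetry_plots; it assumes 24-hour timestamps as in the sample (a locale that adds an AM/PM column would shift the fields), and `parse_mpstat` is a hypothetical name.

```python
def parse_mpstat(text):
    """Return one list per data block with (time, cpu, %usr, %idle) tuples."""
    blocks, current = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # a blank line closes the current block
            if current:
                blocks.append(current)
                current = []
            continue
        # Data rows carry "all" or a core number in the second column;
        # the system summary, header rows and "..." are skipped.
        if len(fields) >= 4 and (fields[1] == "all" or fields[1].isdigit()):
            current.append((fields[0], fields[1], float(fields[2]), float(fields[-1])))
    if current:
        blocks.append(current)
    return blocks


sample = """Linux 4.19.0-0.bpo.1-amd64 (server) \t2020-02-05 \t_x86_64_\t(12 CPU)

20:29:31     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
20:29:32     all   95.42    0.00    4.58    0.00    0.00    0.00    0.00    0.00    0.00    0.00
20:29:32       0   95.96    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00    0.00

20:29:32     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
20:29:33     all   95.75    0.00    4.25    0.00    0.00    0.00    0.00    0.00    0.00    0.00
"""
blocks = parse_mpstat(sample)
```

Joining the tuples of one block with tabs gives a row in the layout of the PARSED file.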

PARSED_sys_util_xxxxxxxxxxxxxxxxxxx.dat #

# output format:
# HH:MM:SS CPU_all %usr %idle HH:MM:SS CPU_0 %usr %idle HH:MM:SS CPU_1 %usr %idle ...
20:29:32	all	95.42	0.00	20:29:32	0	95.96	0.00	20:29:32	1	91.92	0.00	20:29:32	2	94.00	0.00	20:29:32	3	95.00	0.00	20:29:32	4	97.98	0.00	20:29:32	5	98.00	0.00	20:29:32	6	95.96	0.00	20:29:32	7	97.00	0.00	20:29:32	8	96.00	0.00	20:29:32	9	98.00	0.00	20:29:32	10	95.00	0.00	20:29:32	11	93.07	0.00	
20:29:33	all	95.75	0.00	20:29:33	0	96.00	0.00	20:29:33	1	96.04	0.00	20:29:33	2	98.00	0.00	20:29:33	3	96.00	0.00	20:29:33	4	96.00	0.00	20:29:33	5	96.00	0.00	20:29:33	6	97.03	0.00	20:29:33	7	93.94	0.00	20:29:33	8	95.00	0.00	20:29:33	9	96.00	0.00	20:29:33	10	93.00	0.00	20:29:33	11	95.00	0.00	
20:29:34	all	95.42	0.00	20:29:34	0	96.00	0.00	20:29:34	1	93.94	0.00	20:29:34	2	97.00	0.00	20:29:34	3	94.00	0.00	20:29:34	4	98.00	0.00	20:29:34	5	98.00	0.00	20:29:34	6	93.00	0.00	20:29:34	7	94.00	0.00	20:29:34	8	96.00	0.00	20:29:34	9	97.00	0.00	20:29:34	10	93.00	0.00	20:29:34	11	96.00	0.00	
20:29:35	all	96.33	0.00	20:29:35	0	96.00	0.00	20:29:35	1	94.06	0.00	20:29:35	2	96.00	0.00	20:29:35	3	98.00	0.00	20:29:35	4	98.02	0.00	20:29:35	5	99.00	0.00	20:29:35	6	93.00	0.00	20:29:35	7	96.04	0.00	20:29:35	8	96.00	0.00	20:29:35	9	97.00	0.00	20:29:35	10	94.00	0.00	20:29:35	11	95.96	0.00	
20:29:36	all	94.84	0.00	20:29:36	0	98.00	0.00	20:29:36	1	93.00	0.00	20:29:36	2	95.00	0.00	20:29:36	3	96.00	0.00	20:29:36	4	96.97	0.00	20:29:36	5	97.00	0.00	20:29:36	6	94.00	0.00	20:29:36	7	93.00	0.00	20:29:36	8	95.05	0.00	20:29:36	9	92.00	0.00	20:29:36	10	94.00	0.00	20:29:36	11	96.00	0.00	
20:29:37	all	94.75	0.00	20:29:37	0	97.00	0.00	20:29:37	1	91.92	0.00	20:29:37	2	93.00	0.00	20:29:37	3	96.00	0.00	20:29:37	4	94.00	0.00	20:29:37	5	97.00	0.00	20:29:37	6	94.95	0.00	20:29:37	7	95.00	0.00	20:29:37	8	95.96	0.00	20:29:37	9	96.00	0.00	20:29:37	10	93.00	0.00	20:29:37	11	94.06	0.00	
20:29:38	all	95.83	0.00	20:29:38	0	97.03	0.00	20:29:38	1	95.05	0.00	20:29:38	2	93.00	0.00	20:29:38	3	96.00	0.00	20:29:38	4	98.00	0.00	20:29:38	5	97.00	0.00	20:29:38	6	95.05	0.00	20:29:38	7	95.00	0.00	20:29:38	8	96.00	0.00	20:29:38	9	95.00	0.00	20:29:38	10	95.00	0.00	20:29:38	11	95.96	0.00	
20:29:39	all	95.83	0.00	20:29:39	0	96.97	0.00	20:29:39	1	95.96	0.00	20:29:39	2	98.00	0.00	20:29:39	3	96.00	0.00	20:29:39	4	97.00	0.00	20:29:39	5	98.00	0.00	20:29:39	6	97.00	0.00	20:29:39	7	94.95	0.00	20:29:39	8	96.00	0.00	20:29:39	9	96.00	0.00	20:29:39	10	92.00	0.00	20:29:39	11	93.07	0.00	
20:29:40	all	95.33	0.00	20:29:40	0	97.00	0.00	20:29:40	1	96.00	0.00	20:29:40	2	95.00	0.00	20:29:40	3	93.00	0.00	20:29:40	4	99.00	0.00	20:29:40	5	96.00	0.00	20:29:40	6	96.00	0.00	20:29:40	7	94.06	0.00	20:29:40	8	94.06	0.00	20:29:40	9	97.00	0.00	20:29:40	10	94.00	0.00	20:29:40	11	91.00	0.00	
20:29:41	all	96.09	0.00	20:29:41	0	97.00	0.00	20:29:41	1	94.00	0.00	20:29:41	2	95.00	0.00	20:29:41	3	96.00	0.00	20:29:41	4	97.00	0.00	20:29:41	5	98.00	0.00	20:29:41	6	94.95	0.00	20:29:41	7	96.00	0.00	20:29:41	8	98.99	0.00	20:29:41	9	96.00	0.00	20:29:41	10	97.00	0.00	20:29:41	11	96.00	0.00	
...

This file contains CPU utilization data extracted from the sys_util_xxxxxxxxxxxxxxxxxxx.dat
logfile. The first four columns are average data for all CPU cores. The first column is the timestamp,
the second is the ID, the third is user CPU utilization in percent and the fourth is idle percentage.
For the following columns this scheme is repeated for each CPU core. The values are recorded
in ~ 1 second intervals.
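Because each row repeats the same four-column group, it can be split back into per-core entries by stepping through the fields in groups of four. A minimal sketch, with `split_parsed_row` as a hypothetical helper name:

```python
def split_parsed_row(line):
    """Map each CPU ID ('all', '0', '1', ...) to (time, %usr, %idle)."""
    fields = line.split()
    return {
        fields[i + 1]: (fields[i], float(fields[i + 2]), float(fields[i + 3]))
        for i in range(0, len(fields) - 3, 4)  # one group of four per CPU
    }


row = "20:29:32\tall\t95.42\t0.00\t20:29:32\t0\t95.96\t0.00\t20:29:32\t1\t91.92\t0.00\t"
per_cpu = split_parsed_row(row)
```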

Sample Plots #

DISCLAIMER:
The telemetry data collected by this suite does not represent perfectly accurate profiling data!

The following 6 plots will be generated by the telemetry_plots module based on the logfiles
recorded by the logging tools. The placeholder ‘x’ will be replaced with the timestamp recorded
at launch of the start_telemetry_run.sh script and is the output of the command:

$ date +%Y-%m-%d_%H-%M-%S

PLOT_gpu_xxxxxxxxxxxxxxxxxxx.png #

This plot shows the GPU utilization in percent (%) and is based on the file
gpu_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 166 ms intervals.
The black dashed lines mark the start and end of the simulation run.

PLOT_gpu_long_xxxxxxxxxxxxxxxxxxx.png #

This plot shows the GPU and GPU memory utilization in percent (%), the total GPU
memory used in GB and the total available GPU memory in GB. It is based on the
file gpu_long_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 500 ms
intervals. The black dashed lines mark the start and end of the simulation run.

PLOT_gpu_mem_xxxxxxxxxxxxxxxxxxx.png #

This plot shows the GPU memory utilization in percent (%) and is based on the file
gpu_mem_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 166 ms intervals.
The black dashed lines mark the start and end of the simulation run.

PLOT_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.png #

This plot shows the GPU memory used in GB only for the process being monitored.
The section ‘User Guide’ explains how that process is identified. The data is
based on the file gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat and is recorded
in ~ 500 ms intervals. The black dashed lines mark the start and end of the simulation
run.

PLOT_sys_mem_xxxxxxxxxxxxxxxxxxx.png #

This plot shows the total system memory used in GB and the total available system
memory in GB and is based on the file sys_mem_xxxxxxxxxxxxxxxxxxx.dat.
The data is recorded in ~ 1 s intervals. The black dashed lines mark the start and
end of the simulation run.

PLOT_sys_util_xxxxxxxxxxxxxxxxxxx.png #

The first subplot shows the CPU utilization in percent (%) averaged over all cores for
measurements in system space and user space. The second subplot shows the CPU
utilization in percent (%) for each core measured in system space. The third subplot
shows the CPU utilization in percent (%) for each core measured in user space. The
data is based on the file sys_util_xxxxxxxxxxxxxxxxxxx.dat and is recorded
in ~ 1 s intervals. The black dashed lines mark the start and end of the simulation run.