Telemetry Package #
Contents #
- Quick-Start Guide
- User Guide
- Logfiles
  - CMD_ARGS_xxxxxxxxxxxxxxxxxxx.dat
  - DURATION_xxxxxxxxxxxxxxxxxxx.dat
  - ENDTIME_xxxxxxxxxxxxxxxxxxx.dat
  - gpu_xxxxxxxxxxxxxxxxxxx.dat
  - gpu_long_xxxxxxxxxxxxxxxxxxx.dat
  - gpu_mem_xxxxxxxxxxxxxxxxxxx.dat
  - gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat
  - PARSED_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat
  - SIMLOG_xxxxxxxxxxxxxxxxxxx.dat
  - STARTTIME_xxxxxxxxxxxxxxxxxxx.dat
  - sys_mem_xxxxxxxxxxxxxxxxxxx.dat
  - sys_util_xxxxxxxxxxxxxxxxxxx.dat
  - PARSED_sys_util_xxxxxxxxxxxxxxxxxxx.dat
- Sample Plots
Additional Documentation #
Quick-Start Guide #
The telemetry package provides a full telemetry suite for analyzing simulation setup runs. The suite uses third-party logging tools and system information files (see ‘Logging Tools’) to produce 6 logfiles. Those logfiles are then passed on to the telemetry_plots module to generate 6 plots (see ‘Python Module telemetry_plots’).
DISCLAIMER:
The telemetry data collected by this suite does not represent perfectly accurate profiling data!
The logging utilities used do not record data in perfectly equal time steps. The resulting variations are not reflected in the time axis of the telemetry plots. However, these variations are smaller than the resolution of the plots, so no information is lost in the plots.
start_telemetry_run.sh Script #
The start_telemetry_run.sh script combines the execution of a simulation setup with the recording of the logging data and saves the generated plots to the folder of the logfiles.
Call this script directly from the command line:
$ ./start_telemetry_run.sh
More information can be found in the section ‘Bash Script start_telemetry_run.sh’.
telemetry_plots Module #
The telemetry_plots module is a tool to plot existing telemetry data collected by the start_telemetry_run.sh script or by equivalent calls to the logging tools used by that script. The start_telemetry_run.sh script uses this module.
This module can be called directly from the command line:
$ ./telemetry_plots.py \
--files sys_mem_xxxxxxxxxxxxxxxxxxx.dat sys_util_xxxxxxxxxxxxxxxxxxx.dat \
gpu_xxxxxxxxxxxxxxxxxxx.dat gpu_mem_xxxxxxxxxxxxxxxxxxx.dat \
gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat gpu_long_xxxxxxxxxxxxxxxxxxx.dat \
ENDTIME_xxxxxxxxxxxxxxxxxxx.dat STARTTIME_xxxxxxxxxxxxxxxxxxx.dat \
--pname python3_mv_gpu
Additionally, this module provides the class Plots for integrating the plotting functionality into custom code. It can be imported directly from the telemetry package:
from telemetry import Plots
More information can be found in the section ‘Python Module telemetry_plots’.
User Guide #
The telemetry package is located inside the root folder of the pyglasma3d_numba project and consists of the following files:
pyglasma3d_numba/
└─ telemetry/
   ├─ __init__.py               [internal file]
   ├─ README.md                 [readme]
   ├─ start_telemetry_run.sh    [script for main usage]
   └─ telemetry_plots.py        [main module]
The main intended use case is to record telemetry data for simulation runs using the start_telemetry_run.sh script. This script requires the HOME environment variable to point to the path where the pyglasma3d_numba project root folder is located. The generated logfiles are saved to a subfolder of HOME as well: $HOME/pyglasma3d_numba_stats/.
It is also possible to pass arguments to the simulation setup configured to run (by default the mv.py setup) by passing them directly to this script. For easy recognition of the processes being monitored, the simulation setup’s process name is set to “python3_mv_gpu” (see the section ‘User Guide’ for this script). The start_telemetry_run.sh script uses the telemetry_plots module internally to generate the telemetry plots.
The most common errors during execution are explained in the section ‘User Guide’ for errors regarding the start_telemetry_run.sh script and in the section ‘Errors’ for errors regarding the telemetry_plots module.
The logged data contains:
- CPU utilization for each core and average over all cores
- System memory usage
- GPU utilization
- GPU memory utilization and usage
In addition to system telemetry data, 5 logfiles containing information about the simulation run are saved. These include:
- Messages to stdout (the simulation log)
- Configuration parameters for the simulation setup
- Start time, end time and duration measurements
After a successful run of the start_telemetry_run.sh script, the resulting folder tree with all logfiles and plots looks like this:
$HOME/pyglasma3d_numba_stats/2020-02-05_20-29-31/
├── CMD_ARGS_2020-02-05_20-29-31.dat
├── DURATION_2020-02-05_20-29-31.dat
├── ENDTIME_2020-02-05_20-29-31.dat
├── gpu_2020-02-05_20-29-31.dat
├── gpu_long_2020-02-05_20-29-31.dat
├── gpu_mem_2020-02-05_20-29-31.dat
├── gpu_mem_long_2020-02-05_20-29-31.dat
├── PARSED_gpu_mem_long_2020-02-05_20-29-31.dat
├── SIMLOG_2020-02-05_20-29-31.dat
├── STARTTIME_2020-02-05_20-29-31.dat
├── sys_mem_2020-02-05_20-29-31.dat
├── sys_util_2020-02-05_20-29-31.dat
├── PARSED_sys_util_2020-02-05_20-29-31.dat
└── plots/
    ├── PLOT_gpu_2020-02-05_20-29-31.png
    ├── PLOT_gpu_long_2020-02-05_20-29-31.png
    ├── PLOT_gpu_mem_2020-02-05_20-29-31.png
    ├── PLOT_gpu_mem_long_2020-02-05_20-29-31.png
    ├── PLOT_sys_mem_2020-02-05_20-29-31.png
    └── PLOT_sys_util_2020-02-05_20-29-31.png
Please refer to the section ‘Logfiles’ for detailed information about the logfiles and to the section ‘Sample Plots’ for detailed information about the plots.
Requirements #
The contents of this package are based on Linux kernel 4.19.12. They have also been tested with the latest kernel version at the time of writing, 5.5.4.
This telemetry package requires:
- Python version 3.8.1 or higher
Additionally the following external Python modules are required:
Package | Version ≥ |
---|---|
matplotlib | 3.1.2 |
numpy | 1.18.1 |
Additionally the following system tools and drivers are required:
Package | Version |
---|---|
sysstat (providing mpstat) | 11.4.3 |
proprietary Nvidia driver (providing nvidia-smi) | 440.40 |
Logfiles #
The following 13 files are the output of the start_telemetry_run.sh script. They are either generated by the logging tools used, or are parsed versions of those logfiles generated by the telemetry_plots module. The placeholder ‘x’ is replaced with the timestamp recorded at launch of the start_telemetry_run.sh script, which is the output of the command:
$ date +%Y-%m-%d_%H-%M-%S
CMD_ARGS_xxxxxxxxxxxxxxxxxxx.dat #
--fastmath 0 --energy 420000.0
This file contains any command line arguments that were supplied to the start_telemetry_run.sh script and passed on to the simulation setup.
DURATION_xxxxxxxxxxxxxxxxxxx.dat #
real 120.00
usr 89.00
sys 42.42
This file contains the duration of the simulation run in seconds, where real is the elapsed real time, usr is the elapsed user CPU time and sys is the elapsed system CPU time as reported by the GNU time(1) command:
time -f "real %e\nusr %U\nsys %S"
[...]
ENDTIME_xxxxxxxxxxxxxxxxxxx.dat #
2020-02-05_20-31-33.219706012
This file contains the ending timestamp of the simulation run and is the output of the date(1) command:
date +%Y-%m-%d_%H-%M-%S.%N
gpu_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# GPU_id, gpuUtil, us, util_percent
0, gpuUtil , 1580930970350480, 7
0, gpuUtil , 1580930970518438, 7
0, gpuUtil , 1580930970686393, 7
0, gpuUtil , 1580930970854018, 7
0, gpuUtil , 1580930971021954, 7
0, gpuUtil , 1580930971189840, 7
0, gpuUtil , 1580930971357525, 22
0, gpuUtil , 1580930971525519, 7
0, gpuUtil , 1580930971693287, 8
0, gpuUtil , 1580930971860853, 9
...
This file contains GPU utilization data as produced by the corresponding nvidia-smi call. The first column is the GPU ID. The second column is a fixed label naming the metric. The third column is a timestamp in microseconds counted from the Unix epoch. The last column is the utilization in percent. The data is recorded in ~ 166 millisecond intervals.
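As a rough illustration of the column layout described above, the following Python sketch parses single records of this file. The helper name parse_util_line is hypothetical and not part of the telemetry package. The sketch assumes that every data line has exactly four comma-separated fields and that the timestamp counts microseconds since the Unix epoch in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_util_line(line):
    """Split one 'GPU_id, label, us, util_percent' record (hypothetical helper)."""
    gpu_id, label, micros, percent = [field.strip() for field in line.split(",")]
    # Integer arithmetic keeps the full microsecond precision of the timestamp.
    timestamp = EPOCH + timedelta(microseconds=int(micros))
    return int(gpu_id), label, timestamp, int(percent)


# Two consecutive records copied from the sample data of this logfile.
first = parse_util_line("0, gpuUtil , 1580930970350480, 7")
second = parse_util_line("0, gpuUtil , 1580930970518438, 7")

# Spacing between the two samples in milliseconds.
interval_ms = (second[2] - first[2]) / timedelta(milliseconds=1)
print(first[2].isoformat(), interval_ms)
```

The same function should also work for the gpu_mem_xxxxxxxxxxxxxxxxxxx.dat file, since it shares the column layout and only differs in the label.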
gpu_long_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# YYYY/MM/DD HH:MM:SS.mmm, GPU_util, MEM_util, MEM_used, MEM_total
2020/02/05 20:29:31.195, 8, 0, 2011, 12064
2020/02/05 20:29:31.695, 8, 0, 2011, 12064
2020/02/05 20:29:32.195, 8, 0, 2011, 12064
2020/02/05 20:29:32.695, 8, 0, 2011, 12064
2020/02/05 20:29:33.195, 8, 0, 2011, 12064
2020/02/05 20:29:33.695, 8, 0, 2011, 12064
2020/02/05 20:29:34.196, 8, 0, 2011, 12064
2020/02/05 20:29:34.696, 8, 0, 2011, 12064
2020/02/05 20:29:35.196, 8, 0, 2011, 12064
2020/02/05 20:29:35.696, 8, 0, 2011, 12064
...
This file contains GPU and GPU memory utilization data as produced by the corresponding nvidia-smi call. The first two columns compose the timestamp. The third column is the GPU utilization in percent. The fourth column is the GPU memory utilization in percent. The fifth column shows the used GPU memory in MB. The sixth column is the maximum available GPU memory in MB. The data is recorded in ~ 500 millisecond intervals.
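To make the column description more concrete, here is a minimal Python sketch that turns one record of this file into a dictionary. The function name parse_gpu_long_line and the dictionary keys are illustrative assumptions, not names used by the telemetry_plots module. Note that after splitting on commas, the date and time form a single field.

```python
from datetime import datetime


def parse_gpu_long_line(line):
    """Parse 'timestamp, GPU_util, MEM_util, MEM_used, MEM_total' (hypothetical helper)."""
    stamp, gpu_util, mem_util, mem_used, mem_total = [
        field.strip() for field in line.split(",")
    ]
    return {
        # %f accepts the three millisecond digits and pads them to microseconds.
        "time": datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f"),
        "gpu_util_percent": int(gpu_util),
        "mem_util_percent": int(mem_util),
        "mem_used_mb": int(mem_used),
        "mem_total_mb": int(mem_total),
    }


# One record copied from the sample data of this logfile.
record = parse_gpu_long_line("2020/02/05 20:29:31.195, 8, 0, 2011, 12064")
used_fraction = record["mem_used_mb"] / record["mem_total_mb"]
print(record["time"], f"{used_fraction:.1%} of GPU memory in use")
```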
gpu_mem_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# GPU_id, memUtil, us, util_percent
0, memUtil , 1580930970350528, 0
0, memUtil , 1580930970518476, 0
0, memUtil , 1580930970686422, 0
0, memUtil , 1580930970854038, 20
0, memUtil , 1580930971021963, 25
0, memUtil , 1580930971189840, 20
0, memUtil , 1580930971357516, 10
0, memUtil , 1580930971525500, 0
0, memUtil , 1580930971693259, 0
0, memUtil , 1580930971860815, 0
...
This file contains GPU memory utilization data as produced by the corresponding nvidia-smi call. The first column is the GPU ID. The third column is a timestamp in microseconds counted from the Unix epoch. The last column is the utilization in percent. The data is recorded in ~ 166 millisecond intervals.
gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# YYYY/MM/DD HH:MM:SS.mmm, P_name, MEM_used
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python, 539
...
This file contains GPU memory usage data as produced by the corresponding nvidia-smi call. The first two columns compose the timestamp. The third column is the name of the compute process, whose memory usage in MB is recorded in the fourth column. This file is filtered for the compute process of the running simulation setup, python3_mv_gpu. The resulting file is PARSED_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat. If more than one process has the same name as the one used by the current simulation run, both of these logfiles will be rendered unusable, and the plot that is based on these files will be unusable as well.
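The filtering step can be pictured with the following Python sketch. It is not the code used by the telemetry_plots module; filter_process is a hypothetical helper that keeps only the records whose process name matches exactly, assuming three comma-separated fields per record (timestamp, process name, used memory).

```python
def filter_process(lines, pname="python3_mv_gpu"):
    """Keep the records whose process-name field equals pname (hypothetical helper)."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip the format comment lines
        fields = [field.strip() for field in line.split(",")]
        # An exact comparison avoids matching 'python' or the interpreter path.
        if len(fields) == 3 and fields[1] == pname:
            kept.append(line)
    return kept


# Records copied from the sample data of this logfile.
sample = [
    "# YYYY/MM/DD HH:MM:SS.mmm, P_name, MEM_used",
    "2020/02/05 20:29:31.197, /home/user/.conda/envs/project/bin/python, 921",
    "2020/02/05 20:29:31.197, python3_mv_gpu, 539",
    "2020/02/05 20:29:31.197, python, 539",
]
parsed = filter_process(sample)
print(parsed)
```

A filter of this kind cannot tell apart two processes that share the same name, which is why a second process called python3_mv_gpu makes the parsed file ambiguous.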
PARSED_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# YYYY/MM/DD HH:MM:SS.mmm, P_name, MEM_used
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.197, python3_mv_gpu, 539
2020/02/05 20:29:31.698, python3_mv_gpu, 539
2020/02/05 20:29:31.698, python3_mv_gpu, 539
2020/02/05 20:29:32.198, python3_mv_gpu, 539
2020/02/05 20:29:32.198, python3_mv_gpu, 539
2020/02/05 20:29:32.698, python3_mv_gpu, 539
2020/02/05 20:29:32.698, python3_mv_gpu, 539
2020/02/05 20:29:33.198, python3_mv_gpu, 539
2020/02/05 20:29:33.198, python3_mv_gpu, 539
...
This file contains GPU memory usage data for the simulation process and is the result of filtering the gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat file.
SIMLOG_xxxxxxxxxxxxxxxxxxx.dat #
[...]
Using CUDA
FASTMATH is set to: True
##########################################################
[t=0.0]: Initializing left nucleus.
Initialized left nucleus in: 9.182
[t=0.0]: Initializing right nucleus.
Initialized right nucleus in: 5.45
CUDA free memory: 4.47 GB of 7.93 GB.
Current memory usage: 1.92 GB.
1.0 Complete cycle in: 1.172
2.0 Complete cycle in: 0.005
...
This file contains a dump of the simulation log. That log consists of any output the simulation setup writes to stdout.
STARTTIME_xxxxxxxxxxxxxxxxxxx.dat #
2020-02-05_20-29-33.194286049
This file contains the starting timestamp of the simulation run and is the output of the date(1) command:
date +%Y-%m-%d_%H-%M-%S.%N
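The %N field yields nine fractional digits (nanoseconds), whereas Python's datetime.strptime accepts at most six digits for %f. The following sketch, which is not taken from the telemetry package, therefore truncates the fraction to microseconds before parsing. The helper name parse_run_timestamp is hypothetical; the two input values are the sample contents of the STARTTIME and ENDTIME files shown in this section.

```python
from datetime import datetime


def parse_run_timestamp(stamp):
    """Parse 'YYYY-MM-DD_HH-MM-SS.NNNNNNNNN' into a datetime (hypothetical helper)."""
    base, _, nanos = stamp.strip().partition(".")
    # Keep the first six fractional digits; pad if fewer are present.
    micros = nanos[:6].ljust(6, "0")
    return datetime.strptime(f"{base}.{micros}", "%Y-%m-%d_%H-%M-%S.%f")


start = parse_run_timestamp("2020-02-05_20-29-33.194286049")
end = parse_run_timestamp("2020-02-05_20-31-33.219706012")

# Wall-clock length of the run in seconds, at microsecond precision.
elapsed_seconds = (end - start).total_seconds()
print(elapsed_seconds)
```

The truncation discards sub-microsecond detail, which is far below the sampling intervals of the logging tools.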
sys_mem_xxxxxxxxxxxxxxxxxxx.dat #
2020/02/05 20:29:32 MemTotal: 32881424 kB MemFree: 6064676 kB Buffers: 3803468 kB Cached: 15242080 kB Shmem: 94556 kB SReclaimable: 3958080 kB
2020/02/05 20:29:33 MemTotal: 32881424 kB MemFree: 6064676 kB Buffers: 3803468 kB Cached: 15242092 kB Shmem: 94560 kB SReclaimable: 3958080 kB
2020/02/05 20:29:34 MemTotal: 32881424 kB MemFree: 6064416 kB Buffers: 3803468 kB Cached: 15242100 kB Shmem: 94564 kB SReclaimable: 3958080 kB
2020/02/05 20:29:35 MemTotal: 32881424 kB MemFree: 6064164 kB Buffers: 3803468 kB Cached: 15242104 kB Shmem: 94564 kB SReclaimable: 3958080 kB
2020/02/05 20:29:36 MemTotal: 32881424 kB MemFree: 6064164 kB Buffers: 3803472 kB Cached: 15242104 kB Shmem: 94564 kB SReclaimable: 3958080 kB
2020/02/05 20:29:37 MemTotal: 32881424 kB MemFree: 6064148 kB Buffers: 3803472 kB Cached: 15242108 kB Shmem: 94564 kB SReclaimable: 3958080 kB
2020/02/05 20:29:38 MemTotal: 32881424 kB MemFree: 6064164 kB Buffers: 3803472 kB Cached: 15242112 kB Shmem: 94564 kB SReclaimable: 3958080 kB
2020/02/05 20:29:39 MemTotal: 32881424 kB MemFree: 6064164 kB Buffers: 3803472 kB Cached: 15242120 kB Shmem: 94568 kB SReclaimable: 3958080 kB
2020/02/05 20:29:40 MemTotal: 32881424 kB MemFree: 6064164 kB Buffers: 3803472 kB Cached: 15242128 kB Shmem: 94572 kB SReclaimable: 3958080 kB
2020/02/05 20:29:41 MemTotal: 32881424 kB MemFree: 6063912 kB Buffers: 3803476 kB Cached: 15242132 kB Shmem: 94572 kB SReclaimable: 3958080 kB
...
This file contains system memory usage data as produced by polling /proc/meminfo in ~ 1 second intervals. The first two columns compose the timestamp. All following columns are values that are used to calculate the amount of used system memory. The calculation is the same as the one done by the htop tool:
$\mathrm{MemUsed} = \mathrm{MemTotal} - \mathrm{MemFree} - \mathrm{Buffers} - \mathrm{Cached} - \mathrm{SReclaimable} + \mathrm{Shmem}$
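As a worked example of this formula, the sketch below applies it to the first sample record of this logfile. The function name used_memory_kb and the regular expression are assumptions made for illustration; the telemetry_plots module may extract the values differently.

```python
import re

# Matches 'Name: <number> kB' pairs; the timestamp contains no 'kB' and is skipped.
FIELD = re.compile(r"(\w+):\s+(\d+) kB")


def used_memory_kb(line):
    """Apply the used-memory formula to one sys_mem record (hypothetical helper)."""
    values = {name: int(kb) for name, kb in FIELD.findall(line)}
    return (
        values["MemTotal"]
        - values["MemFree"]
        - values["Buffers"]
        - values["Cached"]
        - values["SReclaimable"]
        + values["Shmem"]
    )


# First record of the sample data of this logfile.
line = (
    "2020/02/05 20:29:32 MemTotal: 32881424 kB MemFree: 6064676 kB "
    "Buffers: 3803468 kB Cached: 15242080 kB Shmem: 94556 kB SReclaimable: 3958080 kB"
)
used_kb = used_memory_kb(line)
print(used_kb)  # 3907676
```

For this record the result is 3907676 kB, so only a small part of the 32881424 kB total counts as used; most of the remainder is free memory, buffers and cache.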
sys_util_xxxxxxxxxxxxxxxxxxx.dat #
Linux 4.19.0-0.bpo.1-amd64 (server) 2020-02-05 _x86_64_ (12 CPU)
20:29:31 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
20:29:32 all 95.42 0.00 4.58 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 0 95.96 0.00 4.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 1 91.92 0.00 8.08 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 2 94.00 0.00 6.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 3 95.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 4 97.98 0.00 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 5 98.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 6 95.96 0.00 4.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 7 97.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 8 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 9 98.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 10 95.00 0.00 5.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 11 93.07 0.00 6.93 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
20:29:33 all 95.75 0.00 4.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 0 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 1 96.04 0.00 3.96 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 2 98.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 3 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 4 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 5 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
...
This file contains CPU utilization data as produced by the call to mpstat(1). The first line is a summary of the basic system properties. It is followed by blocks of utilization data recorded in ~ 1 second intervals and separated by a blank line. Each block consists of:
- A header line
- A summary line
- One line for each CPU core
Not all columns are used for the plot. The relevant columns are the timestamp (first column), the CPU ID (second column), the user CPU utilization in percent (third column) and the idle percentage (last column). For each recorded data block, those values are extracted and written to the file PARSED_sys_util_xxxxxxxxxxxxxxxxxxx.dat.
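The extraction described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the parser of the telemetry_plots module: parse_mpstat is a hypothetical name, timestamps are assumed to be in 24-hour format (so each data line has 12 whitespace-separated fields), and each block is assumed to start with the "all" summary line. The sample is shortened to two cores per block.

```python
def parse_mpstat(text):
    """Collapse each mpstat block into one 'time cpu %usr %idle ...' row."""
    rows = []
    current = []
    for line in text.splitlines():
        fields = line.split()
        # Data lines have 12 fields and 'all' or a core number in the second field;
        # this skips the system summary, the header lines and blank lines.
        is_data = len(fields) == 12 and (fields[1] == "all" or fields[1].isdigit())
        if not is_data:
            continue
        if fields[1] == "all" and current:
            rows.append(" ".join(current))  # the previous block is complete
            current = []
        current.extend([fields[0], fields[1], fields[2], fields[-1]])
    if current:
        rows.append(" ".join(current))
    return rows


sample = """\
Linux 4.19.0-0.bpo.1-amd64 (server) 2020-02-05 _x86_64_ (12 CPU)

20:29:31 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
20:29:32 all 95.42 0.00 4.58 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 0 95.96 0.00 4.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:32 1 91.92 0.00 8.08 0.00 0.00 0.00 0.00 0.00 0.00 0.00

20:29:32 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
20:29:33 all 95.75 0.00 4.25 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 0 96.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
20:29:33 1 96.04 0.00 3.96 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""
rows = parse_mpstat(sample)
for row in rows:
    print(row)
```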
PARSED_sys_util_xxxxxxxxxxxxxxxxxxx.dat #
# output format:
# HH:MM:SS CPU_all %usr %idle HH:MM:SS CPU_0 %usr %idle HH:MM:SS CPU_1 %usr %idle ...
20:29:32 all 95.42 0.00 20:29:32 0 95.96 0.00 20:29:32 1 91.92 0.00 20:29:32 2 94.00 0.00 20:29:32 3 95.00 0.00 20:29:32 4 97.98 0.00 20:29:32 5 98.00 0.00 20:29:32 6 95.96 0.00 20:29:32 7 97.00 0.00 20:29:32 8 96.00 0.00 20:29:32 9 98.00 0.00 20:29:32 10 95.00 0.00 20:29:32 11 93.07 0.00
20:29:33 all 95.75 0.00 20:29:33 0 96.00 0.00 20:29:33 1 96.04 0.00 20:29:33 2 98.00 0.00 20:29:33 3 96.00 0.00 20:29:33 4 96.00 0.00 20:29:33 5 96.00 0.00 20:29:33 6 97.03 0.00 20:29:33 7 93.94 0.00 20:29:33 8 95.00 0.00 20:29:33 9 96.00 0.00 20:29:33 10 93.00 0.00 20:29:33 11 95.00 0.00
20:29:34 all 95.42 0.00 20:29:34 0 96.00 0.00 20:29:34 1 93.94 0.00 20:29:34 2 97.00 0.00 20:29:34 3 94.00 0.00 20:29:34 4 98.00 0.00 20:29:34 5 98.00 0.00 20:29:34 6 93.00 0.00 20:29:34 7 94.00 0.00 20:29:34 8 96.00 0.00 20:29:34 9 97.00 0.00 20:29:34 10 93.00 0.00 20:29:34 11 96.00 0.00
20:29:35 all 96.33 0.00 20:29:35 0 96.00 0.00 20:29:35 1 94.06 0.00 20:29:35 2 96.00 0.00 20:29:35 3 98.00 0.00 20:29:35 4 98.02 0.00 20:29:35 5 99.00 0.00 20:29:35 6 93.00 0.00 20:29:35 7 96.04 0.00 20:29:35 8 96.00 0.00 20:29:35 9 97.00 0.00 20:29:35 10 94.00 0.00 20:29:35 11 95.96 0.00
20:29:36 all 94.84 0.00 20:29:36 0 98.00 0.00 20:29:36 1 93.00 0.00 20:29:36 2 95.00 0.00 20:29:36 3 96.00 0.00 20:29:36 4 96.97 0.00 20:29:36 5 97.00 0.00 20:29:36 6 94.00 0.00 20:29:36 7 93.00 0.00 20:29:36 8 95.05 0.00 20:29:36 9 92.00 0.00 20:29:36 10 94.00 0.00 20:29:36 11 96.00 0.00
20:29:37 all 94.75 0.00 20:29:37 0 97.00 0.00 20:29:37 1 91.92 0.00 20:29:37 2 93.00 0.00 20:29:37 3 96.00 0.00 20:29:37 4 94.00 0.00 20:29:37 5 97.00 0.00 20:29:37 6 94.95 0.00 20:29:37 7 95.00 0.00 20:29:37 8 95.96 0.00 20:29:37 9 96.00 0.00 20:29:37 10 93.00 0.00 20:29:37 11 94.06 0.00
20:29:38 all 95.83 0.00 20:29:38 0 97.03 0.00 20:29:38 1 95.05 0.00 20:29:38 2 93.00 0.00 20:29:38 3 96.00 0.00 20:29:38 4 98.00 0.00 20:29:38 5 97.00 0.00 20:29:38 6 95.05 0.00 20:29:38 7 95.00 0.00 20:29:38 8 96.00 0.00 20:29:38 9 95.00 0.00 20:29:38 10 95.00 0.00 20:29:38 11 95.96 0.00
20:29:39 all 95.83 0.00 20:29:39 0 96.97 0.00 20:29:39 1 95.96 0.00 20:29:39 2 98.00 0.00 20:29:39 3 96.00 0.00 20:29:39 4 97.00 0.00 20:29:39 5 98.00 0.00 20:29:39 6 97.00 0.00 20:29:39 7 94.95 0.00 20:29:39 8 96.00 0.00 20:29:39 9 96.00 0.00 20:29:39 10 92.00 0.00 20:29:39 11 93.07 0.00
20:29:40 all 95.33 0.00 20:29:40 0 97.00 0.00 20:29:40 1 96.00 0.00 20:29:40 2 95.00 0.00 20:29:40 3 93.00 0.00 20:29:40 4 99.00 0.00 20:29:40 5 96.00 0.00 20:29:40 6 96.00 0.00 20:29:40 7 94.06 0.00 20:29:40 8 94.06 0.00 20:29:40 9 97.00 0.00 20:29:40 10 94.00 0.00 20:29:40 11 91.00 0.00
20:29:41 all 96.09 0.00 20:29:41 0 97.00 0.00 20:29:41 1 94.00 0.00 20:29:41 2 95.00 0.00 20:29:41 3 96.00 0.00 20:29:41 4 97.00 0.00 20:29:41 5 98.00 0.00 20:29:41 6 94.95 0.00 20:29:41 7 96.00 0.00 20:29:41 8 98.99 0.00 20:29:41 9 96.00 0.00 20:29:41 10 97.00 0.00 20:29:41 11 96.00 0.00
...
This file contains CPU utilization data extracted from the sys_util_xxxxxxxxxxxxxxxxxxx.dat logfile. The first four columns are average data for all CPU cores. The first column is the timestamp, the second is the ID, the third is the user CPU utilization in percent and the fourth is the idle percentage. This scheme is repeated in the following columns for each CPU core. The values are recorded in ~ 1 second intervals.
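Because every row repeats the same group of four columns, a row can be read back by walking over it in steps of four. The sketch below shows one way to do this in Python; parse_parsed_row is a hypothetical helper, and the input is shortened to the summary group and two cores.

```python
def parse_parsed_row(row):
    """Map each CPU ID of one row to (timestamp, %usr, %idle) (hypothetical helper)."""
    fields = row.split()
    result = {}
    # Each group of four fields is: timestamp, CPU ID, %usr, %idle.
    for start in range(0, len(fields), 4):
        timestamp, cpu, usr, idle = fields[start:start + 4]
        result[cpu] = (timestamp, float(usr), float(idle))
    return result


row = "20:29:32 all 95.42 0.00 20:29:32 0 95.96 0.00 20:29:32 1 91.92 0.00"
cpus = parse_parsed_row(row)
print(cpus["all"], cpus["1"])
```

A row whose field count is not a multiple of four raises a ValueError during unpacking, which makes truncated rows easy to spot.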
Sample Plots #
DISCLAIMER:
The telemetry data collected by this suite does not represent perfectly accurate profiling data!
The following 6 plots are generated by the telemetry_plots module based on the logfiles recorded by the logging tools. The placeholder ‘x’ is replaced with the timestamp recorded at launch of the start_telemetry_run.sh script, which is the output of the command:
$ date +%Y-%m-%d_%H-%M-%S
PLOT_gpu_xxxxxxxxxxxxxxxxxxx.png #
This plot shows the GPU utilization in percent (%) and is based on the file gpu_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 166 ms intervals. The black dashed lines mark the start and end of the simulation run.
PLOT_gpu_long_xxxxxxxxxxxxxxxxxxx.png #
This plot shows the GPU and GPU memory utilization in percent (%), the total GPU memory used in GB and the total available GPU memory in GB. It is based on the file gpu_long_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 500 ms intervals. The black dashed lines mark the start and end of the simulation run.
PLOT_gpu_mem_xxxxxxxxxxxxxxxxxxx.png #
This plot shows the GPU memory utilization in percent (%) and is based on the file gpu_mem_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 166 ms intervals. The black dashed lines mark the start and end of the simulation run.
PLOT_gpu_mem_long_xxxxxxxxxxxxxxxxxxx.png #
This plot shows the GPU memory used in GB only for the process being monitored. The section ‘User Guide’ explains how that process is identified. The data is based on the file gpu_mem_long_xxxxxxxxxxxxxxxxxxx.dat and is recorded in ~ 500 ms intervals. The black dashed lines mark the start and end of the simulation run.
PLOT_sys_mem_xxxxxxxxxxxxxxxxxxx.png #
This plot shows the total system memory used in GB and the total available system memory in GB and is based on the file sys_mem_xxxxxxxxxxxxxxxxxxx.dat. The data is recorded in ~ 1 s intervals. The black dashed lines mark the start and end of the simulation run.
PLOT_sys_util_xxxxxxxxxxxxxxxxxxx.png #
The first subplot shows the CPU utilization in percent (%) averaged over all cores for measurements in system space and user space. The second subplot shows the CPU utilization in percent (%) for each core measured in system space. The third subplot shows the CPU utilization in percent (%) for each core measured in user space. The data is based on the file sys_util_xxxxxxxxxxxxxxxxxxx.dat and is recorded in ~ 1 s intervals. The black dashed lines mark the start and end of the simulation run.