In this post I will go through some useful command line tools to debug and tune your service. I will show a sample output and description of the most important parts of the output, feel free to read the manual pages for more details about the tools and supported flags to tune their output.
vmstat
This system tool is very useful to view and monitor statistics of your system processes, memory, paging, block IO, disks, and CPU. Its useful to check IO/CPU utilization and spot any bottlenecks.
Example:
$ vmstat 1 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 0 12583088 119480 1590740 0 0 406 72 470 1035 10 3 87 0 0 0 0 0 12582832 119480 1590740 0 0 0 0 451 1562 1 0 98 0 0 0 0 0 12579484 119480 1590732 0 0 0 0 408 1345 1 0 98 0 0
The above command is to show stats with 1 second delay.
Processes: Amount of runnable and un-interruptible processes
Memory: Free/cache memory and memory used as buffers and the amount of virtual memory used (swpd)
Swap: Swap in and out (from disk)
IO: Blocks in (READ) and out (WRITE) from disk
System: Number of interrupts per second and context switches
CPU: Time running user code vs system (kernel) code, idle time, and waiting for IO
iostat
This system tool is very useful to view and monitor IO statistics of your system.
Example:
$ iostat -xm 5 Linux 4.15.0-42-generic avg-cpu: %user %nice %system %iowait %steal %idle 5.27 0.02 1.59 0.15 0.00 92.97 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 5.29 20.82 6.42 0.43 0.11 40.69 0.02 1.18 0.25 4.20 0.14 0.39
The -x flag is to show extended statistics and -m is to view stats in megabytes per second. Of course some of the information above can also be viewed by the previous tool `vmstat`.
%user, %system, %iowait, %idle: The percentage of CPU spent running user code, system code, waiting for IO, and idle
rrqm/s, wrqm/s: The number of (read/write) requests merged per second that were queued to the device
r/s, w/s, rMB/s, wMB/s: Reads/Writes (after merge) per seconds, MBs per seconds (read/write)
await, r_await, w_await: Average time in ms for IO requests to the device to be served, average time in ms for read, average time in ms for write
nicstat
This tool isn't installed by default in linux systems but can be installed using package managers or downloaded from sourceforge. It basically does the same as previous tools but for network monitoring.
Example:
$ nicstat 5 Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat 20:10:52 lo 0.33 0.33 3.98 3.98 85.25 85.25 0.00 0.00 20:10:52 wlp2s0 6.18 1.85 12.23 9.03 517.6 210.0 0.00 0.00
Prints network statistics ever 5 seconds.
Int: Interface name
rKB/s, wKB/s: KBs Read(Received) Write(Sent)
rPk/s, wPk/s: Packets received/sent
rAvs, wAvs: Average packet size received/sent
%Util, Sat: Percentage utilization of the interface, Saturation (errors/second)
jcmd
This tool and the coming few ones come along with the JDK and can be used to send diagnostic commands to a running JVM.
Example:
$ jcmd 8880 VM.uptime # 8880 is the process ID, VM.uptime is to check uptime of the running java process 8880: 163.883 s $ jcmd 8880 VM.system_properties # View system properties of a single java process 8880: #Mon Dec 24 20:32:01 CET 2018 java.runtime.name=OpenJDK Runtime Environment sun.boot.library.path=/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64 java.vm.version=25.191-b12 java.vm.vendor=Oracle Corporation java.vendor.url=http\://java.oracle.com/ path.separator=\: java.vm.name=OpenJDK 64-Bit Server VM file.encoding.pkg=sun.io user.country=US ..... # and more ..... $ jcmd 8880 VM.version 8880: OpenJDK 64-Bit Server VM version 25.191-b12 JDK 8.0_191 $ jcmd 8880 VM.command_line # The command line of the process was started 8880: VM Arguments: java_command: HelloWorld -Dse7so=29 -client java_class_path (initial): . Launcher Type: SUN_STANDARD $ jcmd 8880 VM.flags # Enable flags - try with [-all] 8880: -XX:CICompilerCount=3 -XX:InitialHeapSize=262144000 -XX:MaxHeapSize=4175429632 -XX:MaxNewSize=1391460352 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=87031808 -XX:OldSize=175112192 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC $ jcmd 8880 Thread.print # Thread dump - see also jstack tool below # Very long output here that can be saved to a file for later analysis $ jcmd 8880 GC.run # 8880: Command executed successfully $ jcmd 8880 GC.class_histogram # Show heap histogram ([C is character array, [B is byte array, [I is integer array) 8880: num #instances #bytes class name ---------------------------------------------- 1: 1157 1614520 [C 2: 131 216072 [I 3: 474 53992 java.lang.Class 4: 1145 27480 java.lang.String 5: 531 26632 [Ljava.lang.Object; 6: 20 25264 [B 7: 378 9072 java.util.LinkedList$Node 8: 211 6752 java.util.HashMap$Node 9: 190 6080 java.util.LinkedList 10: 75 5400 java.lang.reflect.Field 11: 256 4096 java.lang.Integer 12: 93 3720 java.lang.ref.SoftReference 13: 111 3552 java.util.Hashtable$Entry 14: 7 2632 java.lang.Thread 15: 12 1984 [Ljava.util.HashMap$Node; 16: 51 1896 [Ljava.lang.String; 17: 38 1824 sun.util.locale.LocaleObjectCache$CacheEntry 18: 47 1504 java.util.concurrent.ConcurrentHashMap$Node 19: 23 1472 java.net.URL 20: 14 1120 [Ljava.util.WeakHashMap$Entry; 21: 14 1120 java.lang.reflect.Constructor 22: 2 1064 [Ljava.lang.invoke.MethodHandle; 23: 1 1040 [Ljava.lang.Integer; 24: 26 1040 java.io.ObjectStreamField 25: 6 992 [Ljava.util.Hashtable$Entry; 26: 15 840 java.lang.Class$ReflectionData 27: 21 840 java.lang.ref.Finalizer
jconsole
In fact this tool is a GUI tool but it worth mentioning as it provides very useful information to monitor and control java applications.
It views information about Memory, CPU, Threads, and Classes loaded.
jmap
Provides heap dumps and other information about JVM memory usage.
Example:
$ jmap -dump:live,format=b,file=/tmp/dump.hprof 9293 Dumping heap to /tmp/dump.hprof ... Heap dump file created
Generates a dump from a live process, in binary format and saves it to /tmp/dump.hprof - you can then analyze it with any heap dump analyzer.
jstack
Useful to generate thread dumps with full stack traces.
Example:
$ stack -l > /tmp/myapp.tdump
You can then analyze the thread dump using any thread dump analyzer.
visualvm
This is also a GUI tool but very useful to monitor your java application. Its man page describes it as all-in-one java troubleshooting tool and in fact it is very nice and can be very useful along with the above commands. you can also use to analyze heap and thread dumps generated earlier using jmap and jstack.
jstat
Is one of the best tools to view very useful stats about GC, JIT compiler, and so many other things to monitor.
Example:
$ jstat -gcutil -t 9293 100ms Timestamp S0 S1 E O M CCS YGC YGCT FGC FGCT GCT 5452.8 0.00 0.00 77.88 0.86 96.16 84.57 10 0.056 10 0.295 0.351 5453.0 0.00 0.00 77.88 0.86 96.16 84.57 10 0.056 10 0.295 0.351
The above shows GC information (the -t to show timestamps) every 100ms.
S0, S1: Survivor spaces 0 and 1 utilization as a percentage of the space's current capacity
E, O, M: Eden space, Old space, Metaspace utilization as percentage of the space's current capacity
CCS: Compressed class space utilization as a percentage
YGC, YGCT: Number of young generation GC events and young generation gc time
FGC, FGCT: Number of full GC events, and full GC time
GCT: Total garbage collection time
When the Eden space percentage goes up this triggers a young generation minor GC event which then will reduce its size and the S0, S1 will be swapped and of course Old generation increases a bit (some objects move from Y to O) - Full GC events are triggered when the Old generation percentage goes up and as a result its size is reduced and fragmented.
When the Eden space percentage goes up this triggers a young generation minor GC event which then will reduce its size and the S0, S1 will be swapped and of course Old generation increases a bit (some objects move from Y to O) - Full GC events are triggered when the Old generation percentage goes up and as a result its size is reduced and fragmented.
$ jstat -class -t 9293 1s Timestamp Loaded Bytes Unloaded Bytes Time 5845.2 3602 7665.3 13 15.8 1.33 5846.2 3602 7665.3 13 15.8 1.33 5847.2 3602 7665.3 13 15.8 1.33 5848.2 3602 7665.3 13 15.8 1.33
The same command but to view class loader statistics every 1 second.
Loaded: Number of classes loaded
Bytes: Number of bytes loaded
Unloaded: Number of classes unloaded
Bytes: Number of bytes unloaded
Time: Time spent performing class loading and unloading
$ jstat -printcompilation -t 9293 1s Timestamp Compiled Size Type Method 6067.7 3744 684 1 java/util/ComparableTimSort sort 6068.7 3744 684 1 java/util/ComparableTimSort sort 6069.7 3744 684 1 java/util/ComparableTimSort sort
The same command but to view Java HotSpot VM compiler method statistics.
Compiled: Number of compilation tasks performed on the method
Size: Size of code byte code (not binary code)
Type: Compilation type of the most recently compiled method
Method: Class name and method name identifying the most recently compiled method
$ jstat -compiler -t 9293 1s Timestamp Compiled Failed Invalid Time FailedType FailedMethod 6378.9 3753 3 0 12.54 1 org/netbeans/lib/profiler/server/ProfilerRuntimeObjLiveness traceObjAlloc 6380.0 3753 3 0 12.54 1 org/netbeans/lib/profiler/server/ProfilerRuntimeObjLiveness traceObjAlloc 6381.0 3753 3 0 12.54 1 org/netbeans/lib/profiler/server/ProfilerRuntimeObjLiveness traceObjAlloc 6382.0 3753 3 0 12.54 1 org/netbeans/lib/profiler/server/ProfilerRuntimeObjLiveness traceObjAlloc
Java HotSpot VM Just-in-Time compiler statistics.
Compiled: Number of compilation tasks performed
Failed: Number of failed compilation tasks
Invalid: Number of invalidated compilation tasks
Time: Time spent on compilation
FailedMethod: The most recent failed method to compile
The last two examples are a bit hard to understand if you don't really know the basics of JIT and how it works .. you can go ahead and read what happens behind the scene to realize the importance of both examples and how you can use them to diagnose possible performance issues in your services.
Those are the most useful tools from my point of view to monitor, diagnose, troubleshoot your services .. please comment with other tools you might see useful but not stated above.
Cheers