An embedded JVM profiler which you start and stop at will from your code, and it will dump the profiling info into the console:
What it's not: a standalone profiler that you attach to a running JVM. You start and stop it from your own code, typically by wrapping the profile method around the code you want to measure. Please see the introductory video for more details.
We're going to profile the example main method, which looks like this:

import java.net.URL
fun main() {
    Thread.sleep(500)
    println(URL("https://aedict-online.eu").readText())
    Thread.sleep(500)
}
Simply type this into your console:
git clone https://github.com/mvysny/suckless-ascii-profiler
cd suckless-ascii-profiler
./gradlew run
And you should see output similar to the screenshot above.
The profiler is available from both Maven Central and JCenter, so it's very easy to add to your project. Just add the following dependency to your Gradle script:
dependencies {
    compile "com.github.mvysny.sucklessprofiler:suckless-profiler:x.y"
}
Maven:
<dependencies>
    <dependency>
        <groupId>com.github.mvysny.sucklessprofiler</groupId>
        <artifactId>suckless-profiler</artifactId>
        <version>x.y</version>
    </dependency>
</dependencies>
Note: please see the latest tag name above for the newest version
Then, in your code, just call the following:
final SucklessProfiler p = new SucklessProfiler();
p.start();
callMethodYouNeedToProfile();
p.stop(); // this will dump the profiling info into the console
or in Kotlin:
SucklessProfiler().apply {
    coloredDump = true
}.profile {
    Thread.sleep(500)
    println(URL("https://aedict-online.eu").readText())
    Thread.sleep(500)
}  // this will call the stop() method and dump the profiling info into the console
or in your (Vaadin) Servlet:
@WebServlet(urlPatterns = arrayOf("/*"), name = "MyUIServlet", asyncSupported = true)
@VaadinServletConfiguration(ui = MyUI::class, productionMode = false)
class MyUIServlet : VaadinServlet() {
    override fun service(request: HttpServletRequest?, response: HttpServletResponse?) {
        SucklessProfiler().apply {
            coloredDump = true
            pruneStacktraceBottom = true
            dumpOnlyProfilingsLongerThan = Duration.ofSeconds(2)
        }.profile {
            super.service(request, response)
        }
    }
}
or in your JUnit tests:
public abstract class AbstractTest {
    private static SucklessProfiler profiler;

    @BeforeClass
    public static void startProfiler() {
        profiler = new SucklessProfiler();
        profiler.setColoredDump(true);
        profiler.setPruneStacktraceTop(true);
        profiler.start();
    }

    @AfterClass
    public static void dumpProfiler() {
        profiler.stop(true);  // dumps the profiling info into the console
    }
}
All follow-up tips use class name matching, so it's best to show some examples for that:

- java.* will match java.lang.String and java.io.BufferedReader, but not javax.net.SocketFactory
- java.lang.String will only match java.lang.String, and not java.lang.StringBuilder
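To make these rules concrete, here's a minimal Kotlin sketch of how such glob patterns can be interpreted (an illustration only, not the profiler's actual matching code):

fun matches(glob: String, className: String): Boolean {
    // escape the dots, then turn * into "match anything": "java.*" -> Regex("java\..*")
    val regex = Regex(glob.replace(".", "\\.").replace("*", ".*"))
    return regex.matches(className)
}

fun main() {
    check(matches("java.*", "java.lang.String"))
    check(matches("java.*", "java.io.BufferedReader"))
    check(!matches("java.*", "javax.net.SocketFactory"))
    check(matches("java.lang.String", "java.lang.String"))
    check(!matches("java.lang.String", "java.lang.StringBuilder"))
}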
If the call ends in one of the java.* classes, it probably ends up in a Java built-in class such as java.util.Map, which we do not want to dig into and profile its internals. We want to collapse the call node (all sub-calls will simply be counted towards the own time of this method).

The difference between soft and hard collapsing is that soft-collapsing java.* won't collapse, for example, reflection calls like java.lang.reflect.Method, since the stack trace continues with your app's logic, which is not java.*. Adding java.* to hard collapse would collapse all reflection calls which potentially continue executing your app's logic, which is generally not what you want to do. However, hard-collapsing terminal libraries like JDBC drivers is generally okay to do.
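The difference can be sketched as follows, reusing the matches() helper from the sketch above (again just an illustration of the behaviour described here, not the library's actual code):

// A stack trace is modeled as a list of class names, shallowest frame first.
fun collapse(trace: List<String>, glob: String, hard: Boolean): List<String> {
    val i = trace.indexOfFirst { matches(glob, it) }
    if (i < 0) return trace  // nothing matches, keep the trace as-is
    // hard collapse: always cut everything below the first matching frame;
    // soft collapse: cut only if the call never re-enters non-matching code
    val cut = hard || trace.drop(i).all { matches(glob, it) }
    return if (cut) trace.take(i + 1) else trace
}

fun main() {
    val reflective = listOf("my.app.Main", "java.lang.reflect.Method", "my.app.Service")
    println(collapse(reflective, "java.*", hard = false))  // kept whole: the trace re-enters my.app.*
    println(collapse(reflective, "java.*", hard = true))   // cut at java.lang.reflect.Method
}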
If you need a general overview of where your app spends most of its time, you don't need to include all those 1 ms methods in your stack trace. You can therefore set the profiler.collapseBelowPercent variable to e.g. 5 (it's 0 by default) - that will collapse all methods whose run time is below 5% of the total profiling time.
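For example (reusing callMethodYouNeedToProfile() from the earlier example):

SucklessProfiler().apply {
    collapseBelowPercent = 5  // hide subtrees taking less than 5% of the total time
}.profile {
    callMethodYouNeedToProfile()
}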
After the dump is printed on profiler.stop(true), the last line looks like the following:
Total: 100ms [DB: 25ms (25%), IO/Net: 10ms (10%)]
The profiler will sum up durations of certain call stacks, which gives you nice overall statistics. By default, the following groups are supported:

- DB: "com.zaxxer.hikari.*", "org.mariadb.jdbc.*", "org.h2.*", "com.mysql.jdbc.*", "org.postgresql.*"
- IO/Net: "java.io.*", "java.net.*", "javax.net.*"
You can define your own groups simply by adding entries to the profiler.groupTotals map as follows:
profiler.groupTotals["Password Hashing"] = listOf("my.project.BCryptPasswordEncoder")
A slice of time is always counted towards at most one group. When the groups overlap (they match the same class + method), the first one in the iteration order of the profiler.groupTotals keys wins (profiler.groupTotals is a LinkedHashMap, therefore the keys are ordered).
When the groups target different parts of the call stack (e.g. you add groups both for JDBC and for your DAOs), then the group for the shallowest stack frame wins (in this case your DAOs since they're calling JDBC and they're closer up the call stack).
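For example (my.project.dao.* is a hypothetical package used purely for illustration):

profiler.groupTotals["DB"] = listOf("org.mariadb.jdbc.*")
profiler.groupTotals["DAO"] = listOf("my.project.dao.*")  // hypothetical package
// A sample where my.project.dao.UserDao calls into org.mariadb.jdbc.* matches
// both groups, but the DAO frame is shallower in the call stack, so that time
// slice is counted towards "DAO" only.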
A thread stacktrace is captured every 20 ms (configurable) into a list of stacktrace samples. That list is afterwards converted to a tree, and the time spent is assigned to the tree nodes. The stacktrace sampling method of profiling is non-intrusive: for example, you don't have to start a Java agent, as opposed to the tracing method (where every method call is intercepted and call statistics are stored - very accurate, but very slow and often unnecessary). Stacktrace sampling will thus profile a real code execution.
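The core idea can be sketched in a few lines of Kotlin - a toy sampler assuming a fixed 20 ms interval; the real profiler additionally converts the samples into a call tree and handles collapsing and grouping:

import kotlin.concurrent.thread

fun main() {
    val target = Thread.currentThread()
    val samples = java.util.Collections.synchronizedList(mutableListOf<List<String>>())
    thread(isDaemon = true) {
        while (true) {
            // capture the target thread's stack trace; index 0 is the deepest frame
            samples += target.stackTrace.map { "${it.className}.${it.methodName}" }
            Thread.sleep(20)
        }
    }
    Thread.sleep(500)  // stands in for the profiled code
    // each sample accounts for ~20 ms; a frame present in N samples ran for ~N*20 ms
    println("captured ${samples.size} samples, ~${samples.size * 20} ms total")
}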
Stacktrace sampling is unfortunately also quite inaccurate. Sampling the stacktrace every 20 ms means that we have no idea what's going on during the 20 ms in between samplings. Thus, if a method is sampled only once, we cannot tell whether it took less than 1 ms to run or whether it took 39 ms. To remedy this, you can increase the sampling rate to every 10 ms (or even every 2 ms) to obtain more accurate results while still keeping the performance hit quite minor. However, this is often not required.
Usually what you hunt for is the place where your code spends 200 ms or more, and that is something we can detect with high accuracy. If a method is present in multiple samplings, there is a high probability that it was running for quite a long time: for example, a method that appears in ten consecutive 20 ms samplings has very probably been running for roughly 200 ms. Of course, there is also the possibility that the method was called repeatedly, ran briefly each time and happened to be captured in multiple stack samplings, but the probability of that is very low.
In order to obtain the stacktrace, the JVM must reach a safepoint; a thread may run for a little while longer until it hits a safepoint, and therefore the terminal method in a sample may not be accurate. See JVM safepoints for more info. High-CPU-usage functions in particular may often be missed, since it can take some time for them to reach a safepoint.
To conclude: the overhead is very low compared to tracing, yet there still is a very minor one. The sampler must periodically capture stack traces (which costs a bit of CPU), and the captured samples are kept in memory until the dump is printed. Thus, the profiler will use a bit of CPU and memory. However, as long as your app is not eating 100% CPU and is not taking nearly all of your JVM's memory, this overhead will not slow down your app. Your app will be slowed down by the JVM being paused while the stack trace is taken, but this pause is generally a fraction of a millisecond and therefore negligible.
Copyright 2019 Martin Vysny
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.