Where does the time in a nonlinear finite element analysis go? For large models, solving equations dominates the analysis time, while for small to moderate models, constitutive models (state determination) dominate. It is impossible to know the relative cost of state determination and solving equations prior to an analysis. Profiling tools can tell you after the fact where a program spent its time.
This post shows how to use
gprof, a Linux/Unix profiling tool, to find the distribution of computation time for an OpenSees Tcl analysis. Profiling a standalone executable is far easier than profiling a shared object file such as
gprof tool is part of the
$ sudo apt-get install binutils
Note: I used Ubuntu 20.04 in generating this post.
Re-compile OpenSees in DEBUG Mode
To get a profile out of OpenSees, you need to enable certain compiler options. So that you don’t have to futz around with command line arguments, the necessary compiler options are enabled by setting the
DEBUG_MODE variable in your
Makefile.def configuration file.
DEBUG_MODE = DEBUG #DEBUG_MODE = NO_DEBUG
Now you need to wipe OpenSees clean using
make wipe in the top level
OpenSees/$ make wipe
Then re-build OpenSees. This can take a few minutes, so use multiple processors, e.g., 4, if you got ’em.
OpenSees/$ make -j 4
Due to sprinkling of profiling bread crumbs throughout the code, the resulting executable will be larger than normal and analyses will run slower than usual. So, when you’re done profiling, be sure to change back to
NO_DEBUG and re-compile after wiping OpenSees clean again. But don’t do that yet–you still need to profile.
Run an OpenSees Analysis
With OpenSees built in
DEBUG mode, you can run the Tcl script for an analysis you want to profile.
$ OpenSees model.tcl
Although the analysis will take longer to run compared to
NO_DEBUG, you’ll see the usual output, etc. However, with debug mode turned on, a binary file
gmon.out will be created in your working directory. This is a binary file that
gprof can turn into useful information. To this end, you run
gprof with the absolute path to your OpenSees executable (mine is in
~/bin/) and re-direct the output to a text file.
$ gprof ~/bin/OpenSees gmon.out > modelProfile.txt
You can redirect to whatever filename you want.
View Profile Results
Now, you can view the text file with profile results to see where the time went. The “Flat profile” at the start of the file gives the most useful information. The flat profile ranks the function calls by % time and also shows cumulative and self time for each function.
For example, in the profile below, 59.18% of the time was spent in Steel02:setTrialStrain(), with 15.12% and 11.09% spent in FiberSection2d::setTrialSectionDeformation() and UniaxialMaterial::setTrial(), respectively.
% cumulative self self total time seconds seconds calls s/call s/call name 59.18 8.06 8.06 279724800 0.00 0.00 Steel02::setTrialStrain(double, double) 15.12 10.12 2.06 699312 0.00 0.00 FiberSection2d::setTrialSectionDeformation(Vector const&) 11.09 11.63 1.51 626304 0.00 0.00 FiberSection2d::getInitialTangent() 4.48 12.24 0.61 279724800 0.00 0.00 UniaxialMaterial::setTrial(double, double&, double&, double)
More detailed explanations of the flat profile are provided in the text file generated from
gmon.out. The explanations come after the flat profile, which can be pretty long. Search the text file for “Copyright” and you’ll get pretty close to the explanation of terms.
You can find additional information on each function in the subsequent “Call graph” section. The function with the index number, e.g.,
, has parent functions listed above and child functions listed below. The
children fields represent the time propagated to and from the indexed function.
index % time self children called name 2.06 8.78 699312/699312 ForceBeamColumn2d::update()   79.6 2.06 8.78 699312 FiberSection2d::setTrialSectionDeformation(Vector const&)  0.61 8.17 279724800/279724800 UniaxialMaterial::setTrial(double, double&, double&, double)  0.00 0.00 699312/3471523 Vector::operator=(Vector const&)  0.00 0.00 1398624/1406520 Vector::operator()(int) const 
More detailed explanations of the call graph data are provided in the profile text file, after the length call graph list. Search for “Copyright” again and you’ll land near the explanation of call graph terms.
Improve the Code
Now you know where to focus to make your code more efficient. Or you can use the profile information when you post to an issue on GitHub.
When you’re done profiling, don’t forget to recompile OpenSees with
NO_DEBUG so that you get back to a lean, mean executable.
I’m not aware of good tools to profile OpenSeesPy analyses or to profile OpenSees Tcl in a Windows environment. If you know of profiling tools for either of these cases, let everyone know in the Comments section below.