Inlining decision heuristics.
Copyright (C) 2003-2024 Free Software Foundation, Inc.
Contributed by Jan Hubicka
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>.
Inlining decision heuristics
The implementation of inliner is organized as follows:
inlining heuristics limits
can_inline_edge_p allow to check that particular inlining is allowed
by the limits specified by user (allowed function growth, growth and so
on).
Functions are inlined when it is obvious the result is profitable (such
as functions called once or when inlining reduce code size).
In addition to that we perform inlining of small functions and recursive
inlining.
inlining heuristics
The inliner itself is split into two passes:
pass_early_inlining
Simple local inlining pass inlining callees into current function.
This pass makes no use of whole unit analysis and thus it can do only
very simple decisions based on local properties.
The strength of the pass is that it is run in topological order
(reverse postorder) on the callgraph. Functions are converted into SSA
form just before this pass and optimized subsequently. As a result, the
callees of the function seen by the early inliner was already optimized
and results of early inlining adds a lot of optimization opportunities
for the local optimization.
The pass handle the obvious inlining decisions within the compilation
unit - inlining auto inline functions, inlining for size and
flattening.
main strength of the pass is the ability to eliminate abstraction
penalty in C++ code (via combination of inlining and early
optimization) and thus improve quality of analysis done by real IPA
optimizers.
Because of lack of whole unit knowledge, the pass cannot really make
good code size/performance tradeoffs. It however does very simple
speculative inlining allowing code size to grow by
EARLY_INLINING_INSNS when callee is leaf function. In this case the
optimizations performed later are very likely to eliminate the cost.
pass_ipa_inline
This is the real inliner able to handle inlining with whole program
knowledge. It performs following steps:
1) inlining of small functions. This is implemented by greedy
algorithm ordering all inlinable cgraph edges by their badness and
inlining them in this order as long as inline limits allows doing so.
This heuristics is not very good on inlining recursive calls. Recursive
calls can be inlined with results similar to loop unrolling. To do so,
special purpose recursive inliner is executed on function when
recursive edge is met as viable candidate.
2) Unreachable functions are removed from callgraph. Inlining leads
to devirtualization and other modification of callgraph so functions
may become unreachable during the process. Also functions declared as
extern inline or virtual functions are removed, since after inlining
we no longer need the offline bodies.
3) Functions called once and not exported from the unit are inlined.
This should almost always lead to reduction of code size by eliminating
the need for offline copy of the function.
Inliner uses greedy algorithm to inline calls in a priority order.
Badness is used as the key in a Fibonacci heap which roughly corresponds
to negation of benefit to cost ratios.
In case multiple calls has same priority we want to stabilize the outcomes
for which we use ids.