GCC Middle and Back End API Reference
|
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "target.h"
#include "rtl.h"
#include "tree.h"
#include "cfghooks.h"
#include "df.h"
#include "memmodel.h"
#include "tm_p.h"
#include "regs.h"
#include "insn-config.h"
#include "emit-rtl.h"
#include "output.h"
#include "tree-pass.h"
#include "cfgrtl.h"
#include "cfgbuild.h"
#include "bb-reorder.h"
#include "shrink-wrap.h"
#include "regcprop.h"
#include "rtl-iter.h"
#include "valtrack.h"
#include "function-abi.h"
#include "print-rtl.h"
Data Structures | |
struct | sw |
|
static |
Return whether we can duplicate basic block BB for shrink wrapping. We cannot if the block cannot be duplicated at all, or if any of its incoming edges are complex and come from a block that does not require a prologue (we cannot redirect such edges), or if the block is too big to copy. PRO is the basic block before which we would put the prologue, MAX_SIZE is the maximum size block we allow to be copied.
References can_duplicate_block_p(), CDI_DOMINATORS, dominated_by_p(), EDGE_COMPLEX, FOR_BB_INSNS, FOR_EACH_EDGE, get_attr_min_length(), NONDEBUG_INSN_P, and basic_block_def::preds.
Referenced by try_shrink_wrapping().
|
static |
Return whether basic block PRO can get the prologue. It cannot if it has incoming complex edges that need a prologue inserted (we make a new block for the prologue, so those edges would need to be redirected, which does not work). It also cannot if there exist registers live on entry to PRO that are clobbered by the prologue.
References CDI_DOMINATORS, df_get_live_in(), dominated_by_p(), EDGE_COMPLEX, FOR_EACH_EDGE, hard_reg_set_intersect_p(), basic_block_def::preds, and REG_SET_TO_HARD_REG_SET.
Referenced by try_shrink_wrapping().
|
static |
If we cannot handle placing some component's prologues or epilogues where we decided we should place them, unmark that component in COMPONENTS so that it is not wrapped separately.
References bitmap_and_compl(), bitmap_empty_p(), bitmap_subset_p(), cfun, dump_components(), dump_file, EXIT_BLOCK_PTR_FOR_FN, FOR_EACH_BB_FN, FOR_EACH_EDGE, gcc_assert, sw::has_components, SBITMAP_SIZE, single_pred_p(), basic_block_def::succs, SW(), and targetm.
Referenced by try_shrink_wrapping_separate().
|
static |
Separate shrink-wrapping Instead of putting all of the prologue and epilogue in one spot, we can put parts of it in places where those components are executed less frequently. The following code does this, for prologue and epilogue components that can be put in more than one location, and where those components can be executed more than once (the epilogue component will always be executed before the prologue component is executed a second time). What exactly is a component is target-dependent. The more usual components are simple saves/restores to/from the frame of callee-saved registers. This code treats components abstractly (as an sbitmap), letting the target handle all details. Prologue components are placed in such a way that for every component the prologue is executed as infrequently as possible. We do this by walking the dominator tree, comparing the cost of placing a prologue component before a block to the sum of costs determined for all subtrees of that block. From this placement, we then determine for each component all blocks where at least one of this block's dominators (including itself) will get a prologue inserted. That then is how the components are placed. We could place the epilogue components a bit smarter (we can save a bit of code size sometimes); this is a possible future improvement. Prologues and epilogues are preferably placed into a block, either at the beginning or end of it, if it is needed for all predecessor resp. successor edges; or placed on the edge otherwise. If the placement of any prologue/epilogue leads to a situation we cannot handle (for example, an abnormal edge would need to be split, or some targets want to use some specific registers that may not be available where we want to put them), separate shrink-wrapping for the components in that prologue/epilogue is aborted.
Print the sbitmap COMPONENTS to the DUMP_FILE if not empty, with the label LABEL.
References bitmap_bit_p, bitmap_empty_p(), dump_file, and simple_bitmap_def::n_bits.
Referenced by disqualify_problematic_components(), emit_common_heads_for_components(), emit_common_tails_for_components(), init_separate_shrink_wrap(), insert_prologue_epilogue_for_components(), spread_components(), and try_shrink_wrapping_separate().
|
static |
Place code for prologues and epilogues for COMPONENTS where we can put that code at the start of basic blocks.
References bb_note(), bitmap_and(), bitmap_and_compl(), bitmap_clear(), bitmap_copy(), bitmap_empty_p(), bitmap_ior(), cfun, dump_components(), dump_file, emit_insn_after(), end_sequence(), FOR_ALL_BB_FN, FOR_EACH_BB_FN, FOR_EACH_EDGE, get_insns(), sw::has_components, sw::head_components, basic_block_def::index, basic_block_def::preds, record_epilogue_seq(), record_prologue_seq(), SBITMAP_SIZE, start_sequence(), SW(), and targetm.
Referenced by try_shrink_wrapping_separate().
|
static |
Place code for prologues and epilogues for COMPONENTS where we can put that code at the end of basic blocks.
References BB_END, bitmap_and(), bitmap_and_compl(), bitmap_clear(), bitmap_copy(), bitmap_empty_p(), bitmap_ior(), cfun, control_flow_insn_p(), dump_components(), dump_file, EDGE_COUNT, emit_insn_after(), emit_insn_before(), end_sequence(), FOR_ALL_BB_FN, FOR_EACH_BB_FN, FOR_EACH_EDGE, get_insns(), sw::has_components, sw::head_components, basic_block_def::index, record_epilogue_seq(), record_prologue_seq(), SBITMAP_SIZE, simplejump_p(), start_sequence(), basic_block_def::succs, SW(), sw::tail_components, and targetm.
Referenced by try_shrink_wrapping_separate().
|
static |
Destroy the pass-specific data.
References basic_block_def::aux, cfun, FOR_ALL_BB_FN, free(), sw::has_components, sw::head_components, sw::needs_components, sbitmap_free(), SW(), and sw::tail_components.
Referenced by try_shrink_wrapping_separate().
|
static |
Do whatever needs to be done for exits that run without prologue. Sibcalls need nothing done. Normal exits get a simple_return inserted.
References BB_COPY_PARTITION, BB_END, cfun, basic_block_def::count, create_basic_block(), dump_file, EDGE_COUNT, emit_barrier_after_bb(), emit_jump_insn_after(), emit_note_after(), end(), EXIT_BLOCK_PTR_FOR_FN, INSN_UID(), JUMP_LABEL, make_single_succ_edge(), redirect_edge_succ(), simple_return_rtx, and targetm.
Referenced by try_shrink_wrapping().
|
static |
Create the pass-specific data structures for separately shrink-wrapping with components COMPONENTS.
References basic_block_def::aux, bitmap_clear(), bitmap_copy(), cfun, dump_components(), dump_file, EDGE_COUNT, FOR_ALL_BB_FN, sw::has_components, sw::head_components, basic_block_def::index, sw::needs_components, sbitmap_alloc(), SBITMAP_SIZE, basic_block_def::succs, SW(), sw::tail_components, and targetm.
Referenced by try_shrink_wrapping_separate().
|
static |
Place prologues and epilogues for COMPONENTS on edges, if we haven't already placed them inside blocks directly.
References basic_block_def::aux, BB_END, bitmap_and(), bitmap_and_compl(), bitmap_empty_p(), CALL_P, cfun, commit_edge_insertions(), dump_components(), dump_file, emit_insn_after(), emit_insn_before(), end_sequence(), EXIT_BLOCK_PTR_FOR_FN, FOR_EACH_BB_FN, FOR_EACH_EDGE, gcc_assert, get_insns(), sw::has_components, sw::head_components, insert_insn_on_edge(), record_epilogue_seq(), record_prologue_seq(), SBITMAP_SIZE, SIBLING_CALL_P, split_edge(), start_sequence(), basic_block_def::succs, SW(), sw::tail_components, and targetm.
Referenced by try_shrink_wrapping_separate().
|
static |
See whether there has a single live edge from BB, which dest uses [REGNO, END_REGNO). Return the live edge if its dest bb has one or two predecessors. Otherwise return NULL.
References cfun, df_get_live_in(), EDGE_COUNT, EXIT_BLOCK_PTR_FOR_FN, FOR_EACH_EDGE, i, NULL, REGNO_REG_SET_P, and basic_block_def::succs.
Referenced by move_insn_for_shrink_wrap().
|
static |
Try to move INSN from BB to a successor. Return true on success. USES and DEFS are the set of registers that are used and defined after INSN in BB. SPLIT_P indicates whether a live edge from BB is splitted or not.
References ALL, bb_note(), bitmap_and(), bitmap_empty_p(), can_throw_internal(), CLEAR_REGNO_REG_SET, CONSTANT_P, dead_debug_add(), dead_debug_insert_temp(), debug, DEBUG_BIND_INSN_P, DEBUG_TEMP_BEFORE_WITH_VALUE, defs, delete_insn(), df_get_live_in(), df_get_live_out(), df_grow_bb_info(), df_live, DF_LIVE_BB_INFO, DF_LR_BB_INFO, DF_REF_REG, DF_REF_REGNO, df_set_bb_dirty(), EDGE_COUNT, emit_insn_after(), END_REGNO(), FOR_BB_INSNS_REVERSE, FOR_EACH_INSN_DEF, FOR_EACH_INSN_USE, FOR_EACH_SUBRTX_VAR, frame_pointer_rtx, GET_CODE, GET_MODE, GET_RTX_CLASS, hard_frame_pointer_rtx, i, INSN_P, live_edge_for_reg(), mark_jump_label(), MAY_HAVE_DEBUG_BIND_INSNS, NULL, NULL_RTX, overlaps_hard_reg_set_p(), PATTERN(), basic_block_def::preds, refers_to_regno_p(), REG_P, REGNO, REGNO_REG_SET_P, RTX_BIN_ARITH, RTX_COMM_ARITH, RTX_COMM_COMPARE, RTX_COMPARE, RTX_CONST_OBJ, RTX_EXTRA, RTX_OBJ, RTX_TERNARY, RTX_UNARY, SET, SET_DEST, SET_REGNO_REG_SET, SET_SRC, split_edge(), stack_pointer_rtx, and basic_block_def::succs.
Referenced by prepare_shrink_wrap().
|
static |
Place the prologue for component WHICH, in the basic blocks dominated by HEAD. Do a DFS over the dominator tree, and set bit WHICH in the HAS_COMPONENTS of a block if either the block has that bit set in NEEDS_COMPONENTS, or it is cheaper to place the prologue here than in all dominator subtrees separately.
References bitmap_bit_p, bitmap_set_bit, CDI_DOMINATORS, CDI_POST_DOMINATORS, cfun, basic_block_def::count, dominated_by_p(), EDGE_FREQUENCY, first_dom_son(), FOR_EACH_EDGE, get_immediate_dominator(), sw::has_components, sw::needs_components, next_dom_son(), sw::own_cost, basic_block_def::preds, SW(), profile_count::to_frequency(), and sw::total_cost.
Referenced by try_shrink_wrapping_separate().
|
static |
Look for register copies in the first block of the function, and move them down into successor blocks if the register is used only on one path. This exposes more opportunities for shrink-wrapping. These kinds of sets often occur when incoming argument registers are moved to call-saved registers because their values are live across one or more calls during the function.
References BB_END, CLEAR_HARD_REG_SET, copyprop_hardreg_forward_bb_without_debug_insn(), dead_debug_local_finish(), dead_debug_local_init(), debug, defs, DF_REF_REG, END_REGNO(), FOR_BB_INSNS_REVERSE_SAFE, FOR_EACH_INSN_DEF, FOR_EACH_INSN_USE, HARD_REGISTER_P, i, JUMP_P, move_insn_for_shrink_wrap(), NONDEBUG_INSN_P, NULL, REG_P, REGNO, and SET_HARD_REG_BIT.
Referenced by try_shrink_wrapping().
bool requires_stack_frame_p | ( | rtx_insn * | insn, |
HARD_REG_SET | prologue_used, | ||
HARD_REG_SET | set_up_by_prologue ) |
Shrink-wrapping related optimizations. Copyright (C) 1987-2024 Free Software Foundation, Inc. This file is part of GCC. GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version. GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>.
This file handles shrink-wrapping related optimizations.
Return true if INSN requires the stack frame to be set up. PROLOGUE_USED contains the hard registers used in the function prologue. SET_UP_BY_PROLOGUE is the set of registers we expect the prologue to set up for the function.
References add_to_hard_reg_set(), CALL_P, can_throw_internal(), cfun, CLEAR_HARD_REG_SET, DF_REF_REG, df_regs_ever_live_p(), FAKE_CALL_P, FOR_EACH_INSN_DEF, FOR_EACH_INSN_USE, GET_MODE, hard_reg_set_intersect_p(), REG_P, REGNO, SIBLING_CALL_P, and TEST_HARD_REG_BIT.
Referenced by try_shrink_wrapping().
Set HAS_COMPONENTS in every block to the maximum it can be set to without setting it on any path from entry to exit where it was not already set somewhere (or, for blocks that have no path to the exit, consider only paths from the entry to the block itself). Return whether any changes were made to some HAS_COMPONENTS.
References bitmap_and(), bitmap_and_compl(), bitmap_clear(), bitmap_clear_bit(), bitmap_copy(), bitmap_empty_p(), bitmap_equal_p(), bitmap_ior(), bitmap_set_bit, cfun, dump_components(), dump_file, ENTRY_BLOCK_PTR_FOR_FN, EXIT_BLOCK_PTR_FOR_FN, FOR_ALL_BB_FN, FOR_EACH_BB_FN, FOR_EACH_EDGE, sw::has_components, sw::head_components, basic_block_def::index, n_basic_blocks_for_fn, basic_block_def::preds, SBITMAP_SIZE, single_succ(), basic_block_def::succs, SW(), sw::tail_components, and todo.
Referenced by try_shrink_wrapping_separate().
|
inlinestatic |
A helper function for accessing the pass-specific info.
References basic_block_def::aux, and gcc_assert.
Referenced by disqualify_problematic_components(), emit_common_heads_for_components(), emit_common_tails_for_components(), fini_separate_shrink_wrap(), init_separate_shrink_wrap(), insert_prologue_epilogue_for_components(), place_prologue_for_one_component(), spread_components(), and try_shrink_wrapping_separate().
Try to perform a kind of shrink-wrapping, making sure the prologue/epilogue is emitted only around those parts of the function that require it. There will be exactly one prologue, and it will be executed either zero or one time, on any path. Depending on where the prologue is placed, some of the basic blocks can be reached via both paths with and without a prologue. Such blocks will be duplicated here, and the edges changed to match. Paths that go to the exit without going through the prologue will use a simple_return instead of the epilogue. We maximize the number of those, making sure to only duplicate blocks that can be duplicated. If the prologue can then still be placed in multiple locations, we place it as early as possible. An example, where we duplicate blocks with control flow (legend: _B_egin, _R_eturn and _S_imple_return; edges without arrowhead should be taken to point down or to the right, to simplify the diagram; here, block 3 needs a prologue, the rest does not): B B | | 2 2 |\ |\ | 3 becomes | 3 |/ | \ 4 7 4 |\ |\ |\ | 5 | 8 | 5 |/ |/ |/ 6 9 6 | | | R S R (bb 4 is duplicated to 7, and so on; the prologue is inserted on the edge 2->3). Another example, where part of a loop is duplicated (again, bb 3 is the only block that needs a prologue): B 3<-- B ->3<-- | | | | | | | | v | becomes | | v | 2---4--- 2---5-- 4--- | | | R S R (bb 4 is duplicated to 5; the prologue is inserted on the edge 5->3). ENTRY_EDGE is the edge where the prologue will be placed, possibly changed by this function. PROLOGUE_SEQ is the prologue we will insert.
References add_to_hard_reg_set(), any_condjump_p(), profile_count::apply_scale(), basic_block_def::aux, BB_COPY_PARTITION, BB_END, bitmap_bit_p, bitmap_copy(), bitmap_set_bit, calculate_dominance_info(), can_dup_for_shrink_wrapping(), can_get_prologue(), CDI_DOMINATORS, CDI_POST_DOMINATORS, cfun, CLEAR_HARD_REG_BIT, CLEAR_HARD_REG_SET, basic_block_def::count, create_empty_bb(), crtl, dominated_by_p(), dump_file, duplicate_block(), EDGE_COUNT, ei_next(), ei_safe_edge(), ei_start, emit_barrier_after_bb(), ENTRY_BLOCK_PTR_FOR_FN, EXIT_BLOCK_PTR_FOR_FN, FOR_BB_INSNS, FOR_EACH_BB_FN, FOR_EACH_BB_REVERSE_FN, FOR_EACH_EDGE, force_nonfallthru(), frame_pointer_needed, free_dominance_info(), gcc_assert, get_immediate_dominator(), GET_MODE, get_uncond_jump_length(), handle_simple_exit(), HARD_FRAME_POINTER_REGNUM, basic_block_def::index, INSN_UID(), INVALID_REGNUM, JUMP_P, make_single_succ_edge(), n_basic_blocks_for_fn, nearest_common_dominator(), NEXT_INSN(), NONDEBUG_INSN_P, profile_count::nonzero_p(), NOTE_KIND, NOTE_P, note_stores(), note_uses(), NULL, PATTERN(), PIC_OFFSET_TABLE_REGNUM, pic_offset_table_rtx, basic_block_def::preds, prepare_shrink_wrap(), print_rtl_single(), record_hard_reg_sets(), record_hard_reg_uses(), redirect_edge_and_branch_force(), REGNO, requires_stack_frame_p(), hard_reg_set_container::set, SHRINK_WRAPPING_ENABLED, single_succ_edge(), split_edge(), basic_block_def::succs, targetm, and profile_count::zero().
Referenced by thread_prologue_and_epilogue_insns().
void try_shrink_wrapping_separate | ( | basic_block | first_bb | ) |
The main entry point to this subpass. FIRST_BB is where the prologue would be normally put.
References bitmap_and_compl(), bitmap_empty_p(), calculate_dominance_info(), CDI_DOMINATORS, CDI_POST_DOMINATORS, crtl, df_analyze(), df_live_add_problem(), df_live_set_all_dirty(), df_scan, DF_SCAN_EMPTY_ENTRY_EXIT, df_update_entry_exit_and_calls(), disqualify_problematic_components(), dump_components(), dump_file, emit_common_heads_for_components(), emit_common_tails_for_components(), EXECUTE_IF_SET_IN_BITMAP, fini_separate_shrink_wrap(), free_dominance_info(), sw::has_components, init_separate_shrink_wrap(), insert_prologue_epilogue_for_components(), place_prologue_for_one_component(), sbitmap_free(), spread_components(), SW(), targetm, and use_shrink_wrapping_separate().
Referenced by thread_prologue_and_epilogue_insns().
bool use_shrink_wrapping_separate | ( | void | ) |
References cfun, crtl, optimize_function_for_speed_p(), SHRINK_WRAPPING_ENABLED, and targetm.
Referenced by try_shrink_wrapping_separate().