GCC Middle and Back End API Reference
|
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "cfghooks.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "fold-const.h"
#include "cfganal.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "stor-layout.h"
#include "tree-cfg.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-ssa.h"
#include "cfgloop.h"
#include "tree-scalar-evolution.h"
#include "tree-vectorizer.h"
#include "tree-eh.h"
#include "gimple-fold.h"
#include "tree-affine.h"
#include "intl.h"
#include "rtl.h"
#include "memmodel.h"
#include "optabs.h"
#include "tree-ssa-loop-niter.h"
Data Structures | |
struct | ddr_hasher |
struct | rdg_vertex |
struct | rdg_edge |
struct | builtin_info |
struct | partition |
class | loop_distribution |
struct | pg_vdata |
struct | pg_edata |
struct | pg_edge_callback_data |
Macros | |
#define | MAX_DATAREFS_NUM ((unsigned) param_loop_max_datarefs_for_datadeps) |
#define | NUM_PARTITION_THRESHOLD (4) |
#define | DR_INDEX(dr) |
#define | RDGV_STMT(V) |
#define | RDGV_DATAREFS(V) |
#define | RDGV_HAS_MEM_WRITE(V) |
#define | RDGV_HAS_MEM_READS(V) |
#define | RDG_STMT(RDG, I) |
#define | RDG_DATAREFS(RDG, I) |
#define | RDG_MEM_WRITE_STMT(RDG, I) |
#define | RDG_MEM_READS_STMT(RDG, I) |
#define | RDGE_TYPE(E) |
Enumerations | |
enum | rdg_dep_type { flow_dd = 'f' , control_dd = 'c' } |
enum | partition_kind { PKIND_NORMAL , PKIND_PARTIAL_MEMSET , PKIND_MEMSET , PKIND_MEMCPY , PKIND_MEMMOVE } |
enum | partition_type { PTYPE_PARALLEL = 0 , PTYPE_SEQUENTIAL } |
enum | fuse_type { FUSE_NON_BUILTIN = 0 , FUSE_REDUCTION = 1 , FUSE_SHARE_REF = 2 , FUSE_SAME_SCC = 3 , FUSE_FINALIZE = 4 } |
Variables | |
static const char * | fuse_message [] |
#define DR_INDEX | ( | dr | ) |
Referenced by loop_distribution::build_rdg_partition_for_vertex().
#define MAX_DATAREFS_NUM ((unsigned) param_loop_max_datarefs_for_datadeps) |
Loop distribution. Copyright (C) 2006-2024 Free Software Foundation, Inc. Contributed by Georges-Andre Silber <Georges-Andre.Silber@ensmp.fr> and Sebastian Pop <sebastian.pop@amd.com>. This file is part of GCC. GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version. GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>.
This pass performs loop distribution: for example, the loop |DO I = 2, N | A(I) = B(I) + C | D(I) = A(I-1)*E |ENDDO is transformed to |DOALL I = 2, N | A(I) = B(I) + C |ENDDO | |DOALL I = 2, N | D(I) = A(I-1)*E |ENDDO Loop distribution is the dual of loop fusion. It separates statements of a loop (or loop nest) into multiple loops (or loop nests) with the same loop header. The major goal is to separate statements which may be vectorized from those that can't. This pass implements distribution in the following steps: 1) Seed partitions with specific type statements. For now we support two types seed statements: statement defining variable used outside of loop; statement storing to memory. 2) Build reduced dependence graph (RDG) for loop to be distributed. The vertices (RDG:V) model all statements in the loop and the edges (RDG:E) model flow and control dependencies between statements. 3) Apart from RDG, compute data dependencies between memory references. 4) Starting from seed statement, build up partition by adding depended statements according to RDG's dependence information. Partition is classified as parallel type if it can be executed paralleled; or as sequential type if it can't. Parallel type partition is further classified as different builtin kinds if it can be implemented as builtin function calls. 5) Build partition dependence graph (PG) based on data dependencies. The vertices (PG:V) model all partitions and the edges (PG:E) model all data dependencies between every partitions pair. In general, data dependence is either compilation time known or unknown. In C family languages, there exists quite amount compilation time unknown dependencies because of possible alias relation of data references. We categorize PG's edge to two types: "true" edge that represents compilation time known data dependencies; "alias" edge for all other data dependencies. 6) Traverse subgraph of PG as if all "alias" edges don't exist. Merge partitions in each strong connected component (SCC) correspondingly. Build new PG for merged partitions. 7) Traverse PG again and this time with both "true" and "alias" edges included. We try to break SCCs by removing some edges. Because SCCs by "true" edges are all fused in step 6), we can break SCCs by removing some "alias" edges. It's NP-hard to choose optimal edge set, fortunately simple approximation is good enough for us given the small problem scale. 8) Collect all data dependencies of the removed "alias" edges. Create runtime alias checks for collected data dependencies. 9) Version loop under the condition of runtime alias checks. Given loop distribution generally introduces additional overhead, it is only useful if vectorization is achieved in distributed loop. We version loop with internal function call IFN_LOOP_DIST_ALIAS. If no distributed loop can be vectorized, we simply remove distributed loops and recover to the original one. TODO: 1) We only distribute innermost two-level loop nest now. We should extend it for arbitrary loop nests in the future. 2) We only fuse partitions in SCC now. A better fusion algorithm is desired to minimize loop overhead, maximize parallelism and maximize data reuse.
Referenced by loop_distribution::distribute_loop().
#define NUM_PARTITION_THRESHOLD (4) |
Threshold controlling number of distributed partitions. Given it may be unnecessary if a memory stream cost model is invented in the future, we define it as a temporary macro, rather than a parameter.
Referenced by loop_distribution::finalize_partitions().
#define RDG_DATAREFS | ( | RDG, | |
I ) |
Referenced by loop_distribution::build_rdg_partition_for_vertex(), and find_single_drs().
#define RDG_MEM_READS_STMT | ( | RDG, | |
I ) |
Referenced by dot_rdg_1(), dump_rdg_vertex(), number_of_rw_in_partition(), number_of_rw_in_rdg(), and loop_distribution::share_memory_accesses().
#define RDG_MEM_WRITE_STMT | ( | RDG, | |
I ) |
Referenced by dot_rdg_1(), dump_rdg_vertex(), number_of_rw_in_partition(), number_of_rw_in_rdg(), and loop_distribution::share_memory_accesses().
#define RDG_STMT | ( | RDG, | |
I ) |
Referenced by loop_distribution::classify_partition(), create_rdg_cd_edges(), create_rdg_flow_edges(), find_single_drs(), and rdg_vertex_for_stmt().
#define RDGE_TYPE | ( | E | ) |
Referenced by create_edge_for_control_dependence(), create_rdg_edges_for_scalar(), and dot_rdg_1().
#define RDGV_DATAREFS | ( | V | ) |
Referenced by loop_distribution::create_rdg_vertices(), and free_rdg().
#define RDGV_HAS_MEM_READS | ( | V | ) |
Referenced by loop_distribution::create_rdg_vertices().
#define RDGV_HAS_MEM_WRITE | ( | V | ) |
Referenced by loop_distribution::create_rdg_vertices().
#define RDGV_STMT | ( | V | ) |
Referenced by loop_distribution::create_rdg_vertices(), dot_rdg_1(), and dump_rdg_vertex().
enum fuse_type |
enum partition_kind |
enum partition_type |
enum rdg_dep_type |
|
static |
Add edge <I, J> to partition dependence graph PG. Attach vector of data dependence relations to the EDGE if DDRS isn't NULL.
References add_edge(), graph_edge::data, gcc_assert, i, NULL, and vNULL.
Referenced by loop_distribution::build_partition_graph().
|
static |
Allocate and return builtin struct. Record information like DST_DR, SRC_DR, DST_BASE, SRC_BASE and SIZE in the allocated struct.
References builtin_info::dst_base, builtin_info::dst_dr, builtin_info::size, builtin_info::src_base, and builtin_info::src_dr.
Referenced by loop_distribution::classify_builtin_ldst(), classify_builtin_st(), and matching_alloc_calls_p().
|
static |
If X has a smaller topological sort number than Y, returns -1; if greater, returns 1.
References gcc_assert, loop_distribution::get_bb_top_order_index(), loop_distribution::get_bb_top_order_index_size(), basic_block_def::index, and y.
Referenced by loop_distribution::stmts_from_loop().
|
static |
Given data reference DR in loop nest LOOP, classify if it forms builtin memset call.
References alloc_builtin(), partition::builtin, compute_access_range(), const_with_all_bytes_same(), cst_and_fits_in_hwi(), DR_STMT, builtin_info::dst_base_base, builtin_info::dst_base_offset, flow_bb_inside_loop_p(), gimple_assign_rhs1(), gimple_bb(), int_cst_value(), INTEGRAL_TYPE_P, partition::kind, NULL, NULL_TREE, PKIND_MEMSET, PKIND_PARTIAL_MEMSET, builtin_info::size, split_constant_offset(), SSA_NAME_DEF_STMT, SSA_NAME_IS_DEFAULT_DEF, TREE_CODE, TREE_TYPE, TYPE_MODE, and unsigned_char_type_node.
Referenced by loop_distribution::classify_partition().
|
static |
Given data reference DR in LOOP_NEST, this function checks the enclosing loops from inner to outer to see if loop's step equals to access size at each level of loop. Return 2 if we can prove this at all level loops; record access base and size in BASE and SIZE; save loop's step at each level of loop in STEPS if it is not null. For example: int arr[100][100][100]; for (i = 0; i < 100; i++) ;steps[2] = 40000 for (j = 100; j > 0; j--) ;steps[1] = -400 for (k = 0; k < 100; k++) ;steps[0] = 4 arr[i][j - 1][k] = 0; ;base = &arr, size = 4000000 Return 1 if we can prove the equality at the innermost loop, but not all level loops. In this case, no information is recorded. Return 0 if no equality can be proven at any level loops.
References analyze_scalar_evolution(), build_fold_addr_expr, CDI_DOMINATORS, CHREC_LEFT, CHREC_RIGHT, dominated_by_p(), DR_REF, DR_STMT, EV_DIR_DECREASES, EV_DIR_UNKNOWN, fold_build1_loc(), fold_build2_loc(), fold_build_pointer_plus_loc(), fold_convert_loc(), gimple_bb(), gimple_location(), basic_block_def::loop_father, loop_outer(), NULL, number_of_latch_executions(), operand_equal_p(), scev_direction(), single_exit(), size_binop_loc(), size_one_node, sizetype, TREE_CODE, tree_contains_chrecs(), TREE_TYPE, and TYPE_SIZE_UNIT.
Referenced by loop_distribution::classify_builtin_ldst(), and classify_builtin_st().
|
static |
Compute alias check pairs and store them in COMP_ALIAS_PAIRS for LOOP's data dependence relations ALIAS_DDRS.
References chrec_dont_know, data_ref_segment_size(), DDR_A, DDR_B, DR_REF, dump_file, dump_flags, fold_convert, gcc_assert, i, latch_dominated_by_data_ref(), NULL_TREE, number_of_latch_executions(), prune_runtime_alias_test_list(), dr_with_seg_len_pair_t::REORDERED, size_binop, size_one_node, sizetype, TDF_DETAILS, tree_fits_uhwi_p(), tree_to_uhwi(), TREE_TYPE, TYPE_ALIGN_UNIT, and TYPE_SIZE_UNIT.
Referenced by version_loop_by_alias_check().
|
static |
If VAL memory representation contains the same value in all bytes, return that value, otherwise return -1. E.g. for 0x24242424 return 0x24, for IEEE double 747708026454360457216.0 return 0x44, etc.
References CHAR_BIT, const_with_all_bytes_same(), CONSTRUCTOR_NELTS, count, i, integer_zerop(), native_encode_expr(), real_isneg(), real_zerop(), TREE_CLOBBER_P, TREE_CODE, TREE_IMAGPART, TREE_REAL_CST_PTR, TREE_REALPART, VECTOR_CST_ENCODED_ELT, and vector_cst_encoded_nelts().
Referenced by classify_builtin_st(), const_with_all_bytes_same(), fuse_memset_builtins(), and generate_memset_builtin().
Return a copy of LOOP placed before LOOP.
References delete_update_ssa(), free_original_copy_tables(), gcc_assert, get_current_def(), gimple_phi_result(), gsi_end_p(), gsi_next(), gsi_start_phis(), initialize_original_copy_tables(), loop_preheader_edge(), NULL, PHI_ARG_DEF_PTR_FROM_EDGE, SET_USE, si, single_exit(), slpeel_tree_duplicate_loop_to_edge_cfg(), TREE_CODE, USE_FROM_PTR, and virtual_operand_p().
Referenced by generate_loops_for_partition().
|
static |
Creates an empty basic block after LOOP.
References single_exit(), and split_edge().
Referenced by generate_loops_for_partition().
|
static |
Creates an edge for the control dependences of BB to the vertex V.
References add_edge(), cd, control_dd, graph_edge::data, EXECUTE_IF_SET_IN_BITMAP, control_dependences::get_edge_src(), control_dependences::get_edges_dependent_on(), gsi_last_bb(), basic_block_def::index, is_ctrl_stmt(), rdg_vertex_for_stmt(), and RDGE_TYPE.
Referenced by create_rdg_cd_edges().
|
static |
Creates the edges of the reduced dependence graph RDG.
References cd, create_edge_for_control_dependence(), flow_bb_inside_loop_p(), FOR_EACH_EDGE, gimple_bb(), i, graph::n_vertices, and RDG_STMT.
Referenced by loop_distribution::build_rdg().
Creates dependence edges in RDG for all the uses of DEF. IDEF is the index of DEF in RDG.
References add_edge(), graph_edge::data, flow_dd, FOR_EACH_IMM_USE_FAST, rdg_vertex_for_stmt(), RDGE_TYPE, and USE_STMT.
Referenced by create_rdg_flow_edges().
|
static |
Creates the edges of the reduced dependence graph RDG.
References create_rdg_edges_for_scalar(), DEF_FROM_PTR, FOR_EACH_PHI_OR_STMT_DEF, i, graph::n_vertices, RDG_STMT, and SSA_OP_DEF.
Referenced by loop_distribution::build_rdg().
|
static |
Compute and return an expression whose value is the segment length which will be accessed by DR in NITERS iterations.
References DR_STEP, fold_convert, size_binop, size_one_node, and sizetype.
Referenced by compute_alias_check_pairs().
DEBUG_FUNCTION void debug_rdg | ( | struct graph * | rdg | ) |
Call dump_rdg on stderr.
References dump_rdg().
|
extern |
Debug PARTITIONS.
References dump_rdg_partitions().
DEBUG_FUNCTION void debug_rdg_vertex | ( | struct graph * | rdg, |
int | i ) |
Call dump_rdg_vertex on stderr.
References dump_rdg_vertex(), and i.
|
static |
Remove and destroy the loop LOOP.
References cancel_loop_tree(), CDI_DOMINATORS, delete_basic_block(), dominated_by_p(), free(), get_loop_body_in_dom_order(), gimple_debug_bind_get_value(), gimple_debug_bind_p(), gimple_debug_bind_reset_value(), gimple_phi_result(), gimple_vdef(), gsi_after_labels(), gsi_end_p(), gsi_move_before(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_min_invariant(), loop_preheader_edge(), mark_virtual_operand_for_renaming(), mark_virtual_phi_result_for_renaming(), loop::num_nodes, recompute_dominator(), redirect_edge_pred(), rescan_loop_exit(), set_immediate_dominator(), single_exit(), single_pred_p(), TREE_CODE, update_stmt(), and virtual_operand_p().
Referenced by loop_distribution::execute().
If LOOP has a single non-volatile reduction statement, then return a pointer to it. Otherwise return NULL.
References determine_reduction_stmt_1(), and get_loop_body().
Referenced by loop_distribution::transform_reduction_loop().
|
static |
References gimple_clobber_p(), gimple_has_volatile_ops(), gimple_phi_result(), gsi_end_p(), gsi_next(), gsi_next_nondebug(), gsi_start_nondebug_bb(), gsi_start_phis(), gsi_stmt(), i, NULL, loop::num_nodes, stmt_has_scalar_dependences_outside_loop(), and virtual_operand_p().
Referenced by determine_reduction_stmt().
DEBUG_FUNCTION void dot_rdg | ( | struct graph * | rdg | ) |
Display the Reduced Dependence Graph using dotty.
References dot_rdg_1().
|
static |
References control_dd, graph_edge::dest, flow_dd, gcc_unreachable, i, graph::n_vertices, pp_flush(), pp_gimple_stmt_1(), pp_needs_newline(), RDG_MEM_READS_STMT, RDG_MEM_WRITE_STMT, RDGE_TYPE, RDGV_STMT, pretty_printer::set_output_stream(), graph_edge::succ_next, TDF_SLIM, and graph::vertices.
Referenced by dot_rdg().
|
static |
Dump the reduced dependence graph RDG to FILE.
References dump_rdg_vertex(), i, and graph::n_vertices.
Referenced by debug_rdg(), and loop_distribution::distribute_loop().
Dump to FILE the PARTITIONS.
References debug_bitmap_file(), FOR_EACH_VEC_ELT, i, and partition::stmts.
Referenced by debug_rdg_partitions(), and loop_distribution::distribute_loop().
|
static |
Dump vertex I in RDG to FILE.
References graph_edge::dest, i, graph_edge::pred_next, print_gimple_stmt(), RDG_MEM_READS_STMT, RDG_MEM_WRITE_STMT, RDGV_STMT, graph_edge::src, graph_edge::succ_next, TDF_MEMSYMS, TDF_VOPS, and graph::vertices.
Referenced by debug_rdg_vertex(), and dump_rdg().
|
static |
Given LOOP, this function records seed statements for distribution in WORK_LIST. Return false if there is nothing for distribution.
References can_copy_bbs_p(), dump_file, dump_flags, free(), get_loop_body_in_dom_order(), gimple_clobber_p(), gimple_has_side_effects(), gimple_phi_result(), gimple_vdef(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, loop::num, loop::num_nodes, stmt_has_scalar_dependences_outside_loop(), TDF_DETAILS, and virtual_operand_p().
Referenced by loop_distribution::execute().
|
static |
Given PARTITION of LOOP and RDG, record single load/store data references for builtin partition in SRC_DR/DST_DR, return false if there is no such data references.
References ADDR_SPACE_GENERIC_P, CDI_DOMINATORS, DECL_BIT_FIELD, dominated_by_p(), DR_IS_READ, DR_REF, DR_STMT, EXECUTE_IF_SET_IN_BITMAP, gimple_assign_single_p(), gimple_bb(), gimple_vuse(), loop::header, i, loop::latch, basic_block_def::loop_father, NULL, RDG_DATAREFS, RDG_STMT, single_exit(), TREE_CODE, TREE_OPERAND, TREE_TYPE, and TYPE_ADDR_SPACE.
Referenced by loop_distribution::classify_partition(), and loop_distribution::transform_reduction_loop().
|
static |
Callback function freeing data attached to edge E of graph.
References graph_edge::data, and NULL.
Referenced by loop_distribution::break_alias_scc_partitions(), and loop_distribution::merge_dep_scc_partitions().
|
static |
Free data attached to vertice of partition dependence graph PG.
References vertex::data, i, graph::n_vertices, and graph::vertices.
Referenced by loop_distribution::break_alias_scc_partitions(), and loop_distribution::merge_dep_scc_partitions().
Free the reduced dependence graph RDG.
References graph_edge::data, free(), free_graph(), get_loop_body(), gimple_set_uid(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, graph::n_vertices, loop::num_nodes, RDGV_DATAREFS, graph_edge::succ_next, and graph::vertices.
Referenced by loop_distribution::build_rdg(), loop_distribution::distribute_loop(), and loop_distribution::transform_reduction_loop().
Fuse adjacent memset builtin PARTITIONS if possible. This is a special case optimization transforming below code: __builtin_memset (&obj, 0, 100); _1 = &obj + 100; __builtin_memset (_1, 0, 200); _2 = &obj + 300; __builtin_memset (_2, 0, 100); into: __builtin_memset (&obj, 0, 400); Note we don't have dependence information between different partitions at this point, as a result, we can't handle nonadjacent memset builtin partitions since dependence might be broken.
References wi::add(), partition::builtin, const_with_all_bytes_same(), DR_STMT, builtin_info::dst_base_base, builtin_info::dst_base_offset, builtin_info::dst_dr, fold_build2, gcc_stablesort(), gimple_assign_rhs1(), i, partition::kind, wi::ne_p(), offset_cmp(), operand_equal_p(), partition_free(), PKIND_MEMSET, builtin_info::size, sizetype, wi::to_wide(), and TREE_CODE.
Referenced by loop_distribution::finalize_partitions().
|
static |
Generates code for PARTITION. Return whether LOOP needs to be destroyed.
References gcc_assert, gcc_unreachable, generate_loops_for_partition(), generate_memcpy_builtin(), generate_memset_builtin(), partition::kind, partition_reduction_p(), PKIND_MEMCPY, PKIND_MEMMOVE, PKIND_MEMSET, PKIND_NORMAL, and PKIND_PARTIAL_MEMSET.
Referenced by loop_distribution::distribute_loop().
|
static |
Generate code for PARTITION from the code in LOOP. The loop is copied when COPY_P is true. All the statements not flagged in the PARTITION bitmap are removed from the loop or from its copy. The statements are indexed in sequence inside a basic block, and the basic blocks of a loop are taken in dom order.
References as_a(), bitmap_bit_p, CASE_LOW, copy_loop_before(), CP_SIMPLE_PREHEADERS, create_bb_after_loop(), create_preheader(), dyn_cast(), free(), gcc_assert, get_loop_body_in_dom_order(), gimple_cond_make_false(), gimple_cond_make_true(), gimple_phi_result(), gimple_switch_label(), gimple_switch_set_index(), gimple_uid(), gsi_end_p(), gsi_next(), gsi_remove(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_debug(), basic_block_def::loop_father, MAY_HAVE_DEBUG_BIND_STMTS, NULL, loop::num, loop::num_nodes, loop::orig_loop_num, release_defs(), remove_phi_node(), reset_debug_uses(), single_exit(), partition::stmts, unlink_stmt_vdef(), update_stmt(), and virtual_operand_p().
Referenced by generate_code_for_partition().
Generate a call to memcpy for PARTITION in LOOP.
References aff_comb_cannot_overlap_p(), aff_combination_add(), aff_combination_scale(), build_fold_addr_expr, partition::builtin, builtin_decl_implicit(), builtin_info::dst_base, dump_file, dump_flags, fold_stmt(), force_gimple_operand_gsi(), gimple_build_call(), gimple_set_location(), GSI_CONTINUE_LINKING, gsi_insert_after(), gsi_last_bb(), partition::kind, partition::loc, loop_preheader_edge(), NULL_TREE, PKIND_MEMCPY, poly_int_tree_p(), ptr_derefs_may_alias_p(), ptr_type_node, rewrite_to_non_trapping_overflow(), builtin_info::size, builtin_info::src_base, TDF_DETAILS, wi::to_poly_widest(), and tree_to_aff_combination().
Referenced by generate_code_for_partition().
Generate a call to memset for PARTITION in LOOP.
References build_fold_addr_expr, build_int_cst(), partition::builtin, builtin_decl_implicit(), const_with_all_bytes_same(), DR_STMT, builtin_info::dst_base, builtin_info::dst_dr, dump_file, dump_flags, fold_convert, fold_stmt(), force_gimple_operand_gsi(), gimple_assign_rhs1(), gimple_build_assign(), gimple_build_call(), gimple_set_location(), GSI_CONTINUE_LINKING, gsi_insert_after(), gsi_last_bb(), integer_type_node, partition::loc, loop_preheader_edge(), make_ssa_name(), NULL_TREE, rewrite_to_non_trapping_overflow(), builtin_info::size, TDF_DETAILS, TREE_CODE, TREE_TYPE, and useless_type_conversion_p().
Referenced by generate_code_for_partition().
|
static |
Generate a call to rawmemchr and place it before LOOP. REDUCTION_VAR is replaced with a fresh SSA name representing the result of the call.
References copy_ssa_name(), DR_REF, force_gimple_operand(), g, generate_reduction_builtin_1(), gimple_build_assign(), gimple_build_call_internal(), gimple_call_set_lhs(), gimple_seq_add_stmt(), gimple_set_location(), partition::loc, NULL, NULL_TREE, TREE_TYPE, and TYPE_MODE.
Referenced by loop_distribution::transform_reduction_loop().
|
static |
A helper function for generate_{rawmemchr,strlen}_builtin functions in order to place new statements SEQ before LOOP and replace the old reduction variable with the new one.
References dump_file, dump_flags, FOR_EACH_IMM_USE_ON_STMT, FOR_EACH_IMM_USE_STMT, gcc_assert, GET_MODE_NAME, GSI_CONTINUE_LINKING, gsi_insert_seq_after(), gsi_last_bb(), loop_preheader_edge(), SET_USE, TDF_DETAILS, and update_stmt().
Referenced by generate_rawmemchr_builtin(), and generate_strlen_builtin_1().
|
static |
Generate a call to strlen and place it before LOOP. REDUCTION_VAR is replaced with a fresh SSA name representing the result of the call.
References build_fold_addr_expr, builtin_decl_implicit(), force_gimple_operand(), generate_strlen_builtin_1(), gimple_build_call(), gimple_call_set_lhs(), gimple_seq_add_stmt(), gimple_set_location(), partition::loc, make_ssa_name(), NULL, NULL_TREE, and size_type_node.
Referenced by loop_distribution::transform_reduction_loop().
|
static |
Helper function for generate_strlen_builtin(,_using_rawmemchr)
References g, generate_reduction_builtin_1(), gimple_build_assign(), gimple_convert(), gimple_seq_add_stmt(), integer_zerop(), make_ssa_name(), and TREE_TYPE.
Referenced by generate_strlen_builtin(), and generate_strlen_builtin_using_rawmemchr().
|
static |
Generate code in order to mimic the behaviour of strlen but this time over an array of elements with mode different than QI. REDUCTION_VAR is replaced with a fresh SSA name representing the result, i.e., the length.
References build_zero_cst(), count, end(), force_gimple_operand(), generate_strlen_builtin_1(), gimple_build_assign(), gimple_build_call_internal(), gimple_call_set_lhs(), gimple_convert(), gimple_seq_add_stmt(), gimple_set_location(), partition::loc, make_ssa_name(), NULL, NULL_TREE, ptrdiff_type_node, TREE_TYPE, TYPE_MODE, and TYPE_SIZE_UNIT.
Referenced by loop_distribution::transform_reduction_loop().
|
static |
Initialize vertice's data for partition dependence graph PG with PARTITIONS.
References vertex::data, i, pg_vdata::partition, and graph::vertices.
Referenced by loop_distribution::build_partition_graph().
|
inlinestatic |
Return true if LOOP's latch is dominated by statement for data reference DR.
References CDI_DOMINATORS, dominated_by_p(), DR_STMT, gimple_bb(), and single_exit().
Referenced by compute_alias_check_pairs().
gimple_opt_pass * make_pass_loop_distribution | ( | gcc::context * | ctxt | ) |
Returns the number of read and write operations in a PARTITION of the RDG.
References EXECUTE_IF_SET_IN_BITMAP, i, RDG_MEM_READS_STMT, RDG_MEM_WRITE_STMT, and partition::stmts.
Referenced by partition_contains_all_rw().
|
static |
Returns the number of read and write operations in the RDG.
References i, graph::n_vertices, RDG_MEM_READS_STMT, and RDG_MEM_WRITE_STMT.
Referenced by partition_contains_all_rw().
|
static |
Compare base offset of builtin mem* partitions P1 and P2.
References partition::builtin, and builtin_info::dst_base_offset.
Referenced by fuse_memset_builtins().
|
static |
Allocate and initialize a partition from BITMAP.
References BITMAP_ALLOC, partition::datarefs, partition::kind, partition::loc, NULL, PKIND_NORMAL, PTYPE_PARALLEL, partition::reduction_p, partition::stmts, partition::type, and UNKNOWN_LOCATION.
Referenced by loop_distribution::build_rdg_partition_for_vertex().
Returns true if the partition can be generated as a builtin.
References partition::kind, and PKIND_PARTIAL_MEMSET.
Referenced by loop_distribution::break_alias_scc_partitions(), loop_distribution::distribute_loop(), loop_distribution::finalize_partitions(), and version_loop_by_alias_check().
|
static |
Returns true when one of the PARTITIONS contains all the read or write operations of RDG.
References FOR_EACH_VEC_ELT, i, number_of_rw_in_partition(), and number_of_rw_in_rdg().
Referenced by loop_distribution::distribute_loop().
|
static |
Free PARTITION.
References BITMAP_FREE, partition::builtin, partition::datarefs, free(), and partition::stmts.
Referenced by loop_distribution::break_alias_scc_partitions(), loop_distribution::distribute_loop(), loop_distribution::finalize_partitions(), fuse_memset_builtins(), and loop_distribution::merge_dep_scc_partitions().
Returns true if the partition contains a reduction.
References partition::reduction_p.
Referenced by loop_distribution::break_alias_scc_partitions(), loop_distribution::build_partition_graph(), loop_distribution::distribute_loop(), generate_code_for_partition(), and loop_distribution::partition_merge_into().
|
static |
Callback function for traversing edge E in graph G. DATA is private callback data.
References pg_edata::alias_ddrs, pg_edge_callback_data::alias_ddrs, bitmap_bit_p, graph_edge::data, graph_edge::dest, g, i, NULL, pg_edge_callback_data::sccs_to_merge, graph_edge::src, and pg_edge_callback_data::vertices_component.
Referenced by loop_distribution::break_alias_scc_partitions().
|
static |
Callback function for graph travesal algorithm. It returns true if edge E should skipped when traversing the graph.
References graph_edge::data, and NULL.
Referenced by loop_distribution::break_alias_scc_partitions().
|
static |
Callback function for traversing edge E. DATA is private callback data.
References pg_edata::alias_ddrs, bitmap_bit_p, graph_edge::data, graph_edge::dest, i, NULL, pg_edge_callback_data::sccs_to_merge, graph_edge::src, and pg_edge_callback_data::vertices_component.
Referenced by loop_distribution::break_alias_scc_partitions().
|
static |
Compare postorder number of the partition graph vertices V1 and V2.
Referenced by sort_partitions_by_post_order().
Given innermost LOOP, return the outermost enclosing loop that forms a perfect loop nest.
References chrec_contains_symbols_defined_in_loop(), chrec_dont_know, loop::inner, loop_outer(), loop::next, NULL, NULL_TREE, loop::num, number_of_latch_executions(), and single_exit().
Referenced by loop_distribution::execute().
Returns the index of STMT in RDG.
References gcc_checking_assert, gimple_uid(), and RDG_STMT.
Referenced by create_edge_for_control_dependence(), create_rdg_edges_for_scalar(), loop_distribution::data_dep_in_cycle_p(), loop_distribution::get_data_dependence(), loop_distribution::pg_add_dependence_edges(), and loop_distribution::rdg_build_partitions().
Return true if we can count at least as many characters by taking pointer difference as we can count via reduction_var without an overflow. Thus compute 2^n < (2^(m-1) / s) where n = TYPE_PRECISION (reduction_var_type), m = TYPE_PRECISION (ptrdiff_type_node), and s = size of each character.
References wi::lshift(), wi::ltu_p(), ptrdiff_type_node, wi::to_widest(), TYPE_PRECISION, TYPE_SIZE_UNIT, and wi::udiv_trunc().
Referenced by loop_distribution::transform_reduction_loop().
|
static |
Sort partitions in PG in descending post order and store them in PARTITIONS.
References vertex::data, i, graph::n_vertices, pgcmp(), qsort, and graph::vertices.
Referenced by loop_distribution::break_alias_scc_partitions(), and loop_distribution::merge_dep_scc_partitions().
Returns true when DEF is an SSA_NAME defined in LOOP and used after the LOOP.
References flow_bb_inside_loop_p(), FOR_EACH_IMM_USE_FAST, gimple_bb(), is_gimple_debug(), and USE_STMT.
Referenced by stmt_has_scalar_dependences_outside_loop().
Returns true when STMT defines a scalar variable used after the loop LOOP.
References DEF_FROM_PTR, FOR_EACH_SSA_DEF_OPERAND, gimple_phi_result(), ssa_name_has_uses_outside_loop_p(), and SSA_OP_DEF.
Referenced by loop_distribution::classify_partition(), determine_reduction_stmt_1(), and find_seed_stmts_for_distribution().
|
inlinestatic |
Return true if loop versioning is needed to distrubute PARTITIONS. ALIAS_DDRS are data dependence relations for runtime alias check.
Referenced by loop_distribution::distribute_loop().
|
static |
Given data dependence relations in ALIAS_DDRS, generate runtime alias checks and version LOOP under condition of these runtime alias checks.
References profile_probability::apply_scale(), boolean_type_node, build_int_cst(), compute_alias_check_pairs(), create_runtime_alias_checks(), loop::dont_vectorize, dump_file, dump_flags, force_gimple_operand_1(), loop::force_vectorize, free_original_copy_tables(), gcc_assert, gimple_build_call_internal(), gimple_call_set_arg(), gimple_call_set_lhs(), gimple_seq_add_stmt_without_update(), gsi_insert_seq_before(), gsi_last_bb(), GSI_SAME_STMT, profile_probability::guessed_always(), i, initialize_original_copy_tables(), integer_type_node, profile_probability::invert(), is_gimple_val(), loop_version(), make_ssa_name(), NULL, NULL_TREE, loop::num, loop::orig_loop_num, partition_builtin_p(), TDF_DETAILS, TODO_update_ssa_no_phi, and update_ssa().
Referenced by loop_distribution::distribute_loop().
|
static |
Description on different fusing reason.
Referenced by loop_distribution::partition_merge_into().