GCC Middle and Back End API Reference
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "target.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "cfghooks.h"
#include "tree-pass.h"
#include "ssa.h"
#include "optabs-tree.h"
#include "memmodel.h"
#include "optabs.h"
#include "diagnostic-core.h"
#include "fold-const.h"
#include "stor-layout.h"
#include "cfganal.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "cfgloop.h"
#include "tree-scalar-evolution.h"
#include "tree-vectorizer.h"
#include "gimple-fold.h"
#include "cgraph.h"
#include "tree-cfg.h"
#include "tree-if-conv.h"
#include "internal-fn.h"
#include "tree-vector-builder.h"
#include "vec-perm-indices.h"
#include "tree-eh.h"
#include "case-cfn-macros.h"
#include "langhooks.h"
Macros
#define INCLUDE_ALGORITHM
Loop Vectorization
Copyright (C) 2003-2024 Free Software Foundation, Inc.
Contributed by Dorit Naishlos <dorit@il.ibm.com> and Ira Rosen <irar@il.ibm.com>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>.
|
static |
Function bb_in_loop_p. Used as a predicate for DFS-order traversal of the loop's basic blocks.
References flow_bb_inside_loop_p().
|
static |
Insert a conditional expression to enable masked vectorization. CODE is the code for the operation. VOP is the array of operands. MASK is the loop mask. GSI is a statement iterator used to place the new conditional expression.
References build_zero_cst(), gcc_unreachable, gimple_build_assign(), gsi_insert_before(), GSI_SAME_STMT, make_temp_ssa_name(), NULL, and TREE_TYPE.
Referenced by vect_transform_reduction().
|
static |
Writes into SEL a mask for a vec_perm, equivalent to a vec_shr by OFFSET vector elements (not bits) for a vector with NELT elements.
References i, int_vector_builder< T >::new_vector(), and offset.
Referenced by have_whole_vector_shift(), and vect_create_epilog_for_reduction().
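The selector described above can be modelled outside GCC with plain containers. The following is a minimal sketch, not the GCC implementation: it assumes the usual two-input permute convention in which selector values 0..NELT-1 read from the first vector and NELT..2*NELT-1 read from the second, and the names `shift_mask` and `permute` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: lane i selects element i + offset, so lanes that run
// past the end of the first input spill into the second input.
std::vector<unsigned> shift_mask(unsigned offset, unsigned nelt) {
    assert(offset <= nelt);
    std::vector<unsigned> sel(nelt);
    for (unsigned i = 0; i < nelt; ++i)
        sel[i] = i + offset;
    return sel;
}

// Two-input permute: indices below nelt read from a, the rest read from b.
std::vector<int> permute(const std::vector<int>& a, const std::vector<int>& b,
                         const std::vector<unsigned>& sel) {
    assert(a.size() == b.size() && sel.size() == a.size());
    const std::size_t nelt = a.size();
    std::vector<int> out(nelt);
    for (std::size_t i = 0; i < nelt; ++i)
        out[i] = sel[i] < nelt ? a[sel[i]] : b[sel[i] - nelt];
    return out;
}
```

Passing an all-zero second input to `permute` makes the result behave like a shift by whole elements, with zeros filling the vacated lanes.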
|
static |
Return true if we can use CMP_TYPE as the comparison type to produce all masks required to mask LOOP_VINFO.
References direct_internal_fn_supported_p(), FOR_EACH_VEC_ELT, i, LOOP_VINFO_MASKS, NULL_TREE, OPTIMIZE_FOR_SPEED, and rgroup_controls::type.
Referenced by vect_verify_full_masking().
|
static |
Return true if the reduction PHI in LOOP with latch arg LOOP_ARG has a handled computation expression. Store the main reduction operation in *CODE.
References as_a(), bitmap_set_bit, canonicalize_code(), gimple_match_op::code, conditional_internal_fn_code(), CONVERT_EXPR_CODE_P, dump_file, dump_flags, dump_printf(), dump_printf_loc(), dyn_cast(), flow_bb_inside_loop_p(), FOR_EACH_IMM_USE_ON_STMT, FOR_EACH_IMM_USE_STMT, FOR_EACH_VEC_ELT, gimple_assign_rhs1_ptr(), gimple_bb(), gimple_call_arg(), gimple_call_arg_ptr(), gimple_call_num_args(), gimple_extract_op(), gimple_nop_p(), i, ssa_op_iter::i, internal_fn_else_index(), is_gimple_debug(), code_helper::is_internal_fn(), MSG_NOTE, NULL_USE_OPERAND_P, gimple_match_op::num_ops, ssa_op_iter::numops, op_iter_init_phiuse(), op_iter_init_use(), op_iter_next_use(), gimple_match_op::ops, PHI_RESULT, SSA_NAME_DEF_STMT, SSA_NAME_VERSION, SSA_OP_USE, TDF_DETAILS, TREE_CODE, tree_nop_conversion_p(), TREE_TYPE, gimple_match_op::type, TYPE_SIGN, USE_FROM_PTR, USE_STMT, and visited.
Referenced by loop_cand::analyze_iloop_reduction_var(), check_reduction_path(), parloops_is_simple_reduction(), and vect_is_simple_reduction().
bool check_reduction_path | ( | dump_user_location_t | loc, |
loop_p | loop, | ||
gphi * | phi, | ||
tree | loop_arg, | ||
enum | tree_code ) |
Used in gimple-loop-interchange.cc and tree-parloops.cc.
References check_reduction_path(), and path.
tree cse_and_gimplify_to_preheader | ( | loop_vec_info | loop_vinfo, |
tree | expr ) |
Return an invariant or register for EXPR and emit necessary computations in the LOOP_VINFO loop preheader.
References expr, force_gimple_operand(), hash_map< KeyId, Value, Traits >::get_or_insert(), gsi_insert_seq_on_edge_immediate(), is_gimple_min_invariant(), is_gimple_reg(), _loop_vec_info::ivexpr_map, loop_preheader_edge(), LOOP_VINFO_LOOP, NULL, NULL_TREE, and unshare_expr().
Referenced by vect_get_strided_load_store_ops(), vectorizable_load(), and vectorizable_store().
Helper function to pass to simplify_replace_tree to enable replacing trees in the hash_map with their corresponding values.
Referenced by update_epilogue_loop_vinfo().
|
static |
Return true if there is an in-order reduction function for CODE, storing it in *REDUC_FN if so.
Referenced by vectorizable_reduction().
|
static |
Function get_initial_def_for_reduction

Input:
REDUC_INFO - the info_for_reduction
INIT_VAL - the initial value of the reduction variable
NEUTRAL_OP - a value that has no effect on the reduction, as per neutral_op_for_reduction

Output:
Return a vector variable, initialized according to the operation that STMT_VINFO performs. This vector will be used as the initial value of the vector of partial results. The value we need is a vector in which element 0 has value INIT_VAL and every other element has value NEUTRAL_OP.
References gcc_assert, get_vectype_for_scalar_type(), gimple_bb(), gimple_build(), gimple_build_vector(), gimple_build_vector_from_val(), gimple_convert(), INTEGRAL_TYPE_P, LOOP_VINFO_LOOP, nested_in_vect_loop_p(), NULL, operand_equal_p(), POINTER_TYPE_P, SCALAR_FLOAT_TYPE_P, TREE_TYPE, TYPE_VECTOR_SUBPARTS(), and vect_emit_reduction_init_stmts().
Referenced by vect_transform_cycle_phi().
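To see why a vector with INIT_VAL in element 0 and NEUTRAL_OP elsewhere is a valid starting point, here is a small standalone model for an integer sum. It is only a sketch: `initial_def`, `accumulate_lanes` and `reduce_sum` are hypothetical names, and the vector is modelled as a `std::vector<int>` rather than a GIMPLE vector value.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model: lane 0 carries the scalar initial value, every other
// lane carries the neutral element of the reduction.
std::vector<int> initial_def(int init_val, int neutral_op, std::size_t nunits) {
    assert(nunits > 0);
    std::vector<int> v(nunits, neutral_op);
    v[0] = init_val;
    return v;
}

// One vector iteration of a sum reduction: lane-wise accumulate.
void accumulate_lanes(std::vector<int>& acc, const std::vector<int>& data) {
    assert(acc.size() == data.size());
    for (std::size_t i = 0; i < acc.size(); ++i)
        acc[i] += data[i];
}

// Final reduction of the partial results into one scalar.
int reduce_sum(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```

Because the neutral lanes contribute nothing, the final scalar equals the initial value plus the sum of all data elements, exactly as in the scalar loop.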
|
static |
Get at the initial defs for the reduction PHIs for REDUC_INFO, which performs a reduction involving GROUP_SIZE scalar statements. NUMBER_OF_VECTORS is the number of vector defs to create. If NEUTRAL_OP is nonnull, introducing extra elements of that value will not change the result.
References CONSTANT_CLASS_P, duplicate_and_interleave(), gcc_assert, gimple_build(), gimple_build_vector(), gimple_build_vector_from_val(), gimple_convert(), i, known_eq, tree_vector_builder::new_vector(), NULL, operand_equal_p(), STMT_VINFO_VECTYPE, TREE_TYPE, TYPE_VECTOR_SUBPARTS(), useless_type_conversion_p(), and vect_emit_reduction_init_stmts().
Referenced by vect_transform_cycle_phi().
|
static |
Get a masked internal function equivalent to REDUC_FN. VECTYPE_IN is the type of the vector input.
References direct_internal_fn_supported_p(), and OPTIMIZE_FOR_SPEED.
Referenced by vect_reduction_update_partial_vector_usage(), and vectorize_fold_left_reduction().
|
static |
Checks whether the target supports whole-vector shifts for vectors of mode MODE. This is the case if _either_ the platform handles vec_shr_optab, _or_ it supports vec_perm_const with masks for all necessary shift amounts.
References calc_vec_perm_mask_for_shift(), can_implement_p(), can_vec_perm_const_p(), GET_MODE_NUNITS(), and i.
Referenced by vect_create_epilog_for_reduction(), and vect_model_reduction_cost().
stmt_vec_info info_for_reduction | ( | vec_info * | vinfo, |
stmt_vec_info | stmt_info ) |
For a statement STMT_INFO taking part in a reduction operation, return the stmt_vec_info on which the meta information is stored.
References as_a(), gcc_assert, gimple_phi_num_args(), is_a(), vec_info::lookup_def(), STMT_VINFO_DEF_TYPE, STMT_VINFO_REDUC_DEF, vect_double_reduction_def, vect_nested_cycle, vect_orig_stmt(), vect_phi_initial_value(), and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_optimize_slp_pass::start_choosing_layouts(), vect_create_epilog_for_reduction(), vect_reduc_type(), vect_transform_cycle_phi(), vect_transform_reduction(), vectorizable_condition(), vectorizable_live_operation(), and vectorizable_reduction().
|
static |
Function is_nonwrapping_integer_induction. Check if STMT_VINFO (which is part of loop LOOP) both increments and does not cause overflow.
References wi::add(), as_a(), gimple_phi_result(), max_stmt_executions(), wi::min_precision(), wi::mul(), wi::OVF_NONE, STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, wi::to_widest(), TREE_CODE, TREE_TYPE, TYPE_OVERFLOW_UNDEFINED, TYPE_PRECISION, and TYPE_SIGN.
Referenced by vectorizable_reduction().
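The idea of such a no-overflow check can be illustrated with fixed-width integers. This is a sketch under stated assumptions, not the GCC logic: it assumes a 32-bit signed induction variable with a constant base and step and a known bound on the iteration count, and `induction_fits_int32` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical check for a 32-bit signed induction variable.  The sequence
// base, base + step, ... is monotonic, so it stays in range if its final
// value base + step * max_iters does.  The final value is evaluated in
// 64 bits so that the check itself cannot overflow.
bool induction_fits_int32(std::int32_t base, std::int32_t step,
                          std::uint32_t max_iters) {
    const std::int64_t last = std::int64_t{base}
        + std::int64_t{step} * std::int64_t{max_iters};
    return last >= std::numeric_limits<std::int32_t>::min()
        && last <= std::numeric_limits<std::int32_t>::max();
}
```

Evaluating the bound in a wider type mirrors the general approach of comparing the precision needed for the largest value against the precision of the induction variable's type.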
|
static |
Given loop represented by LOOP_VINFO, return true if computation of LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false otherwise.
References gcc_assert, get_max_loop_iterations(), LOOP_VINFO_LOOP, LOOP_VINFO_NITERS, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_NITERSM1, wi::max_value(), wi::to_widest(), TREE_CODE, TREE_TYPE, and TYPE_SIGN.
Referenced by vect_transform_loop().
|
static |
For a vectorized stmt DEF_STMT_INFO adjust all vectorized PHI latch edge values originally defined by it.
References add_phi_arg(), as_a(), dyn_cast(), EXTRACT_LAST_REDUCTION, FOLD_LEFT_REDUCTION, FOR_EACH_IMM_USE_FAST, gcc_assert, gimple_assign_rhs1(), gimple_assign_set_rhs1(), gimple_assign_set_rhs2(), gimple_bb(), gimple_get_lhs(), gimple_phi_arg_location(), i, vec_info::lookup_stmt(), basic_block_def::loop_father, loop_latch_edge(), PHI_ARG_DEF_FROM_EDGE, SSA_NAME_DEF_STMT, STMT_VINFO_DEF_TYPE, STMT_VINFO_REDUC_TYPE, STMT_VINFO_RELEVANT_P, STMT_VINFO_VEC_STMTS, TREE_CODE, update_stmt(), USE_STMT, vect_first_order_recurrence, vect_orig_stmt(), and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_transform_loop().
|
static |
Return a vector of type VECTYPE that is equal to the vector select operation "MASK ? VEC : IDENTITY". Insert the select statements before GSI.
References gimple_build_assign(), gsi_insert_before(), GSI_SAME_STMT, make_temp_ssa_name(), and NULL.
Referenced by vectorize_fold_left_reduction().
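The select "MASK ? VEC : IDENTITY" can be modelled lane by lane. The sketch below uses hypothetical names (`select_or_identity`, `fold_left_add`) and standard containers instead of GIMPLE statements; it shows why inactive lanes do not disturb an in-order sum when the identity is 0.0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lane-wise select: active lanes keep their value, inactive lanes are
// replaced by the identity so they cannot affect the reduction.
std::vector<double> select_or_identity(const std::vector<bool>& mask,
                                       const std::vector<double>& vec,
                                       double identity) {
    assert(mask.size() == vec.size());
    std::vector<double> out(vec.size());
    for (std::size_t i = 0; i < vec.size(); ++i)
        out[i] = mask[i] ? vec[i] : identity;
    return out;
}

// In-order (fold-left) accumulation of one vector into a scalar.
double fold_left_add(double acc, const std::vector<double>& vec) {
    for (double x : vec)
        acc += x;
    return acc;
}
```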
|
static |
When vectorizing early break statements, instructions that happen before the early break in the current BB need to be moved to after the early break. This function deals with that and assumes that any validity checks have already been performed. While moving the instructions, if it encounters a VUSE or VDEF it corrects the VUSES as it moves the statements along. GDEST is the location in which to insert the new statements.
References CDI_DOMINATORS, dominated_by_p(), dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, dyn_cast(), FOR_EACH_IMM_USE_ON_STMT, FOR_EACH_IMM_USE_STMT, get_loop_exit_edges(), get_virtual_phi(), gimple_phi_arg_def(), gimple_phi_result(), gimple_set_vuse(), gimple_vuse(), gsi_after_labels(), gsi_for_stmt(), gsi_move_before(), GSI_NEW_STMT, is_empty(), vec_info::lookup_stmt(), LOOP_VINFO_EARLY_BRK_DEST_BB, LOOP_VINFO_EARLY_BRK_STORES, LOOP_VINFO_EARLY_BRK_VUSES, LOOP_VINFO_LOOP, MSG_NOTE, NULL_TREE, remove_phi_node(), SET_PHI_ARG_DEF_ON_EDGE, SET_USE, data_reference::stmt, update_stmt(), and vect_location.
Referenced by vect_transform_loop().
bool needs_fold_left_reduction_p | ( | tree | type, |
code_helper | code ) |
Return true if we need an in-order reduction for operation CODE on type TYPE. NEED_WRAPPING_INTEGRAL_OVERFLOW is true if integer overflow must wrap.
References INTEGRAL_TYPE_P, code_helper::is_tree_code(), operation_no_trapping_overflow(), SAT_FIXED_POINT_TYPE_P, and SCALAR_FLOAT_TYPE_P.
Referenced by vect_optimize_slp_pass::start_choosing_layouts(), vect_reassociating_reduction_p(), vect_slp_check_for_roots(), and vectorizable_reduction().
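Floating-point addition is the classic case where the order of a reduction matters. The following sketch, with the hypothetical names `in_order_sum` and `two_lane_sum`, compares a strict left-to-right sum with a two-lane reassociated sum of the kind a vectorized reduction would compute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict left-to-right sum, the order the scalar loop would use.
double in_order_sum(const std::vector<double>& v) {
    double acc = 0.0;
    for (double x : v)
        acc += x;
    return acc;
}

// Two interleaved partial sums combined at the end, modelling a
// reassociated two-lane vector reduction.
double two_lane_sum(const std::vector<double>& v) {
    double lane[2] = {0.0, 0.0};
    for (std::size_t i = 0; i < v.size(); ++i)
        lane[i % 2] += v[i];
    return lane[0] + lane[1];
}
```

For the input {1e20, 1.0, -1e20, 1.0}, and assuming IEEE 754 doubles with no fast-math options, the in-order sum is 1.0 while the two-lane sum is 2.0. A compiler that must preserve the scalar result therefore has to keep the in-order form.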
tree neutral_op_for_reduction | ( | tree | scalar_type, |
code_helper | code, | ||
tree | initial_value, | ||
bool | as_initial ) |
If there is a neutral value X such that a reduction would not be affected by the introduction of additional X elements, return that X, otherwise return null. CODE is the code of the reduction and SCALAR_TYPE is type of the scalar elements. If the reduction has just a single initial value then INITIAL_VALUE is that value, otherwise it is null. If AS_INITIAL is TRUE the value is supposed to be used as initial value. In that case no signed zero is returned.
References build_all_ones_cst(), build_one_cst(), build_real(), build_zero_cst(), dconstm0, HONOR_SIGNED_ZEROS(), code_helper::is_tree_code(), and NULL_TREE.
Referenced by convert_scalar_cond_reduction(), vect_create_epilog_for_reduction(), vect_expand_fold_left(), vect_find_reusable_accumulator(), vect_transform_cycle_phi(), and vectorizable_reduction().
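A neutral value is simply the identity element of the reduction operation. The sketch below is an integer-only illustration with hypothetical names (`ReductionOp`, `neutral_value`); treating Min and Max as falling back to the single initial value is an assumption drawn from the INITIAL_VALUE description above, and the signed-zero handling for floating-point sums is not modelled.

```cpp
#include <cassert>
#include <cstdint>

enum class ReductionOp { Plus, Mult, BitAnd, BitIor, BitXor, Min, Max };

// Hypothetical integer-only model.  Writes the neutral value to *out and
// returns true, or returns false when no neutral value is available.
// initial_value may be null when there is no single initial value.
bool neutral_value(ReductionOp op, const std::int32_t* initial_value,
                   std::int32_t* out) {
    switch (op) {
    case ReductionOp::Plus:
    case ReductionOp::BitIor:
    case ReductionOp::BitXor:
        *out = 0;           // x + 0 == x, x | 0 == x, x ^ 0 == x
        return true;
    case ReductionOp::Mult:
        *out = 1;           // x * 1 == x
        return true;
    case ReductionOp::BitAnd:
        *out = -1;          // all bits set: x & ~0 == x
        return true;
    case ReductionOp::Min:
    case ReductionOp::Max:
        // Assumption: extra copies of the initial value never change a
        // minimum or maximum that already includes it.
        if (initial_value == nullptr)
            return false;
        *out = *initial_value;
        return true;
    }
    return false;
}
```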
void optimize_mask_stores | ( | class loop * | loop | ) |
This function performs a simple optimization: it reverts if-conversion for masked stores, i.e. if the mask of a store is zero, the store is not performed, and neither are the producers of the stored values where possible. For example,

  for (i=0; i<n; i++)
    if (c[i])
      {
        p1[i] += 1;
        p2[i] = p3[i] + 2;
      }

this transformation will produce the following semi-hammock:

  if (!mask__ifc__42.18_165 == { 0, 0, 0, 0, 0, 0, 0, 0 })
    {
      vect__11.19_170 = MASK_LOAD (vectp_p1.20_168, 0B, mask__ifc__42.18_165);
      vect__12.22_172 = vect__11.19_170 + vect_cst__171;
      MASK_STORE (vectp_p1.23_175, 0B, mask__ifc__42.18_165, vect__12.22_172);
      vect__18.25_182 = MASK_LOAD (vectp_p3.26_180, 0B, mask__ifc__42.18_165);
      vect__19.28_184 = vect__18.25_182 + vect_cst__183;
      MASK_STORE (vectp_p2.29_187, 0B, mask__ifc__42.18_165, vect__19.28_184);
    }
References add_bb_to_loop(), add_phi_arg(), build_zero_cst(), CDI_DOMINATORS, cfun, basic_block_def::count, create_empty_bb(), create_phi_node(), dom_info_available_p(), dump_enabled_p(), dump_printf_loc(), EDGE_SUCC, find_loop_location(), flow_loop_nested_p(), FOR_EACH_IMM_USE_FAST, free(), gcc_assert, get_loop_body(), gimple_bb(), gimple_build_cond(), gimple_call_arg(), gimple_call_internal_p(), gimple_get_lhs(), gimple_has_volatile_ops(), gimple_set_vdef(), gimple_vdef(), gimple_vop(), gimple_vuse(), gsi_end_p(), gsi_for_stmt(), gsi_insert_after(), gsi_last_bb(), gsi_move_before(), gsi_next(), gsi_prev(), gsi_remove(), GSI_SAME_STMT, gsi_start_bb(), gsi_stmt(), has_zero_uses(), i, basic_block_def::index, is_gimple_debug(), last, profile_probability::likely(), basic_block_def::loop_father, make_edge(), make_single_succ_edge(), make_ssa_name(), MSG_NOTE, NULL, NULL_TREE, loop::num_nodes, release_defs(), set_immediate_dominator(), split_block(), TREE_CODE, TREE_TYPE, UNKNOWN_LOCATION, USE_STMT, vect_location, VECTOR_TYPE_P, and worklist.
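The effect of the semi-hammock can be modelled in scalar code: a whole vector-wide chunk is skipped when none of its mask lanes is set. This is a sketch of the idea only, with the hypothetical name `masked_update` and a hypothetical `vf` parameter for the chunk width; it does not reproduce the CFG surgery the function performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of the transformed loop.  Lanes are processed in
// chunks of vf; a chunk whose mask is all zero skips its loads, adds and
// stores entirely.  Returns the number of chunks skipped.
std::size_t masked_update(std::vector<int>& p1, std::vector<int>& p2,
                          const std::vector<int>& p3,
                          const std::vector<int>& c, std::size_t vf) {
    const std::size_t n = c.size();
    assert(vf > 0 && n % vf == 0);
    assert(p1.size() == n && p2.size() == n && p3.size() == n);
    std::size_t skipped = 0;
    for (std::size_t base = 0; base < n; base += vf) {
        bool any_active = false;
        for (std::size_t i = base; i < base + vf; ++i)
            any_active = any_active || c[i] != 0;
        if (!any_active) {      // mask == { 0, ..., 0 }: skip the block
            ++skipped;
            continue;
        }
        for (std::size_t i = base; i < base + vf; ++i) {
            if (c[i]) {         // per-lane mask of the masked load and store
                p1[i] += 1;
                p2[i] = p3[i] + 2;
            }
        }
    }
    return skipped;
}
```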
bool reduction_fn_for_scalar_code | ( | code_helper | code, |
internal_fn * | reduc_fn ) |
Function reduction_fn_for_scalar_code

Input:
CODE - tree_code of a reduction operation.

Output:
REDUC_FN - the corresponding internal function to be used to reduce the vector of partial results into a single scalar result, or IFN_LAST if the operation is a supported reduction operation, but does not have such an internal function.

Return FALSE if CODE currently cannot be vectorized as reduction.
References code_helper::is_tree_code().
Referenced by vect_slp_check_for_roots(), vectorizable_bb_reduc_epilogue(), vectorizable_reduction(), and vectorize_slp_instance_root_stmt().
void release_vec_loop_controls | ( | vec< rgroup_controls > * | controls | ) |
Free all levels of rgroup CONTROLS.
References rgroup_controls::controls, FOR_EACH_VEC_ELT, and i.
Referenced by vect_analyze_loop_2(), vect_verify_full_masking_avx512(), and _loop_vec_info::~_loop_vec_info().
|
static |
Error reporting helper for vect_is_simple_reduction below. GIMPLE statement STMT is printed with a message MSG.
References dump_printf_loc(), msg, and vect_location.
Referenced by vect_is_simple_reduction().
|
static |
Scale profiling counters by estimation for LOOP, which is vectorized by factor VF. If FLAT is true, the loop we started with had an unrealistically flat profile.
References profile_probability::always(), basic_block_def::count, dump_file, dump_flags, get_likely_max_loop_iterations_int(), loop::header, loop::latch, loop_preheader_edge(), profile_count::nonzero_p(), profile_count::probability_in(), scale_loop_profile(), set_edge_probability_and_rescale_others(), single_pred_edge(), and TDF_DETAILS.
Referenced by vect_transform_loop().
Update EPILOGUE's loop_vec_info. EPILOGUE was constructed as a copy of the original loop that has now been vectorized.

The inits of the data_references need to be advanced with the number of iterations of the main loop. This has been computed in vect_do_peeling and is stored in parameter ADVANCE. We first restore the data_references initial offset with the values recorded in ORIG_DRS_INIT.

Since the loop_vec_info of this EPILOGUE was constructed for the original loop, its stmt_vec_infos all point to the original statements. These need to be updated to point to their corresponding copies as well as the SSA_NAMES in their PATTERN_DEF_SEQs and RELATED_STMTs.

The data_reference's connections also need to be updated. Their corresponding dr_vec_info need to be reconnected to the EPILOGUE's stmt_vec_infos, their statements need to point to their corresponding copy, if they are gather loads or scatter stores then their reference needs to be updated to point to its corresponding copy.
References advance(), vec_info_shared::datarefs_copy, DR_BASE_ADDRESS, DR_REF, DR_STMT, find_in_mapping(), FOR_EACH_VEC_ELT, free(), gcc_assert, get_loop_body(), gimple_bb(), gimple_get_lhs(), gimple_num_ops(), gimple_op(), gimple_phi_result(), gimple_set_bb(), gimple_set_op(), gimple_uid(), gsi_end_p(), gsi_next(), gsi_start(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_debug(), loop_vec_info_for_loop(), LOOP_VINFO_BBS, LOOP_VINFO_DATAREFS, LOOP_VINFO_DRS_ADVANCED_BY, LOOP_VINFO_NBBS, NULL, NULL_TREE, loop::num_nodes, gphi_iterator::phi(), vec_info_shared::save_datarefs(), vec_info::shared, simplify_replace_tree(), vec_info::stmt_vec_infos, STMT_VINFO_GATHER_SCATTER_P, STMT_VINFO_MEMORY_ACCESS_TYPE, STMT_VINFO_PATTERN_DEF_SEQ, STMT_VINFO_RELATED_STMT, STMT_VINFO_STMT, STMT_VINFO_STRIDED_P, vect_stmt_to_vectorize(), vect_update_inits_of_drs(), and VMAT_GATHER_SCATTER.
Referenced by vect_transform_loop().
|
static |
Check if masking can be supported by inserting a conditional expression. CODE is the code for the operation. COND_FN is the conditional internal function, if it exists. VECTYPE_IN is the type of the vector input.
References direct_internal_fn_supported_p(), code_helper::is_tree_code(), and OPTIMIZE_FOR_SPEED.
Referenced by vect_reduction_update_partial_vector_usage(), and vect_transform_reduction().
Determine the main loop exit for the vectorizer.
References candidate(), CDI_DOMINATORS, chrec_contains_undetermined(), COMPARISON_CLASS_P, dominated_by_p(), get_loop_exit_condition(), get_loop_exit_edges(), integer_nonzerop(), integer_zerop(), loop::latch, tree_niter_desc::may_be_zero, niter_desc::niter, NULL, number_of_iterations_exit_assumptions(), single_pred(), and single_pred_p().
Referenced by set_uid_loop_bbs(), and vect_analyze_loop_form().
|
static |
Return true if STMT_INFO describes a double reduction phi and if the other phi in the reduction is also relevant for vectorization. This rejects cases such as:

  outer1:
    x_1 = PHI <x_3(outer2), ...>;
    ...

  inner:
    x_2 = ...;
    ...

  outer2:
    x_3 = PHI <x_2(inner)>;

if nothing in x_2 or elsewhere makes x_1 relevant.
References STMT_VINFO_DEF_TYPE, STMT_VINFO_REDUC_DEF, STMT_VINFO_RELEVANT_P, and vect_double_reduction_def.
Referenced by vect_analyze_loop_operations().
opt_loop_vec_info vect_analyze_loop | ( | class loop * | loop, |
gimple * | loop_vectorized_call, | ||
vec_info_shared * | shared ) |
Function vect_analyze_loop. Apply a set of analyses on LOOP, and create a loop_vec_info struct for it. The different analyses will record information in the loop_vec_info struct.
References vect_loop_form_info::assumptions, dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, _loop_vec_info::epilogue_vinfo, opt_pointer_wrapper< PtrType_t >::failure_at(), fatal(), find_loop_nest(), free_numbers_of_iterations_estimates(), gcc_assert, GET_MODE_NAME, i, loop::inner, integer_onep(), known_eq, LOOP_C_FINITE, loop_constraint_set(), loop_cost_model(), vec_info_shared::loop_nest, loop_outer(), LOOP_REQUIRES_VERSIONING, loop_vec_info_for_loop(), LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_PEELING_FOR_NITER, LOOP_VINFO_VECT_FACTOR, LOOP_VINFO_VECTORIZABLE_P, LOOP_VINFO_VERSIONING_THRESHOLD, maybe_ge, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, opt_pointer_wrapper< PtrType_t >::propagate_failure(), scev_reset_htab(), loop::simdlen, loop::simduid, opt_pointer_wrapper< PtrType_t >::success(), vector_costs::suggested_epilogue_mode(), targetm, unlimited_cost_model(), vect_analyze_loop_1(), vect_analyze_loop_form(), VECT_COMPARE_COSTS, VECT_COST_MODEL_VERY_CHEAP, vect_joust_loop_vinfos(), vect_location, and _loop_vec_info::vector_costs.
Referenced by try_vectorize_loop_1().
|
static |
Analyze LOOP with VECTOR_MODES[MODE_I] and as epilogue if ORIG_LOOP_VINFO is not NULL. Set AUTODETECTED_VECTOR_MODE if VOIDmode and advance MODE_I to the next mode useful to analyze. Return the loop_vinfo on success and wrapped null on failure.
References dump_enabled_p(), dump_printf_loc(), fatal(), gcc_checking_assert, GET_MODE_INNER, GET_MODE_NAME, LOOP_VINFO_EPILOGUE_P, MSG_NOTE, NULL, opt_pointer_wrapper< PtrType_t >::propagate_failure(), related_vector_mode(), opt_pointer_wrapper< PtrType_t >::success(), _loop_vec_info::suggested_unroll_factor, vect_analyze_loop_2(), vect_chooses_same_modes_p(), vect_create_loop_vinfo(), vect_location, vec_info::vector_mode, and VECTOR_MODE_P.
Referenced by vect_analyze_loop().
|
static |
Function vect_analyze_loop_2. Apply a set of analyses on LOOP specified by LOOP_VINFO; the different analyses will record information in some members of LOOP_VINFO. FATAL indicates whether some analysis hit a fatal error. If a non-NULL pointer SUGGESTED_UNROLL_FACTOR is provided, it is intended to be filled with the suggested unroll factor that was worked out, while a NULL pointer indicates that the suggested unroll factor is going to be applied. SLP_DONE_FOR_SUGGESTED_UF holds the SLP decision made when the suggested unroll factor was worked out.
References vec_info_shared::check_datarefs(), direct_internal_fn_supported_p(), DR_GROUP_FIRST_ELEMENT, DR_GROUP_NEXT_ELEMENT, DR_GROUP_SIZE, dump_dec(), dump_enabled_p(), dump_printf(), dump_printf_loc(), hash_map< KeyId, Value, Traits >::empty(), opt_result::failure_at(), fatal(), FOR_EACH_VEC_ELT, gcc_assert, gsi_end_p(), gsi_next(), gsi_start(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, init_cost, integer_zerop(), is_empty(), is_gimple_debug(), known_eq, known_le, known_ne, vec_info::lookup_stmt(), LOOP_REQUIRES_VERSIONING, loop_vect, LOOP_VINFO_BBS, LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, LOOP_VINFO_CHECK_UNEQUAL_ADDRS, LOOP_VINFO_COMP_ALIAS_DDRS, LOOP_VINFO_COST_MODEL_THRESHOLD, LOOP_VINFO_DATAREFS, LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_INT_NITERS, LOOP_VINFO_IV_EXIT, LOOP_VINFO_LENS, LOOP_VINFO_LOOP, LOOP_VINFO_LOWER_BOUNDS, LOOP_VINFO_MASKS, LOOP_VINFO_MAX_VECT_FACTOR, LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_ORIG_LOOP_INFO, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS, LOOP_VINFO_PEELING_FOR_NITER, LOOP_VINFO_REDUCTION_CHAINS, LOOP_VINFO_RGROUP_IV_TYPE, LOOP_VINFO_SIMD_IF_COND, LOOP_VINFO_SLP_INSTANCES, LOOP_VINFO_UNALIGNED_DR, LOOP_VINFO_USING_DECREMENTING_IV_P, LOOP_VINFO_USING_PARTIAL_VECTORS_P, LOOP_VINFO_USING_SELECT_VL_P, LOOP_VINFO_VECT_FACTOR, LOOP_VINFO_VECTORIZABLE_P, LOOP_VINFO_VERSIONING_THRESHOLD, MAX_VECTORIZATION_FACTOR, maybe_ge, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, OPTIMIZE_FOR_SPEED, release_vec_loop_controls(), vec_info_shared::save_datarefs(), _loop_vec_info::scan_map, vec_info::shared, si, slp_inst_kind_store, SLP_INSTANCE_KIND, SLP_INSTANCE_LOADS, SLP_INSTANCE_TREE, SLP_TREE_DEF_TYPE, SLP_TREE_LANES, SLP_TREE_REPRESENTATIVE, SLP_TREE_SCALAR_STMTS, slpeel_can_duplicate_loop_p(), STMT_SLP_TYPE, vec_info::stmt_vec_infos, STMT_VINFO_DEF_TYPE, STMT_VINFO_GROUPED_ACCESS, 
STMT_VINFO_IN_PATTERN_P, STMT_VINFO_PATTERN_DEF_SEQ, STMT_VINFO_REDUC_DEF, STMT_VINFO_RELATED_STMT, STMT_VINFO_SLP_VECT_ONLY_PATTERN, STMT_VINFO_VECTYPE, opt_result::success(), _loop_vec_info::suggested_unroll_factor, TYPE_VECTOR_SUBPARTS(), vect_analyze_data_ref_accesses(), vect_analyze_data_ref_dependences(), vect_analyze_data_refs(), vect_analyze_data_refs_alignment(), vect_analyze_loop_costing(), vect_analyze_loop_operations(), vect_analyze_scalar_cycles(), vect_analyze_slp(), vect_apply_runtime_profitability_check_p(), vect_can_advance_ivs_p(), vect_compute_single_scalar_iteration_cost(), vect_detect_hybrid_slp(), vect_determine_partial_vectors_and_peeling(), vect_determine_vectorization_factor(), vect_dissolve_slp_only_groups(), vect_double_reduction_def, vect_enhance_data_refs_alignment(), vect_fixup_scalar_cycles_with_patterns(), vect_free_slp_instance(), vect_gather_slp_loads(), vect_get_datarefs_in_loop(), vect_grouped_load_supported(), vect_grouped_store_supported(), vect_internal_def, vect_load_lanes_supported(), vect_location, vect_make_slp_decision(), vect_mark_stmts_to_be_vectorized(), vect_optimize_slp(), vect_partial_vectors_len, vect_pattern_recog(), vect_prune_runtime_alias_test_list(), vect_reduction_def, vect_slp_analyze_operations(), vect_stmt_to_vectorize(), vect_store_lanes_supported(), vect_update_vf_for_slp(), vect_use_loop_mask_for_alignment_p(), vect_verify_full_masking(), vect_verify_full_masking_avx512(), vect_verify_loop_lens(), and _loop_vec_info::vector_costs.
Referenced by vect_analyze_loop_1().
|
static |
Analyze the cost of the loop described by LOOP_VINFO. Decide if it is worthwhile to vectorize. Return 1 if definitely yes, 0 if definitely no, or -1 if it's worth retrying.
References dump_enabled_p(), dump_printf_loc(), estimated_stmt_executions_int(), known_eq, known_le, known_lt, likely_max_stmt_executions_int(), loop_cost_model(), LOOP_REQUIRES_VERSIONING, LOOP_VINFO_COST_MODEL_THRESHOLD, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_INT_NITERS, LOOP_VINFO_LOOP, LOOP_VINFO_MAIN_LOOP_INFO, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_NITERSM1, LOOP_VINFO_ORIG_LOOP_INFO, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS, LOOP_VINFO_PEELING_FOR_NITER, LOOP_VINFO_USING_PARTIAL_VECTORS_P, LOOP_VINFO_VECT_FACTOR, MAX, MSG_MISSED_OPTIMIZATION, MSG_NOTE, wi::to_widest(), vect_apply_runtime_profitability_check_p(), VECT_COST_MODEL_VERY_CHEAP, vect_estimate_min_profitable_iters(), vect_known_niters_smaller_than_vf(), vect_location, vect_use_loop_mask_for_alignment_p(), and vect_vf_for_cost().
Referenced by vect_analyze_loop_2().
opt_result vect_analyze_loop_form | ( | class loop * | loop, |
gimple * | loop_vectorized_call, | ||
vect_loop_form_info * | info ) |
Function vect_analyze_loop_form. Verify that certain CFG restrictions hold, including:
- the loop has a pre-header
- the loop has a single entry
- nested loops can have only a single exit
- the loop exit condition is simple enough
- the number of iterations can be analyzed, i.e., a countable loop. The niter could be analyzed under some assumptions.
References vect_loop_form_info::assumptions, cfun, chrec_contains_undetermined(), vect_loop_form_info::conds, dump_enabled_p(), dump_generic_expr(), dump_printf(), dump_printf_loc(), DUMP_VECT_SCOPE, EDGE_COUNT, EDGE_PRED, empty_block_p(), loop::exits, expr_invariant_in_loop_p(), opt_result::failure_at(), free(), get_loop(), get_loop_body(), get_loop_exit_edges(), gimple_bb(), gimple_call_arg(), gimple_seq_empty_p(), loop::header, i, loop::inner, vect_loop_form_info::inner_loop_cond, integer_onep(), integer_zerop(), loop::latch, vect_loop_form_info::loop_exit, loop_exits_from_bb_p(), loop_preheader_edge(), MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, loop::num_nodes, vect_loop_form_info::number_of_iterations, vect_loop_form_info::number_of_iterationsm1, phi_nodes(), basic_block_def::preds, single_exit(), single_pred(), single_succ_p(), opt_result::success(), TDF_DETAILS, tree_fits_shwi_p(), tree_to_shwi(), vec_init_loop_exit_info(), vect_analyze_loop_form(), vect_get_loop_niters(), and vect_location.
Referenced by gather_scalar_reductions(), vect_analyze_loop(), and vect_analyze_loop_form().
|
static |
Function vect_analyze_loop_operations. Scan the loop stmts and make sure they are all vectorizable.
References add_stmt_costs(), dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, opt_result::failure_at(), gcc_assert, gimple_clobber_p(), gimple_phi_num_args(), gimple_phi_result(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_debug(), is_loop_header_bb_p(), vec_info::lookup_def(), vec_info::lookup_stmt(), LOOP_VINFO_BBS, LOOP_VINFO_LOOP, MSG_NOTE, NULL, loop::num_nodes, PHI_ARG_DEF, PURE_SLP_STMT, si, STMT_VINFO_DEF_TYPE, STMT_VINFO_LIVE_P, STMT_VINFO_RELEVANT, STMT_VINFO_RELEVANT_P, opt_result::success(), vect_active_double_reduction_p(), vect_analyze_stmt(), vect_double_reduction_def, vect_first_order_recurrence, vect_induction_def, vect_internal_def, vect_location, vect_nested_cycle, vect_reduction_def, vect_used_in_outer, vect_used_in_outer_by_reduction, vect_used_in_scope, _loop_vec_info::vector_costs, vectorizable_induction(), vectorizable_lc_phi(), vectorizable_live_operation(), vectorizable_recurr(), vectorizable_reduction(), and virtual_operand_p().
Referenced by vect_analyze_loop_2().
|
static |
Function vect_analyze_scalar_cycles. Examine the cross iteration def-use cycles of scalar variables, by analyzing the loop-header PHIs of scalar variables. Classify each cycle as one of the following: invariant, induction, reduction, unknown. We do that for the loop represented by LOOP_VINFO, and also for its inner loop, if it exists.

Examples for scalar cycles:

Example1: reduction:

  loop1:
  for (i=0; i<N; i++)
    sum += a[i];

Example2: induction:

  loop2:
  for (i=0; i<N; i++)
    a[i] = i;
References loop::inner, LOOP_VINFO_LOOP, and vect_analyze_scalar_cycles_1().
Referenced by vect_analyze_loop_2().
|
static |
Function vect_analyze_scalar_cycles_1. Examine the cross iteration def-use cycles of scalar variables in LOOP. LOOP_VINFO represents the loop that is now being considered for vectorization (can be LOOP, or an outer-loop enclosing LOOP). SLP indicates whether there will be subsequent SLP analyses.
References analyze_scalar_evolution(), as_a(), dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, evolution_part_in_loop_num(), gcc_assert, gsi_end_p(), gsi_next(), gsi_start_phis(), loop::header, initial_condition_in_loop_num(), vec_info::lookup_stmt(), LOOP_VINFO_LOOP, LOOP_VINFO_REDUCTIONS, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, NULL_TREE, loop::num, gphi_iterator::phi(), PHI_RESULT, STMT_VINFO_DEF_TYPE, STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, STMT_VINFO_REDUC_DEF, STRIP_NOPS, TREE_CODE, vect_double_reduction_def, vect_first_order_recurrence, vect_induction_def, vect_inner_phi_in_double_reduction_p(), vect_is_nonlinear_iv_evolution(), vect_is_simple_iv_evolution(), vect_is_simple_reduction(), vect_location, vect_nested_cycle, vect_phi_first_order_recurrence_p(), vect_reduction_def, vect_unknown_def_type, virtual_operand_p(), and worklist.
Referenced by vect_analyze_scalar_cycles().
|
static |
Return true if vectorizing a loop using NEW_LOOP_VINFO appears to be better than vectorizing it using OLD_LOOP_VINFO. Assume that OLD_LOOP_VINFO is better unless something specifically indicates otherwise. Note that this deliberately isn't a partial order.
References vector_costs::better_epilogue_loop_than_p(), gcc_assert, known_eq, LOOP_VINFO_LOOP, LOOP_VINFO_ORIG_LOOP_INFO, LOOP_VINFO_VECT_FACTOR, loop::simdlen, and _loop_vec_info::vector_costs.
Referenced by vect_joust_loop_vinfos().
bool vect_can_vectorize_without_simd_p | ( | code_helper | code | ) |
Likewise, but taking a code_helper.
References code_helper::is_tree_code(), and vect_can_vectorize_without_simd_p().
Return true if we can emulate CODE on an integer mode representation of a vector.
Referenced by vect_can_vectorize_without_simd_p(), vectorizable_operation(), and vectorizable_reduction().
|
static |
Calculate the cost of one scalar iteration of the loop.
References add_stmt_costs(), DR_IS_READ, DUMP_VECT_SCOPE, vector_costs::finish_cost(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_stmt(), i, init_cost, loop::inner, is_gimple_assign(), is_gimple_call(), vec_info::lookup_stmt(), basic_block_def::loop_father, LOOP_VINFO_BBS, LOOP_VINFO_INNER_LOOP_COST_FACTOR, LOOP_VINFO_LOOP, LOOP_VINFO_SCALAR_ITERATION_COST, loop::num_nodes, record_stmt_cost(), _loop_vec_info::scalar_costs, scalar_load, scalar_stmt, scalar_store, si, STMT_VINFO_DATA_REF, STMT_VINFO_DEF_TYPE, STMT_VINFO_LIVE_P, STMT_VINFO_RELEVANT_P, vect_nop_conversion_p(), vect_prologue, vect_stmt_to_vectorize(), and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_analyze_loop_2().
|
static |
Function vect_create_epilog_for_reduction. Create code at the loop-epilog to finalize the result of a reduction computation.
STMT_INFO is the scalar reduction stmt that is being vectorized.
SLP_NODE is an SLP node containing a group of reduction statements. The first one in this group is STMT_INFO.
SLP_NODE_INSTANCE is the SLP node instance containing SLP_NODE.
REDUC_INDEX says which rhs operand of the STMT_INFO is the reduction phi (counting from 0).
LOOP_EXIT is the edge to update in the merge block. In the case of a single exit this edge is always the main loop exit.
This function:
1. Completes the reduction def-use cycles.
2. "Reduces" each vector of partial results VECT_DEFS into a single result, by calling the function specified by REDUC_FN if available, or by other means (whole-vector shifts or a scalar loop).
The function also creates a new phi node at the loop exit to preserve loop-closed form, as illustrated below.
The flow at the entry to this function:
    loop:
      vec_def = phi <vec_init, null>        # REDUCTION_PHI
      VECT_DEF = vector_stmt                # vectorized form of STMT_INFO
      s_loop = scalar_stmt                  # (scalar) STMT_INFO
    loop_exit:
      s_out0 = phi <s_loop>                 # (scalar) EXIT_PHI
      use <s_out0>
      use <s_out0>
The above is transformed by this function into:
    loop:
      vec_def = phi <vec_init, VECT_DEF>    # REDUCTION_PHI
      VECT_DEF = vector_stmt                # vectorized form of STMT_INFO
      s_loop = scalar_stmt                  # (scalar) STMT_INFO
    loop_exit:
      s_out0 = phi <s_loop>                 # (scalar) EXIT_PHI
      v_out1 = phi <VECT_DEF>               # NEW_EXIT_PHI
      v_out2 = reduce <v_out1>
      s_out3 = extract_field <v_out2, 0>
      s_out4 = adjust_result <s_out3>
      use <s_out4>
      use <s_out4>
References add_phi_arg(), as_a(), as_combined_fn(), bitsize_int, bitsize_zero_node, boolean_type_node, build1(), build3(), build_index_vector(), build_int_cst(), build_vector_from_val(), build_zero_cst(), calc_vec_perm_mask_for_shift(), COND_REDUCTION, copy_ssa_name(), create_iv(), create_phi_node(), directly_supported_p(), dump_enabled_p(), dump_printf_loc(), exact_log2(), flow_bb_inside_loop_p(), FOR_EACH_IMM_USE_FAST, FOR_EACH_IMM_USE_ON_STMT, FOR_EACH_IMM_USE_STMT, FOR_EACH_VEC_ELT, gcc_assert, gcc_checking_assert, gcc_unreachable, GET_MODE_NUNITS(), GET_MODE_PRECISION(), get_related_vectype_for_scalar_type(), get_same_sized_vectype(), gimple_assign_rhs1(), gimple_assign_rhs_code(), gimple_assign_set_lhs(), gimple_bb(), gimple_build(), gimple_build_assign(), gimple_build_call_internal(), gimple_build_vector_from_val(), gimple_call_set_lhs(), gimple_convert(), gimple_get_lhs(), gimple_op(), gimple_phi_arg_def(), gimple_phi_num_args(), gimple_seq_last_stmt(), gsi_after_labels(), gsi_insert_before(), gsi_insert_seq_before(), GSI_SAME_STMT, have_whole_vector_shift(), loop::header, i, info_for_reduction(), loop::inner, INTEGER_INDUC_COND_REDUCTION, poly_int< N, C >::is_constant(), is_gimple_debug(), least_common_multiple(), vec_info::lookup_def(), loop_latch_edge(), loop_preheader_edge(), LOOP_VINFO_IV_EXIT, LOOP_VINFO_LOOP, make_ssa_name(), make_unsigned_type(), MSG_NOTE, nested_in_vect_loop_p(), neutral_op_for_reduction(), NULL, NULL_TREE, PHI_RESULT, phis, pow2p_hwi(), hash_map< KeyId, Value, Traits >::put(), REDUC_GROUP_FIRST_ELEMENT, _loop_vec_info::reusable_accumulators, SCALAR_TYPE_MODE, SET_PHI_ARG_DEF, SET_USE, single_imm_use(), single_succ(), single_succ_edge(), single_succ_p(), array_slice< T >::size(), _loop_vec_info::skip_this_loop_edge, SLP_TREE_CHILDREN, SLP_TREE_LANES, SLP_TREE_REPRESENTATIVE, SLP_TREE_SCALAR_STMTS, SLP_TREE_VEC_DEFS, SSA_NAME_DEF_STMT, STMT_VINFO_DEF_TYPE, STMT_VINFO_IN_PATTERN_P, STMT_VINFO_REDUC_CODE, STMT_VINFO_REDUC_DEF, 
STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT, STMT_VINFO_REDUC_FN, STMT_VINFO_REDUC_IDX, STMT_VINFO_REDUC_TYPE, STMT_VINFO_REDUC_VECTYPE, STMT_VINFO_RELATED_STMT, STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL, STMT_VINFO_VEC_STMTS, targetm, poly_int< N, C >::to_constant(), TREE_CODE, TREE_CODE_REDUCTION, tree_to_uhwi(), TREE_TYPE, truth_type_for(), TYPE_MODE, TYPE_SIZE, TYPE_UNSIGNED, TYPE_VECTOR_SUBPARTS(), UNKNOWN_LOCATION, update_stmt(), USE_STMT, vect_create_destination_var(), vect_create_partial_epilog(), vect_double_reduction_def, vect_gen_perm_mask_any(), vect_get_slp_vect_def(), vect_iv_increment_position(), vect_location, vect_orig_stmt(), vect_stmt_to_vectorize(), VECTOR_MODE_P, and VECTOR_TYPE_P.
Referenced by vectorizable_live_operation().
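The "reduce, extract, adjust" tail of the epilogue can be modelled in scalar C. The sketch below imitates the whole-vector-shift strategy mentioned in the description for an addition reduction; the lane count `NUNITS`, the function name and the use of a plain array in place of a vector register are all assumptions made for illustration, not GCC code.

```c
#include <assert.h>

#define NUNITS 8  /* hypothetical number of lanes; must be a power of two */

/* Reduce the NUNITS partial sums in V to a single scalar.  Each step
   adds the upper half of the live lanes onto the lower half, which is
   what a whole-vector shift followed by a vector add would do.  The
   result is then read from lane 0 and ADJUSTMENT is added to it.  */
int reduce_partial_sums(const int v[NUNITS], int adjustment)
{
    assert((NUNITS & (NUNITS - 1)) == 0);

    int tmp[NUNITS];
    for (int i = 0; i < NUNITS; i++)
        tmp[i] = v[i];

    for (int width = NUNITS / 2; width >= 1; width /= 2) {
        /* "Shift" by WIDTH lanes and add: lane i += lane (i + width).  */
        for (int i = 0; i < width; i++)
            tmp[i] += tmp[i + width];
    }

    int s_out3 = tmp[0];          /* extract_field <v_out2, 0> */
    return s_out3 + adjustment;   /* adjust_result <s_out3> */
}
```

With eight lanes this takes three shift-and-add steps rather than seven scalar additions, which is why a log-step scheme of this kind is attractive when no direct reduction instruction is available.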
loop_vec_info vect_create_loop_vinfo | ( | class loop * | loop, |
vec_info_shared * | shared, | ||
const vect_loop_form_info * | info, | ||
loop_vec_info | orig_loop_info ) |
Create a loop_vec_info for LOOP with SHARED and the vect_analyze_loop_form result.
References vect_loop_form_info::assumptions, vect_loop_form_info::conds, estimated_stmt_executions(), i, loop::inner, vect_loop_form_info::inner_loop_cond, integer_onep(), vec_info::lookup_stmt(), vect_loop_form_info::loop_exit, loop_exit_ctrl_vec_info_type, LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_INNER_LOOP_COST_FACTOR, LOOP_VINFO_IV_EXIT, LOOP_VINFO_LOOP_CONDS, LOOP_VINFO_LOOP_IV_COND, LOOP_VINFO_MAIN_LOOP_INFO, LOOP_VINFO_NITERS, LOOP_VINFO_NITERS_ASSUMPTIONS, LOOP_VINFO_NITERS_UNCHANGED, LOOP_VINFO_NITERSM1, LOOP_VINFO_ORIG_LOOP_INFO, vect_loop_form_info::number_of_iterations, vect_loop_form_info::number_of_iterationsm1, wi::smin(), STMT_VINFO_DEF_TYPE, STMT_VINFO_TYPE, generic_wide_int< storage >::to_uhwi(), and vect_condition_def.
Referenced by gather_scalar_reductions(), and vect_analyze_loop_1().
|
static |
Create vector init for vectorized iv.
References build_one_cst(), build_vector_type(), build_zero_cst(), gcc_assert, gcc_unreachable, gimple_build(), gimple_build_vector(), gimple_build_vector_from_val(), gimple_convert(), i, init_expr(), poly_int< N, C >::is_constant(), TREE_TYPE, TYPE_VECTOR_SUBPARTS(), unsigned_type_for(), vect_gen_perm_mask_any(), vect_step_op_mul, vect_step_op_neg, vect_step_op_shl, and vect_step_op_shr.
Referenced by vectorizable_nonlinear_induction().
|
static |
Create vector step for vectorized iv.
References begin(), build_int_cst(), gcc_assert, gimple_build(), i, poly_int< N, C >::is_constant(), wi::mul(), NULL, poly_int< N, C >::to_constant(), wi::to_wide(), TREE_TYPE, vect_step_op_mul, vect_step_op_neg, and wide_int_to_tree().
Referenced by vectorizable_nonlinear_induction().
|
static |
References build_vector_from_val(), CONSTANT_CLASS_P, gcc_assert, NULL, TREE_CODE, unshare_expr(), vect_init_vector(), and vect_step_op_neg.
Referenced by vectorizable_nonlinear_induction().
|
static |
Reduce the vector VEC_DEF down to VECTYPE with reduction operation CODE emitting stmts before GSI. Returns a vector def of VECTYPE.
References bitsize_int, build1(), build3(), build_nonstandard_integer_type(), build_vector_type(), convert_optab_handler(), gcc_assert, get_related_vectype_for_scalar_type(), gimple_build(), gimple_build_assign(), gimple_seq_add_stmt_without_update(), make_ssa_name(), poly_int< N, C >::to_constant(), tree_to_uhwi(), TREE_TYPE, TYPE_MODE, TYPE_SIZE, and TYPE_VECTOR_SUBPARTS().
Referenced by vect_create_epilog_for_reduction(), and vect_transform_cycle_phi().
opt_result vect_determine_partial_vectors_and_peeling | ( | loop_vec_info | loop_vinfo | ) |
Determine if operating on full vectors for LOOP_VINFO might leave some scalar iterations still to do. If so, decide how we should handle those scalar iterations. The possibilities are:
(1) Make LOOP_VINFO operate on partial vectors instead of full vectors. In this case:
    LOOP_VINFO_USING_PARTIAL_VECTORS_P == true
    LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == false
    LOOP_VINFO_PEELING_FOR_NITER == false
(2) Make LOOP_VINFO operate on full vectors and use an epilogue loop to handle the remaining scalar iterations. In this case:
    LOOP_VINFO_USING_PARTIAL_VECTORS_P == false
    LOOP_VINFO_PEELING_FOR_NITER == true
There are two choices:
(2a) Consider vectorizing the epilogue loop at the same VF as the main loop, but using partial vectors instead of full vectors. In this case:
    LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == true
(2b) Consider vectorizing the epilogue loop at lower VFs only. In this case:
    LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P == false
References dump_enabled_p(), dump_printf_loc(), opt_result::failure_at(), LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P, LOOP_VINFO_PEELING_FOR_NITER, LOOP_VINFO_USING_PARTIAL_VECTORS_P, LOOP_VINFO_USING_SELECT_VL_P, MSG_NOTE, opt_result::success(), _loop_vec_info::suggested_unroll_factor, vect_known_niters_smaller_than_vf(), vect_location, and vect_need_peeling_or_partial_vectors_p().
Referenced by vect_analyze_loop_2(), and vect_do_peeling().
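The difference between possibilities (1) and (2) above can be seen in a scalar model of the generated loops. This is a rough sketch only: `VF`, the function names and the per-lane `active` predicate are hypothetical stand-ins for what the vectorizer would express with vector instructions and loop masks or lengths.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor */

/* Possibility (2): full vectors in the main loop, then a scalar
   epilogue for the remaining n % VF iterations.  */
void add_with_epilogue(int *a, const int *b, const int *c, size_t n)
{
    assert(a && b && c);
    size_t i = 0;
    for (; i + VF <= n; i += VF)
        for (size_t lane = 0; lane < VF; lane++)   /* one full vector */
            a[i + lane] = b[i + lane] + c[i + lane];
    for (; i < n; i++)                             /* epilogue loop */
        a[i] = b[i] + c[i];
}

/* Possibility (1): every iteration is a vector iteration; lanes that
   fall past the end of the arrays are switched off by a predicate.  */
void add_with_partial_vectors(int *a, const int *b, const int *c, size_t n)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i += VF)
        for (size_t lane = 0; lane < VF; lane++) {
            int active = (i + lane < n);           /* per-lane mask bit */
            if (active)
                a[i + lane] = b[i + lane] + c[i + lane];
        }
}
```

Both functions compute the same result; the second needs no epilogue loop, at the price of evaluating a predicate for every lane.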
|
static |
Function vect_determine_vectorization_factor. Determine the vectorization factor (VF). VF is the number of data elements that are operated upon in parallel in a single iteration of the vectorized loop. For example, when vectorizing a loop that operates on 4-byte elements, on a target with a vector size (VS) of 16 bytes, the VF is set to 4, since 4 elements can fit in a single vector register. We currently support vectorization of loops in which all types operated upon are of the same size. Therefore this function currently sets VF according to the size of the types operated upon, and fails if there are multiple sizes in the loop. VF is also the factor by which the loop iterations are strip-mined, e.g.:
original loop:
    for (i=0; i<N; i++){
      a[i] = b[i] + c[i];
    }
vectorized loop:
    for (i=0; i<N; i+=VF){
      a[i:VF] = b[i:VF] + c[i:VF];
    }
References dump_dec(), dump_enabled_p(), dump_printf(), dump_printf_loc(), DUMP_VECT_SCOPE, opt_result::failure_at(), gcc_assert, get_vectype_for_scalar_type(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_debug(), known_le, vec_info::lookup_stmt(), LOOP_VINFO_BBS, LOOP_VINFO_LOOP, LOOP_VINFO_VECT_FACTOR, MSG_NOTE, NULL_TREE, loop::num_nodes, PHI_RESULT, si, STMT_VINFO_LIVE_P, STMT_VINFO_RELEVANT_P, STMT_VINFO_VECTYPE, opt_result::success(), TREE_TYPE, TYPE_VECTOR_SUBPARTS(), vect_determine_vf_for_stmt(), vect_location, and vect_update_max_nunits().
Referenced by vect_analyze_loop_2().
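The arithmetic in the description (16-byte vectors, 4-byte elements, VF of 4) and the strip-mined loop shape can be sketched as follows. The helper names are hypothetical, and the sketch assumes the single-element-size case described above with a trip count that is a multiple of VF.

```c
#include <assert.h>
#include <stddef.h>

/* VF for a loop whose elements are all ELEM_BYTES wide, on a target
   whose vector registers are VECTOR_BYTES wide.  */
unsigned vectorization_factor(unsigned vector_bytes, unsigned elem_bytes)
{
    assert(elem_bytes != 0 && vector_bytes % elem_bytes == 0);
    return vector_bytes / elem_bytes;
}

/* Strip-mined form of a[i] = b[i] + c[i].  The inner loop stands in
   for the single vector statement a[i:VF] = b[i:VF] + c[i:VF].  */
void add_strip_mined(int *a, const int *b, const int *c,
                     size_t n, unsigned vf)
{
    assert(vf != 0 && n % vf == 0);
    for (size_t i = 0; i < n; i += vf)
        for (unsigned lane = 0; lane < vf; lane++)
            a[i + lane] = b[i + lane] + c[i + lane];
}
```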
|
static |
Subroutine of vect_determine_vectorization_factor. Set the vector types of STMT_INFO and all attached pattern statements and update the vectorization factor VF accordingly. Return true on success or false if something prevented vectorization.
References dump_enabled_p(), dump_printf_loc(), gsi_end_p(), gsi_next(), gsi_start(), gsi_stmt(), vec_info::lookup_stmt(), MSG_NOTE, si, STMT_VINFO_IN_PATTERN_P, STMT_VINFO_PATTERN_DEF_SEQ, STMT_VINFO_RELATED_STMT, opt_result::success(), vect_determine_vf_for_stmt_1(), and vect_location.
Referenced by vect_determine_vectorization_factor().
|
static |
Subroutine of vect_determine_vf_for_stmt that handles only one statement. VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE may already be set for general statements (not just data refs).
References dump_enabled_p(), dump_printf_loc(), gcc_assert, gimple_clobber_p(), MSG_NOTE, stmt_vectype(), STMT_VINFO_DATA_REF, STMT_VINFO_LIVE_P, STMT_VINFO_RELEVANT_P, STMT_VINFO_VECTYPE, opt_result::success(), vect_get_vector_types_for_stmt(), vect_location, and vect_update_max_nunits().
Referenced by vect_determine_vf_for_stmt().
|
static |
Look for SLP-only access groups and turn each individual access into its own group.
References dr_info::dr, dr_vec_info::dr, DR_GROUP_FIRST_ELEMENT, DR_GROUP_GAP, DR_GROUP_NEXT_ELEMENT, DR_GROUP_SIZE, DR_INIT, DR_IS_WRITE, DR_MISALIGNMENT_UNKNOWN, DR_REF, DR_STMT, DUMP_VECT_SCOPE, FOR_EACH_VEC_ELT, gcc_assert, i, vec_info::lookup_stmt(), LOOP_VINFO_DATAREFS, dr_vec_info::misalignment, NULL, STMT_SLP_TYPE, STMT_VINFO_DR_INFO, STMT_VINFO_GROUPED_ACCESS, STMT_VINFO_SLP_VECT_ONLY, STMT_VINFO_STRIDED_P, dr_vec_info::target_alignment, TREE_INT_CST_LOW, and vect_stmt_to_vectorize().
Referenced by vect_analyze_loop_2().
Return a mask type with twice as many elements as OLD_TYPE, given that it should have mode NEW_MODE.
References build_truth_vector_type_for_mode(), new_mode(), and TYPE_VECTOR_SUBPARTS().
Referenced by supportable_narrowing_operation().
|
static |
SEQ is a sequence of instructions that initialize the reduction described by REDUC_INFO. Emit them in the appropriate place.
References gcc_assert, gsi_insert_seq_before(), gsi_insert_seq_on_edge_immediate(), gsi_last_bb(), GSI_SAME_STMT, loop_preheader_edge(), LOOP_VINFO_LOOP, and _loop_vec_info::skip_main_loop_edge.
Referenced by get_initial_def_for_reduction(), and get_initial_defs_for_reduction().
|
static |
STMT_INFO is a dot-product reduction whose multiplication operands have different signs. Emit a sequence to emulate the operation using a series of signed DOT_PROD_EXPRs and return the last statement generated. VEC_DEST is the result of the vector operation and VOP lists its inputs.
References build_vector_from_val(), gimple_build_assign(), i, wi::lrshift(), make_ssa_name(), signed_type_for(), wi::to_wide(), TREE_TYPE, TYPE_MIN_VALUE, TYPE_UNSIGNED, vect_finish_stmt_generation(), and wide_int_to_tree().
Referenced by vect_transform_reduction().
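One arithmetic identity that makes such an emulation possible for 8-bit inputs is x * y = (x - 128) * y + 64 * y + 64 * y, where x is unsigned and y is signed: x - 128 and 64 both fit in a signed 8-bit lane, so three signed dot products suffice. The sketch below demonstrates that identity only; whether it matches the exact sequence GCC emits is an assumption, and `sdot`, `mixed_dot` and `MAX_LANES` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LANES 16  /* hypothetical upper bound on the vector length */

/* Signed-by-signed dot product accumulated into ACC; stands in for a
   signed DOT_PROD_EXPR.  */
static int32_t sdot(const int8_t *x, const int8_t *y, size_t n, int32_t acc)
{
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)y[i];
    return acc;
}

/* Unsigned-by-signed dot product built only from signed ones.  */
int32_t mixed_dot(const uint8_t *x, const int8_t *y, size_t n, int32_t acc)
{
    assert(n <= MAX_LANES);
    int8_t x_off[MAX_LANES];   /* x - 128, now within the signed range */
    int8_t half[MAX_LANES];    /* the constant 64 in every lane */
    for (size_t i = 0; i < n; i++) {
        x_off[i] = (int8_t)((int)x[i] - 128);
        half[i] = 64;
    }
    acc = sdot(x_off, y, n, acc);   /* (x - 128) * y */
    acc = sdot(half, y, n, acc);    /* + 64 * y */
    acc = sdot(half, y, n, acc);    /* + 64 * y */
    return acc;
}
```

The constant 128 is split into two additions of 64 because 128 itself is not representable in a signed 8-bit element.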
Return true if VECTYPE represents a vector that requires lowering by the vector lowering pass.
References TREE_TYPE, TYPE_MODE, TYPE_PRECISION, VECTOR_BOOLEAN_TYPE_P, and VECTOR_MODE_P.
Referenced by vectorizable_call(), vectorizable_operation(), vectorizable_reduction(), and vectorizable_shift().
|
static |
Loop Vectorization Pass. This pass tries to vectorize loops. For example, the vectorizer transforms the following simple loop:
    short a[N]; short b[N]; short c[N]; int i;
    for (i=0; i<N; i++){
      a[i] = b[i] + c[i];
    }
as if it had been manually vectorized by rewriting the source code into:
    typedef int __attribute__((mode(V8HI))) v8hi;
    short a[N]; short b[N]; short c[N]; int i;
    v8hi *pa = (v8hi*)a, *pb = (v8hi*)b, *pc = (v8hi*)c;
    v8hi va, vb, vc;
    for (i=0; i<N/8; i++){
      vb = pb[i];
      vc = pc[i];
      va = vb + vc;
      pa[i] = va;
    }
The main entry to this pass is vectorize_loops(), in which the vectorizer applies a set of analyses on a given set of loops, followed by the actual vectorization transformation for the loops that had successfully passed the analysis phase. Throughout this pass we make a distinction between two types of data: scalars (which are represented by SSA_NAMES), and memory references ("data-refs"). These two types of data require different handling both during analysis and transformation. The types of data-refs that the vectorizer currently supports are ARRAY_REFS whose base is an array DECL (not a pointer), and INDIRECT_REFS through pointers; both array and pointer accesses are required to have a simple (consecutive) access pattern.
Analysis phase:
===============
The driver for the analysis phase is vect_analyze_loop(). It applies a set of analyses, some of which rely on the scalar evolution analyzer (scev) developed by Sebastian Pop. During the analysis phase the vectorizer records some information per stmt in a "stmt_vec_info" struct which is attached to each stmt in the loop, as well as general information about the loop as a whole, which is recorded in a "loop_vec_info" struct attached to each loop.
Transformation phase:
=====================
The loop transformation phase scans all the stmts in the loop, and creates a vector stmt (or a sequence of stmts) for each scalar stmt S in the loop that needs to be vectorized. It inserts the vector code sequence just before the scalar stmt S, and records a pointer to the vector code in STMT_VINFO_VEC_STMT (stmt_info) (stmt_info is the stmt_vec_info struct attached to S). This pointer will be used for the vectorization of following stmts which use the def of stmt S. Stmt S is removed if it writes to memory; otherwise, we rely on dead code elimination for removing it. For example, say stmt S1 was vectorized into stmt VS1:
    VS1: vb = px[i];
    S1:  b = x[i];    STMT_VINFO_VEC_STMT (stmt_info (S1)) = VS1
    S2:  a = b;
To vectorize stmt S2, the vectorizer first finds the stmt that defines the operand 'b' (S1), and gets the relevant vector def 'vb' from the vector stmt VS1 pointed to by STMT_VINFO_VEC_STMT (stmt_info (S1)). The resulting sequence would be:
    VS1: vb = px[i];
    S1:  b = x[i];    STMT_VINFO_VEC_STMT (stmt_info (S1)) = VS1
    VS2: va = vb;
    S2:  a = b;       STMT_VINFO_VEC_STMT (stmt_info (S2)) = VS2
Operands that are not SSA_NAMEs are data-refs that appear in load/store operations (like 'x[i]' in S1), and are handled differently.
Target modeling:
=================
Currently the only target specific information that is used is the size of the vector (in bytes) - "TARGET_VECTORIZE_UNITS_PER_SIMD_WORD". Targets that can support different sizes of vectors will, for now, need to specify one value for "TARGET_VECTORIZE_UNITS_PER_SIMD_WORD". More flexibility will be added in the future. Since we only vectorize operations whose vector form can be expressed using existing tree codes, to verify that an operation is supported, the vectorizer checks the relevant optab at the relevant machine_mode (e.g., optab_handler (add_optab, V8HImode)). If the value found is CODE_FOR_nothing, then there's no target support, and we can't vectorize the stmt. For additional information on this project see: http://gcc.gnu.org/projects/tree-ssa/vectorization.html
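The manual rewrite shown in the overview can be tried out with the `vector_size` attribute of the GCC and Clang vector extensions. The following is a hedged sketch rather than what the vectorizer emits: it uses `vector_size(16)` instead of the `mode(V8HI)` spelling, loads and stores through `memcpy` so that no alignment is assumed, and requires the element count to be a multiple of 8. The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension: eight 16-bit lanes in 16 bytes.  */
typedef short v8hi __attribute__((vector_size(16)));

/* Scalar form of the loop.  */
void add_scalar(short *a, const short *b, const short *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = (short)(b[i] + c[i]);
}

/* Hand-vectorized form: each iteration handles eight elements.  */
void add_v8hi(short *a, const short *b, const short *c, int n)
{
    assert(n % 8 == 0);
    for (int i = 0; i < n; i += 8) {
        v8hi va, vb, vc;
        memcpy(&vb, b + i, sizeof vb);   /* vb = pb[i] */
        memcpy(&vc, c + i, sizeof vc);   /* vc = pc[i] */
        va = vb + vc;                    /* element-wise add */
        memcpy(a + i, &va, sizeof va);   /* pa[i] = va */
    }
}
```

Comparing the output of `add_scalar` and `add_v8hi` on the same inputs is a quick way to confirm that the two loop forms are equivalent for in-range values.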
Function vect_estimate_min_profitable_iters. Return the number of iterations required for the vector version of the loop to be profitable relative to the cost of the scalar version of the loop. *RET_MIN_PROFITABLE_NITERS is a cost model profitability threshold of iterations for vectorization. A value of -1 means loop vectorization is not profitable. This returned value may be used for a dynamic profitability check. *RET_MIN_PROFITABLE_ESTIMATE is a profitability threshold to be used for a static check against the estimated number of iterations.
References add_stmt_cost(), vector_costs::body_cost(), cond_branch_not_taken, cond_branch_taken, dump_enabled_p(), dump_printf(), dump_printf_loc(), vector_costs::epilogue_cost(), rgroup_controls::factor, vector_costs::finish_cost(), FOR_EACH_VEC_ELT, loop::force_vectorize, gcc_assert, dump_user_location_t::get_location_t(), HOST_WIDE_INT_PRINT_UNSIGNED, i, known_le, LOOP_REQUIRES_VERSIONING, LOOP_REQUIRES_VERSIONING_FOR_ALIAS, LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT, LOOP_REQUIRES_VERSIONING_FOR_NITERS, LOOP_VINFO_CHECK_UNEQUAL_ADDRS, LOOP_VINFO_COMP_ALIAS_DDRS, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_FULLY_MASKED_P, LOOP_VINFO_FULLY_WITH_LENGTH_P, LOOP_VINFO_LENS, LOOP_VINFO_LOOP, LOOP_VINFO_LOWER_BOUNDS, LOOP_VINFO_MASKS, LOOP_VINFO_MAX_VECT_FACTOR, LOOP_VINFO_MAY_MISALIGN_STMTS, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS, LOOP_VINFO_RGROUP_IV_TYPE, LOOP_VINFO_SCALAR_ITERATION_COST, LOOP_VINFO_USING_DECREMENTING_IV_P, LOOP_VINFO_USING_PARTIAL_VECTORS_P, LOOP_VINFO_VECT_FACTOR, MAX, rgroup_controls::max_nscalars_per_iter, MAX_VECTORIZATION_FACTOR, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, NULL_TREE, vector_costs::prologue_cost(), _loop_vec_info::scalar_costs, scalar_stmt, si, vector_costs::suggested_unroll_factor(), vector_costs::total_cost(), TREE_TYPE, rgroup_controls::type, TYPE_PRECISION, unlimited_cost_model(), vect_body, vect_epilogue, vect_get_peel_iters_epilogue(), vect_get_stmt_cost(), vect_known_niters_smaller_than_vf(), vect_location, vect_partial_vectors_avx512, vect_partial_vectors_while_ult, vect_prologue, vect_rgroup_iv_might_wrap_p(), vect_use_loop_mask_for_alignment_p(), vect_vf_for_cost(), _loop_vec_info::vector_costs, vector_stmt, and warning_at().
Referenced by vect_analyze_loop_costing().
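The idea of a profitability threshold can be illustrated with a deliberately simplified cost model. Everything in the sketch below is an assumption made for illustration (one fixed cost per scalar iteration, one per vector iteration, a single outside-of-loop overhead, leftover iterations run as scalar code); the real function draws on many more inputs, as its reference list shows.

```c
#include <assert.h>

/* Return the smallest iteration count N in [1, MAX_NITERS] for which
   the modelled vector cost is strictly lower than the modelled scalar
   cost, or -1 if there is none (vectorization never pays off).  */
int min_profitable_iters(int scalar_iter_cost, int vec_iter_cost,
                         int vec_outside_cost, int vf, int max_niters)
{
    assert(vf > 0);
    for (int n = 1; n <= max_niters; n++) {
        int scalar_total = n * scalar_iter_cost;
        int vector_total = vec_outside_cost
                           + (n / vf) * vec_iter_cost      /* vector body */
                           + (n % vf) * scalar_iter_cost;  /* leftovers */
        if (vector_total < scalar_total)
            return n;
    }
    return -1;
}
```

A threshold computed this way could be compared against a known trip count at compile time, or tested at run time to choose between the scalar and vector versions of the loop.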
|
static |
Successively apply CODE to each element of VECTOR_RHS, in left-to-right order, starting with LHS. Insert the extraction statements before GSI and associate the new scalar SSA names with variable SCALAR_DEST. If MASK is nonzero, mask the input and then operate on it unconditionally. Return the SSA name for the result.
References bitsize_int, build3(), build_vector_from_val(), gimple_assign_set_lhs(), gimple_build_assign(), gsi_insert_before(), GSI_SAME_STMT, make_ssa_name(), make_temp_ssa_name(), neutral_op_for_reduction(), NULL, NULL_TREE, tree_to_uhwi(), TREE_TYPE, and TYPE_SIZE.
Referenced by vectorize_fold_left_reduction().
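A scalar model of this in-order expansion for an addition reduction might look as follows. It is a sketch under stated assumptions: the function name is hypothetical, the mask is a byte array where a null pointer means "no mask", and 0.0 is used as the neutral value for masked-off lanes for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Apply "+" to each element of VECTOR_RHS in left-to-right order,
   starting from LHS.  Masked-off lanes are first replaced by the
   neutral value and then added unconditionally.  */
double fold_left_add(double lhs, const double *vector_rhs,
                     const unsigned char *mask, size_t nunits)
{
    assert(vector_rhs != NULL);
    double result = lhs;
    for (size_t i = 0; i < nunits; i++) {
        double elt = vector_rhs[i];     /* extract element i */
        if (mask != NULL && !mask[i])
            elt = 0.0;                  /* neutral value for "+" */
        result = result + elt;          /* strictly in order */
    }
    return result;
}
```

Keeping the additions in their original order matters for floating-point data, where reassociating the sum can change the rounded result.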
|
static |
See if LOOP_VINFO is an epilogue loop whose main loop had a reduction that REDUC_INFO can build on. Adjust REDUC_INFO and return true if so, otherwise return false.
References as_a(), can_vec_extract(), directly_supported_p(), gcc_assert, hash_map< KeyId, Value, Traits >::get(), get_related_vectype_for_scalar_type(), gimple_bb(), known_gt, LOOP_VINFO_ORIG_LOOP_INFO, _loop_vec_info::main_loop_edge, neutral_op_for_reduction(), num_phis(), operand_equal_p(), PHI_ARG_DEF_FROM_EDGE, vect_reusable_accumulator::reduc_info, vect_reusable_accumulator::reduc_input, _loop_vec_info::reusable_accumulators, _loop_vec_info::skip_main_loop_edge, SSA_NAME_DEF_STMT, STMT_VINFO_REDUC_CODE, STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT, STMT_VINFO_REDUC_TYPE, STMT_VINFO_VECTYPE, TREE_CODE, TREE_CODE_REDUCTION, TREE_TYPE, TYPE_MODE, and TYPE_VECTOR_SUBPARTS().
Referenced by vect_transform_cycle_phi().
|
static |
Transfer group and reduction information from STMT_INFO to its pattern stmt.
References gcc_assert, gcc_checking_assert, REDUC_GROUP_FIRST_ELEMENT, REDUC_GROUP_NEXT_ELEMENT, REDUC_GROUP_SIZE, STMT_VINFO_DEF_TYPE, and STMT_VINFO_RELATED_STMT.
Referenced by vect_fixup_scalar_cycles_with_patterns().
|
static |
Fixup scalar cycles that now have their stmts detected as patterns.
References FOR_EACH_VEC_ELT, i, last, LOOP_VINFO_REDUCTION_CHAINS, loop::next, NULL, REDUC_GROUP_FIRST_ELEMENT, REDUC_GROUP_NEXT_ELEMENT, _loop_vec_info::reductions, STMT_VINFO_DEF_TYPE, STMT_VINFO_IN_PATTERN_P, STMT_VINFO_REDUC_IDX, STMT_VINFO_RELATED_STMT, vect_fixup_reduc_chain(), vect_internal_def, and vect_stmt_to_vectorize().
Referenced by vect_analyze_loop_2().
tree vect_gen_loop_len_mask | ( | loop_vec_info | loop_vinfo, |
gimple_stmt_iterator * | gsi, | ||
gimple_stmt_iterator * | cond_gsi, | ||
vec_loop_lens * | lens, | ||
unsigned int | nvectors, | ||
tree | vectype, | ||
tree | stmt, | ||
unsigned int | index, | ||
unsigned int | factor ) |
Generate the tree for the loop length mask and return it. LENS, NVECTORS, VECTYPE, INDEX and FACTOR are used to generate the length mask as follows: tree len_mask = VCOND_MASK_LEN (compare_mask, ones, zero, len, bias)
References build_all_ones_cst(), build_int_cst(), build_zero_cst(), gimple_build_call_internal(), gimple_call_set_lhs(), gsi_insert_before(), GSI_SAME_STMT, intQI_type_node, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, make_temp_ssa_name(), NULL, TREE_TYPE, and vect_get_loop_len().
Referenced by vectorizable_early_exit().
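The effect of the VCOND_MASK_LEN expression above can be modelled lane by lane. The sketch assumes that the operation selects the first value operand (all ones) in lanes where the compare mask is set and the lane index is below len + bias, and the second value operand (zero) elsewhere; that reading, and the function name `len_mask`, are assumptions rather than something the text above states.

```c
#include <assert.h>
#include <stddef.h>

/* Lane i of OUT is all ones (-1) when COMPARE_MASK[i] is set and
   i < LEN + BIAS; otherwise it is 0.  */
void len_mask(signed char *out, const unsigned char *compare_mask,
              size_t nunits, long len, long bias)
{
    assert(out != NULL && compare_mask != NULL);
    for (size_t i = 0; i < nunits; i++) {
        int in_range = (long)i < len + bias;
        out[i] = (compare_mask[i] && in_range) ? -1 : 0;
    }
}
```

The result combines two conditions in one mask: the comparison outcome for each lane and whether that lane is still within the active length of the loop.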
|
static |
References DECL_P, opt_result::failure_at(), gcc_checking_assert, cgraph_node::get(), get_base_address(), gimple_call_arg(), gimple_call_fndecl(), gimple_call_internal_p(), gimple_call_lhs(), gimple_call_num_args(), gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_stmt(), i, is_gimple_call(), is_gimple_debug(), NULL, NULL_TREE, loop::num_nodes, REFERENCE_CLASS_P, loop::safelen, cgraph_node::simd_clones, opt_result::success(), TREE_CODE, TREE_OPERAND, and vect_find_stmt_data_reference().
Referenced by vect_analyze_loop_2().
int vect_get_known_peeling_cost | ( | loop_vec_info | loop_vinfo, |
int | peel_iters_prologue, | ||
int * | peel_iters_epilogue, | ||
stmt_vector_for_cost * | scalar_cost_vec, | ||
stmt_vector_for_cost * | prologue_cost_vec, | ||
stmt_vector_for_cost * | epilogue_cost_vec ) |
Calculate cost of peeling the loop PEEL_ITERS_PROLOGUE times.
References cond_branch_taken, FOR_EACH_VEC_ELT, LOOP_VINFO_NITERS_KNOWN_P, record_stmt_cost(), si, vect_epilogue, vect_get_peel_iters_epilogue(), and vect_prologue.
Referenced by vect_enhance_data_refs_alignment(), and vect_peeling_hash_get_lowest_cost().
tree vect_get_loop_len | ( | loop_vec_info | loop_vinfo, |
gimple_stmt_iterator * | gsi, | ||
vec_loop_lens * | lens, | ||
unsigned int | nvectors, | ||
tree | vectype, | ||
unsigned int | index, | ||
unsigned int | factor ) |
Given a complete set of lengths LENS, extract length number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Return a value that contains FACTOR multiplied by the number of elements that should be processed. Insert any set-up statements before GSI.
References rgroup_controls::bias_adjusted_ctrl, build_int_cst(), rgroup_controls::controls, rgroup_controls::factor, gcc_assert, gimple_build(), gimple_build_nop(), gsi_insert_seq_before(), GSI_SAME_STMT, i, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, LOOP_VINFO_RGROUP_COMPARE_TYPE, LOOP_VINFO_RGROUP_IV_TYPE, make_temp_ssa_name(), NULL, NULL_TREE, SSA_NAME_DEF_STMT, rgroup_controls::type, and TYPE_VECTOR_SUBPARTS().
Referenced by vect_gen_loop_len_mask(), vect_get_loop_variant_data_ptr_increment(), vect_get_strided_load_store_ops(), vectorizable_call(), vectorizable_condition(), vectorizable_induction(), vectorizable_live_operation_1(), vectorizable_load(), vectorizable_operation(), vectorizable_store(), and vectorize_fold_left_reduction().
tree vect_get_loop_mask | ( | loop_vec_info | loop_vinfo, |
gimple_stmt_iterator * | gsi, | ||
vec_loop_masks * | masks, | ||
unsigned int | nvectors, | ||
tree | vectype, | ||
unsigned int | index ) |
Given a complete set of masks MASKS, extract mask number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Insert any set-up statements before GSI. See the comment above vec_loop_masks for more details about the mask arrangement.
References build_int_cst(), rgroup_controls::controls, rgroup_controls::factor, gcc_assert, gcc_unreachable, GET_MODE_CLASS, gimple_build(), gimple_build_nop(), gimple_convert(), gsi_insert_seq_before(), GSI_SAME_STMT, i, integer_type_node, known_eq, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_VECT_FACTOR, make_temp_ssa_name(), NULL, vec_loop_masks::rgc_vec, SSA_NAME_DEF_STMT, TREE_TYPE, truth_type_for(), rgroup_controls::type, lang_hooks_for_types::type_for_mode, TYPE_MODE, TYPE_VECTOR_SUBPARTS(), lang_hooks::types, vect_partial_vectors_avx512, and vect_partial_vectors_while_ult.
Referenced by vect_transform_reduction(), vectorizable_call(), vectorizable_condition(), vectorizable_early_exit(), vectorizable_live_operation_1(), vectorizable_load(), vectorizable_operation(), vectorizable_simd_clone_call(), vectorizable_store(), and vectorize_fold_left_reduction().
|
static |
Function vect_get_loop_niters. Determine how many times the loop is executed and place the result in NUMBER_OF_ITERATIONS. Place the number of latch iterations in NUMBER_OF_ITERATIONSM1. Place the condition under which the niter information holds in ASSUMPTIONS. Return the loop exit conditions.
References niter_desc::assumptions, tree_niter_desc::assumptions, boolean_true_node, boolean_type_node, build_int_cst(), build_minus_one_cst(), chrec_contains_undetermined(), chrec_dont_know, COMPARISON_CLASS_P, dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, loop::exits, fold_build1, fold_build2, fold_build3, FOR_EACH_VEC_ELT, get_loop_exit_condition(), get_loop_exit_edges(), i, integer_nonzerop(), integer_zerop(), tree_niter_desc::may_be_zero, MSG_NOTE, niter_desc::niter, tree_niter_desc::niter, NULL, NULL_TREE, number_of_iterations_exit_assumptions(), rewrite_to_non_trapping_overflow(), TREE_CODE, TREE_TYPE, unshare_expr(), and vect_location.
Referenced by vect_analyze_loop_form().
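The relationship between the two counts can be seen by instrumenting a loop in bottom-tested form. The sketch assumes that NUMBER_OF_ITERATIONS counts executions of the loop header and is therefore one more than the number of latch iterations; the struct and function names are hypothetical.

```c
#include <assert.h>

struct niter_counts {
    unsigned header;  /* times the loop header (body) runs */
    unsigned latch;   /* times the back edge (latch) is taken */
};

/* Count header and latch executions for a loop that runs N times,
   with N >= 1, written with its exit test at the bottom.  */
struct niter_counts count_iterations(unsigned n)
{
    assert(n >= 1);
    struct niter_counts c = {0, 0};
    unsigned i = 0;
    for (;;) {
        c.header++;      /* loop header and body */
        i++;
        if (i >= n)
            break;       /* exit test */
        c.latch++;       /* back edge taken */
    }
    return c;
}
```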
|
static |
Calculate the maximum number of scalars per iteration for every rgroup in LOOP_VINFO.
References FOR_EACH_VEC_ELT, i, LOOP_VINFO_MASKS, MAX, and rgroup_controls::max_nscalars_per_iter.
Referenced by vect_verify_full_masking().
|
static |
Estimate the number of peeled epilogue iterations for LOOP_VINFO. PEEL_ITERS_PROLOGUE is the number of peeled prologue iterations, or -1 if not known.
References dump_enabled_p(), dump_printf_loc(), LOOP_VINFO_INT_NITERS, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_PEELING_FOR_GAPS, MIN, MSG_NOTE, vect_location, and vect_vf_for_cost().
Referenced by vect_estimate_min_profitable_iters(), and vect_get_known_peeling_cost().
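When the trip count and the prologue peel count are both known, a natural estimate is the remainder left after the prologue, taken modulo the vectorization factor. The sketch below follows that idea; the fallback of half a vector when the prologue count is unknown and the handling of peeling for gaps are assumptions of this sketch, not statements from the description above.

```c
#include <assert.h>

/* Estimate the number of epilogue iterations.  PEEL_ITERS_PROLOGUE is
   -1 when unknown.  PEELING_FOR_GAPS is nonzero when at least one
   scalar iteration must always be left for the epilogue.  */
int peel_iters_epilogue(int niters, int peel_iters_prologue, int vf,
                        int peeling_for_gaps)
{
    assert(vf > 0 && niters >= 0);
    if (peel_iters_prologue < 0)
        return vf / 2;                  /* assumed guess: half a vector */
    if (peel_iters_prologue > niters)
        peel_iters_prologue = niters;   /* cannot peel more than exists */
    int epilogue = (niters - peel_iters_prologue) % vf;
    if (peeling_for_gaps && epilogue == 0)
        epilogue = vf;                  /* keep a full vector's worth */
    return epilogue;
}
```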
Return a mask type with half the number of elements as OLD_TYPE, given that it should have mode NEW_MODE.
References build_truth_vector_type_for_mode(), new_mode(), and TYPE_VECTOR_SUBPARTS().
Referenced by supportable_widening_operation(), and vect_maybe_permute_loop_masks().
|
static |
Return true if PHI, described by STMT_INFO, is the inner PHI in what we are assuming is a double reduction. For example, given a structure like this:
    outer1:
      x_1 = PHI <x_4(outer2), ...>;
      ...
    inner:
      x_2 = PHI <x_1(outer1), ...>;
      ...
      x_3 = ...;
      ...
    outer2:
      x_4 = PHI <x_3(inner)>;
      ...
outer loop analysis would treat x_1 as a double reduction phi and this function would then return true for x_2.
References FOR_EACH_PHI_ARG, vec_info::lookup_def(), SSA_OP_USE, STMT_VINFO_DEF_TYPE, USE_FROM_PTR, and vect_double_reduction_def.
Referenced by vect_analyze_scalar_cycles_1().
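At source level, a loop nest that gives rise to this PHI structure is one in which a single accumulator is updated in the inner loop and carried across the outer loop. A minimal sketch, with a hypothetical function name and a flat row-major array:

```c
#include <assert.h>
#include <stddef.h>

/* Sum all elements of a ROWS x COLS matrix stored row by row.  The
   accumulator x is live around both loops: its outer-loop header PHI
   plays the role of x_1, its inner-loop header PHI the role of x_2,
   and the update in the inner body the role of x_3.  */
int sum_2d(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL || rows * cols == 0);
    int x = 0;
    for (size_t i = 0; i < rows; i++)         /* outer loop */
        for (size_t j = 0; j < cols; j++)     /* inner loop */
            x += a[i * cols + j];
    return x;
}
```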
|
static |
Return true if (a) STMT_INFO is a DOT_PROD_EXPR reduction whose multiplication operands have differing signs and (b) we intend to emulate the operation using a series of signed DOT_PROD_EXPRs. See vect_emulate_mixed_dot_prod for the actual sequence used.
References directly_supported_p(), dyn_cast(), gcc_assert, gimple_assign_rhs1(), gimple_assign_rhs2(), gimple_assign_rhs_code(), optab_vector_mixed_sign, STMT_VINFO_REDUC_VECTYPE_IN, STMT_VINFO_VECTYPE, TREE_TYPE, and TYPE_SIGN.
Referenced by vect_transform_reduction(), and vectorizable_lane_reducing().
|
static |
Function vect_is_nonlinear_iv_evolution. Only nonlinear induction of integer type is supported: 1. neg 2. mul by constant 3. lshift/rshift by constant. For neg induction, return a fake step of integer -1.
References build_int_cst(), gimple_assign_rhs1(), gimple_assign_rhs2(), gimple_assign_rhs_code(), gimple_phi_num_args(), init_expr(), INTEGRAL_TYPE_P, is_gimple_assign(), loop_latch_edge(), loop_preheader_edge(), PHI_ARG_DEF_FROM_EDGE, PHI_RESULT, SSA_NAME_DEF_STMT, STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE, TREE_CODE, TREE_TYPE, vect_step_op_mul, vect_step_op_neg, vect_step_op_shl, and vect_step_op_shr.
Referenced by vect_analyze_scalar_cycles_1().
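The three supported kinds of nonlinear induction correspond to source loops like the ones below. This is an illustrative sketch with hypothetical function names; unsigned types are used for the multiply and shift cases so that wrap-around is well defined in C.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* 1. neg: the variable flips sign every iteration.  */
void fill_neg(int *out, size_t n, int init)
{
    assert(out != NULL || n == 0);
    int x = init;
    for (size_t i = 0; i < n; i++) {
        out[i] = x;
        x = -x;
    }
}

/* 2. mul by constant: the variable is scaled by STEP every iteration.  */
void fill_mul(unsigned *out, size_t n, unsigned init, unsigned step)
{
    assert(out != NULL || n == 0);
    unsigned x = init;
    for (size_t i = 0; i < n; i++) {
        out[i] = x;
        x *= step;
    }
}

/* 3. lshift by constant: the variable is shifted by SHIFT bits every
   iteration (a right shift is analogous).  */
void fill_shl(unsigned *out, size_t n, unsigned init, unsigned shift)
{
    assert(shift < sizeof(unsigned) * CHAR_BIT);
    unsigned x = init;
    for (size_t i = 0; i < n; i++) {
        out[i] = x;
        x <<= shift;
    }
}
```

None of these sequences advances by a constant additive step, which is why they are handled separately from simple (linear) inductions.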
|
static |
Function vect_is_simple_iv_evolution. FORNOW: A simple evolution of an induction variable in the loop is considered a polynomial evolution.
References cfun, dump_enabled_p(), dump_printf_loc(), evolution_part_in_loop_num(), flow_bb_inside_loop_p(), get_loop(), gimple_bb(), init_expr(), initial_condition_in_loop_num(), INTEGRAL_TYPE_P, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL_TREE, SCALAR_FLOAT_TYPE_P, SSA_NAME_DEF_STMT, TREE_CODE, tree_is_chrec(), TREE_TYPE, unshare_expr(), and vect_location.
Referenced by vect_analyze_scalar_cycles_1().
|
static |
Function vect_is_simple_reduction

(1) Detect a cross-iteration def-use cycle that represents a simple reduction computation. We look for the following pattern:

    loop_header:
      a1 = phi < a0, a2 >
      a3 = ...
      a2 = operation (a3, a1)

or

    a3 = ...
    loop_header:
      a1 = phi < a0, a2 >
      a2 = operation (a3, a1)

such that:
1. operation is commutative and associative and it is safe to change the order of the computation
2. no uses for a2 in the loop (a2 is used out of the loop)
3. no uses of a1 in the loop besides the reduction operation
4. no uses of a1 outside the loop.

Conditions 1,4 are tested here. Conditions 2,3 are tested in vect_mark_stmts_to_be_vectorized.

(2) Detect a cross-iteration def-use cycle in nested loops, i.e., nested cycles.

(3) Detect cycles of phi nodes in outer-loop vectorization, i.e., double reductions:

    a1 = phi < a0, a2 >
    inner loop (def of a3)
    a2 = phi < a3 >

(4) Detect condition expressions, ie:

    for (int i = 0; i < N; i++)
      if (a[i] < val)
        ret_val = a[i];
References as_a(), check_reduction_path(), gimple_match_op::code, COND_REDUCTION, CONVERT_EXPR_CODE_P, dump_enabled_p(), dump_printf_loc(), dyn_cast(), flow_bb_inside_loop_p(), flow_loop_nested_p(), FOR_EACH_IMM_USE_FAST, gcc_unreachable, gimple_assign_rhs1_ptr(), gimple_bb(), gimple_call_arg_ptr(), gimple_extract_op(), gimple_get_lhs(), gimple_phi_num_args(), has_single_use(), has_zero_uses(), i, loop::inner, is_a(), is_gimple_assign(), is_gimple_call(), is_gimple_debug(), loop_latch_edge(), LOOP_VINFO_LOOP, LOOP_VINFO_REDUCTION_CHAINS, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, path, PHI_ARG_DEF, PHI_ARG_DEF_FROM_EDGE, PHI_RESULT, REDUC_GROUP_FIRST_ELEMENT, REDUC_GROUP_NEXT_ELEMENT, REDUC_GROUP_SIZE, report_vect_op(), SSA_NAME_DEF_STMT, STMT_VINFO_DEF_TYPE, STMT_VINFO_REDUC_CODE, STMT_VINFO_REDUC_IDX, STMT_VINFO_REDUC_TYPE, TREE_CODE, TREE_CODE_REDUCTION, USE_STMT, vect_double_reduction_def, and vect_location.
Referenced by vect_analyze_scalar_cycles_1().
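Two of the patterns listed for vect_is_simple_reduction can be written as ordinary source loops. The sketch below is illustrative only; the function names `sum_reduction` and `last_below` are hypothetical, and the comments map the variables onto the a0/a1/a2/a3 names used in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pattern (1): a1 = phi <a0, a2>; a2 = operation (a3, a1).
// Integer addition is commutative and associative, so partial sums
// may be computed in a different order.
int sum_reduction(const std::vector<int>& a) {
    int acc = 0;                 // a0, the value entering the loop
    for (int v : a)              // v plays the role of a3
        acc = v + acc;           // a2 = operation (a3, a1)
    return acc;                  // a2 is only used after the loop
}

// Pattern (4): a condition reduction that keeps the last a[i] below val.
int last_below(const std::vector<int>& a, int val, int init) {
    int ret_val = init;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] < val)
            ret_val = a[i];
    return ret_val;
}
```

In both loops the accumulator has no other use inside the loop body, which corresponds to conditions 2 and 3 in the list.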
widest_int vect_iv_limit_for_partial_vectors | ( | loop_vec_info | loop_vinfo | ) |
Decide whether it is possible to use a zero-based induction variable when vectorizing LOOP_VINFO with partial vectors. If it is, return the value that the induction variable must be able to hold in order to ensure that the rgroups eventually have no active vector elements. Return -1 otherwise.
References LOOP_VINFO_LOOP, LOOP_VINFO_MASK_SKIP_NITERS, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_VECT_FACTOR, max_loop_iterations(), wi::to_widest(), TREE_CODE, and vect_max_vf().
Referenced by vect_rgroup_iv_might_wrap_p(), vect_verify_full_masking(), and vect_verify_full_masking_avx512().
|
static |
Decide whether to replace OLD_LOOP_VINFO with NEW_LOOP_VINFO. Return true if we should.
References dump_enabled_p(), dump_printf_loc(), GET_MODE_NAME, MSG_NOTE, vect_better_loop_vinfo_p(), vect_location, and vec_info::vector_mode.
Referenced by vect_analyze_loop().
|
static |
Return true if we know that the iteration count is smaller than the vectorization factor. Return false if it isn't, or if we can't be sure either way.
References LOOP_VINFO_INT_NITERS, LOOP_VINFO_LOOP, LOOP_VINFO_NITERS_KNOWN_P, max_stmt_executions_int(), and vect_vf_for_cost().
Referenced by vect_analyze_loop_costing(), vect_determine_partial_vectors_and_peeling(), and vect_estimate_min_profitable_iters().
|
static |
Kill any debug uses outside LOOP of SSA names defined in STMT_INFO.
References DEF_FROM_PTR, dump_enabled_p(), dump_printf_loc(), flow_bb_inside_loop_p(), FOR_EACH_IMM_USE_STMT, FOR_EACH_PHI_OR_STMT_DEF, gcc_unreachable, gimple_bb(), gimple_debug_bind_p(), gimple_debug_bind_reset_value(), is_gimple_debug(), MSG_NOTE, SSA_OP_DEF, update_stmt(), and vect_location.
Referenced by vect_transform_loop(), and vect_transform_loop_stmt().
|
static |
Calculate the minimum precision necessary to represent: MAX_NITERS * FACTOR as an unsigned integer, where MAX_NITERS is the maximum number of loop header iterations for the original scalar form of LOOP_VINFO.
References LOOP_VINFO_LOOP, LOOP_VINFO_NITERSM1, max_loop_iterations(), wi::min_precision(), wi::smin(), wi::to_widest(), TREE_TYPE, TYPE_MAX_VALUE, and UNSIGNED.
Referenced by vect_verify_full_masking(), and vect_verify_loop_lens().
|
static |
TODO: Close dependency between vect_model_*_cost and vectorizable_* functions. Design better to avoid maintenance issues.
Function vect_model_reduction_cost. Models cost for a reduction operation, including the vector ops generated within the strip-mine loop in some cases, the initial definition before the loop, and the epilogue code that must be generated.
References gimple_match_op::code, COND_REDUCTION, directly_supported_p(), dump_enabled_p(), dump_printf(), exact_log2(), EXTRACT_LAST_REDUCTION, FOLD_LEFT_REDUCTION, gcc_unreachable, gimple_extract_op(), have_whole_vector_shift(), LOOP_VINFO_LOOP, MSG_NOTE, nested_in_vect_loop_p(), NULL, record_stmt_cost(), scalar_stmt, scalar_to_vec, STMT_VINFO_VECTYPE, tree_to_uhwi(), gimple_match_op::type, TYPE_MODE, TYPE_SIZE, vec_to_scalar, vect_body, vect_epilogue, vect_nunits_for_cost(), vect_orig_stmt(), vect_prologue, VECTOR_MODE_P, and vector_stmt.
Referenced by vectorizable_reduction().
|
static |
True if the loop needs peeling or partial vectors when vectorized.
References exact_log2(), likely_max_stmt_executions_int(), LOOP_REQUIRES_VERSIONING, LOOP_VINFO_COST_MODEL_THRESHOLD, LOOP_VINFO_INT_NITERS, LOOP_VINFO_LOOP, LOOP_VINFO_NITERS, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_ORIG_LOOP_INFO, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS, LOOP_VINFO_VECT_FACTOR, and tree_ctz().
Referenced by vect_determine_partial_vectors_and_peeling().
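For the case where the iteration count and the vectorization factor are both known constants, the peeling question reduces to a divisibility check. The following is a simplified model of that case only. It assumes the amount of alignment peeling is known and non-negative, it ignores unknown iteration counts, versioning thresholds and variable-length vectors, and all parameter names are hypothetical.

```cpp
#include <cassert>

// Simplified model, assuming constant niters and VF.
// peel_for_alignment: iterations peeled into a prologue.
// peel_for_gaps: if true, one extra scalar iteration is forced at the end.
bool needs_peeling_or_partial_vectors(unsigned niters, unsigned vf,
                                      unsigned peel_for_alignment,
                                      bool peel_for_gaps) {
    assert(vf > 0);
    unsigned peeled = peel_for_alignment + (peel_for_gaps ? 1u : 0u);
    assert(peeled <= niters);
    // Leftover iterations exist unless the remaining count is a multiple of VF.
    return (niters - peeled) % vf != 0;
}
```

With 17 iterations and VF = 4, one iteration is left over, so an epilogue or partial vectors would be needed unless one iteration is already peeled for alignment.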
tree vect_peel_nonlinear_iv_init | ( | gimple_seq * | stmts, |
tree | init_expr, | ||
tree | skip_niters, | ||
tree | step_expr, | ||
enum vect_induction_op_type | induction_type ) |
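Peel init_expr by skip_niters iterations for induction_type, that is, compute the value the induction variable holds after skip_niters iterations have been skipped.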
References begin(), build_zero_cst(), exp(), wi::from_mpz(), gcc_assert, gcc_unreachable, gimple_build(), gimple_convert(), init_expr(), wi::to_mpz(), wi::to_wide(), TREE_CODE, tree_fits_uhwi_p(), TREE_INT_CST_LOW, tree_to_uhwi(), TREE_TYPE, TYPE_PRECISION, TYPE_SIGN, TYPE_UNSIGNED, UNSIGNED, unsigned_type_for(), vect_step_op_mul, vect_step_op_neg, vect_step_op_shl, vect_step_op_shr, and wide_int_to_tree().
Referenced by vect_update_ivs_after_vectorizer(), and vectorizable_nonlinear_induction().
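Peeling a nonlinear induction amounts to computing its value after a number of skipped iterations without running them. The sketch below models this for unsigned 32-bit values only. The names `StepOp`, `step_once` and `peel_init` are hypothetical, and the handling of signed types and out-of-range shifts in GCC itself is not modeled.

```cpp
#include <cassert>
#include <cstdint>

enum class StepOp { Neg, Mul, Shl, Shr };   // the four nonlinear kinds

// One scalar iteration of the induction (reference behavior).
uint32_t step_once(uint32_t x, uint32_t step, StepOp op) {
    switch (op) {
    case StepOp::Neg: return 0u - x;
    case StepOp::Mul: return x * step;
    case StepOp::Shl: return step >= 32 ? 0u : x << step;
    case StepOp::Shr: return step >= 32 ? 0u : x >> step;
    }
    return x;
}

// Closed form for the value after `skip` iterations.
uint32_t peel_init(uint32_t init, uint32_t skip, uint32_t step, StepOp op) {
    switch (op) {
    case StepOp::Neg:
        return (skip & 1u) ? 0u - init : init;    // only the parity matters
    case StepOp::Mul: {
        uint32_t pow = 1, base = step;            // step^skip modulo 2^32
        for (uint32_t e = skip; e != 0; e >>= 1) {
            if (e & 1u) pow *= base;
            base *= base;
        }
        return init * pow;
    }
    case StepOp::Shl:
    case StepOp::Shr: {
        uint64_t total = static_cast<uint64_t>(step) * skip;
        if (total >= 32) return 0u;               // every bit shifted out
        return op == StepOp::Shl ? init << total : init >> total;
    }
    }
    return init;
}
```

The closed form agrees with repeated application of `step_once` because unsigned multiplication wraps modulo 2^32 and successive shifts by a constant add up.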
|
static |
Returns true if Phi is a first-order recurrence. A first-order recurrence is a non-reduction recurrence relation in which the value of the recurrence in the current loop iteration equals a value defined in the previous iteration.
References flow_bb_inside_loop_p(), FOR_EACH_IMM_USE_FAST, get_vectype_for_scalar_type(), gimple_bb(), gimple_phi_result(), is_a(), is_gimple_debug(), loop::latch, loop_latch_edge(), LOOP_VINFO_LOOP, PHI_ARG_DEF_FROM_EDGE, SSA_NAME_DEF_STMT, SSA_NAME_IS_DEFAULT_DEF, TREE_CODE, TREE_TYPE, USE_STMT, and vect_stmt_dominates_stmt_p().
Referenced by vect_analyze_scalar_cycles_1().
void vect_record_loop_len | ( | loop_vec_info | loop_vinfo, |
vec_loop_lens * | lens, | ||
unsigned int | nvectors, | ||
tree | vectype, | ||
unsigned int | factor ) |
Record that LOOP_VINFO would need LENS to contain a sequence of NVECTORS lengths for controlling an operation on VECTYPE. The operation splits each element of VECTYPE into FACTOR separate subelements, measuring the length as a number of these subelements.
References rgroup_controls::factor, gcc_assert, LOOP_VINFO_VECT_FACTOR, rgroup_controls::max_nscalars_per_iter, rgroup_controls::type, and TYPE_VECTOR_SUBPARTS().
Referenced by check_load_store_for_partial_vectors(), vect_reduction_update_partial_vector_usage(), vectorizable_call(), vectorizable_condition(), vectorizable_early_exit(), vectorizable_live_operation(), and vectorizable_operation().
void vect_record_loop_mask | ( | loop_vec_info | loop_vinfo, |
vec_loop_masks * | masks, | ||
unsigned int | nvectors, | ||
tree | vectype, | ||
tree | scalar_mask ) |
Record that a fully-masked version of LOOP_VINFO would need MASKS to contain a sequence of NVECTORS masks that each control a vector of type VECTYPE. If SCALAR_MASK is nonnull, the fully-masked loop would AND these vector masks with the vector version of SCALAR_MASK.
References hash_set< KeyId, Lazy, Traits >::add(), gcc_assert, vec_loop_masks::mask_set, and _loop_vec_info::scalar_cond_masked_set.
Referenced by check_load_store_for_partial_vectors(), vect_reduction_update_partial_vector_usage(), vectorizable_call(), vectorizable_condition(), vectorizable_early_exit(), vectorizable_live_operation(), vectorizable_operation(), and vectorizable_simd_clone_call().
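The effect of a loop mask, and of ANDing it with a vectorized scalar condition, can be emulated in scalar code. This sketch assumes a fixed vectorization factor of 4 and models each vector as a small array. The names `loop_mask` and `masked_add` are hypothetical and do not correspond to GCC functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;   // assumed vectorization factor

// Lane j of the mask is active when scalar iteration i + j is in range.
std::array<bool, VF> loop_mask(std::size_t i, std::size_t n) {
    std::array<bool, VF> m{};
    for (std::size_t j = 0; j < VF; ++j)
        m[j] = i + j < n;
    return m;
}

// dst[k] = a[k] + b[k] wherever cond[k] is set, with no scalar epilogue.
void masked_add(const std::vector<int>& a, const std::vector<int>& b,
                const std::vector<bool>& cond, std::vector<int>& dst) {
    const std::size_t n = a.size();
    assert(b.size() == n && cond.size() == n && dst.size() == n);
    for (std::size_t i = 0; i < n; i += VF) {
        const std::array<bool, VF> mask = loop_mask(i, n);
        for (std::size_t j = 0; j < VF; ++j)
            // Loop mask ANDed with the vector version of the scalar condition.
            if (mask[j] && cond[i + j])
                dst[i + j] = a[i + j] + b[i + j];
    }
}
```

The final vector iteration runs with some lanes inactive, which is how a fully-masked loop handles an iteration count that is not a multiple of the vectorization factor.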
|
static |
Given an operation with CODE in a loop reduction path whose reduction PHI is specified by REDUC_INFO, where the operation has a scalar result of type TYPE and its input vectype is represented by VECTYPE_IN. The vectype of the vectorized result may differ from VECTYPE_IN in either base type or number of lanes, as is the case for a lane-reducing operation. This function checks whether partial vectorization of the operation is possible in the context of LOOP_VINFO and, if so, how to perform it.
References direct_internal_fn_supported_p(), dump_enabled_p(), dump_printf_loc(), expand_vec_cond_expr_p(), FLOAT_TYPE_P, FOLD_LEFT_REDUCTION, get_conditional_internal_fn(), get_masked_reduction_fn(), HONOR_SIGN_DEPENDENT_ROUNDING(), internal_fn_mask_index(), LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, LOOP_VINFO_LENS, LOOP_VINFO_MASKS, MSG_MISSED_OPTIMIZATION, NULL, OPTIMIZE_FOR_SPEED, STMT_VINFO_REDUC_FN, STMT_VINFO_REDUC_TYPE, truth_type_for(), use_mask_by_cond_expr_p(), vect_get_num_copies(), vect_location, vect_record_loop_len(), and vect_record_loop_mask().
Referenced by vectorizable_lane_reducing(), and vectorizable_reduction().
bool vect_rgroup_iv_might_wrap_p | ( | loop_vec_info | loop_vinfo, |
rgroup_controls * | rgc ) |
For the given rgroup_controls RGC, check whether an induction variable would ever hit a value that produces a set of all-false masks or zero lengths before wrapping around. Return true if it's possible to wrap around before hitting the desirable value, otherwise return false.
References rgroup_controls::factor, LOOP_VINFO_RGROUP_COMPARE_TYPE, rgroup_controls::max_nscalars_per_iter, wi::min_precision(), TYPE_PRECISION, UNSIGNED, and vect_iv_limit_for_partial_vectors().
Referenced by vect_estimate_min_profitable_iters(), and vect_set_loop_condition_partial_vectors().
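The wrap check can be modeled as a precision comparison: the induction variable might wrap if the largest value it must reach needs more bits than the compare type provides. The sketch below is an approximation under stated assumptions. It uses 64-bit arithmetic instead of GCC's wide integers, it represents the "-1" result of vect_iv_limit_for_partial_vectors with a separate flag, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent v as an unsigned integer.
unsigned min_precision(uint64_t v) {
    unsigned bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
}

// limit_known: false models the "-1" (no usable limit) case.
// iv_limit: value the IV must be able to reach.
// nitems: scalars per iteration multiplied by the rgroup factor.
// compare_precision: bits in the IV compare type, assumed <= 64.
bool iv_might_wrap(bool limit_known, uint64_t iv_limit, uint64_t nitems,
                   unsigned compare_precision) {
    assert(compare_precision <= 64);
    if (!limit_known)
        return true;
    if (nitems != 0 && iv_limit > UINT64_MAX / nitems)
        return true;            // product does not even fit in 64 bits
    return min_precision(iv_limit * nitems) > compare_precision;
}
```

For a 16-bit compare type, a limit of 20000 with 4 items per iteration needs 17 bits, so the variable could wrap before the controls become all-false.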
bool vect_transform_cycle_phi | ( | loop_vec_info | loop_vinfo, |
stmt_vec_info | stmt_info, | ||
gimple ** | vec_stmt, | ||
slp_tree | slp_node, | ||
slp_instance | slp_node_instance ) |
Transform phase of a cycle PHI.
References add_phi_arg(), as_a(), build_vector_from_val(), build_vector_type(), build_vector_type_for_mode(), COND_REDUCTION, CONST_COND_REDUCTION, create_phi_node(), EXTRACT_LAST_REDUCTION, FOLD_LEFT_REDUCTION, gcc_assert, get_initial_def_for_reduction(), get_initial_defs_for_reduction(), gimple_convert(), gimple_phi_result(), GSI_CONTINUE_LINKING, gsi_end_p(), gsi_insert_seq_after(), gsi_insert_seq_on_edge_immediate(), gsi_last_bb(), gsi_prev(), gsi_stmt(), loop::header, i, info_for_reduction(), loop::inner, INTEGER_INDUC_COND_REDUCTION, integer_zerop(), loop_preheader_edge(), LOOP_VINFO_LOOP, _loop_vec_info::main_loop_edge, nested_in_vect_loop_p(), neutral_op_for_reduction(), NULL, NULL_TREE, num_phis(), operand_equal_p(), _slp_tree::push_vec_def(), REDUC_GROUP_FIRST_ELEMENT, _loop_vec_info::skip_this_loop_edge, SLP_TREE_CHILDREN, SLP_TREE_LANES, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_SCALAR_STMTS, stmt_ends_bb_p(), STMT_VINFO_DEF_TYPE, STMT_VINFO_FORCE_SINGLE_CYCLE, STMT_VINFO_REDUC_CODE, STMT_VINFO_REDUC_DEF, STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT, STMT_VINFO_REDUC_TYPE, STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, tree_int_cst_lt(), TREE_TYPE, TYPE_MODE, TYPE_VECTOR_SUBPARTS(), UNKNOWN_LOCATION, useless_type_conversion_p(), vect_create_destination_var(), vect_create_partial_epilog(), vect_find_reusable_accumulator(), vect_get_main_loop_result(), vect_get_num_copies(), vect_get_slp_defs(), vect_get_vec_defs_for_operand(), vect_phi_initial_value(), vect_reduction_def, and vect_stmt_to_vectorize().
Referenced by vect_transform_stmt().
class loop * vect_transform_loop | ( | loop_vec_info | loop_vinfo, |
gimple * | loop_vectorized_call ) |
Function vect_transform_loop. The analysis phase has determined that the loop is vectorizable. Vectorize the loop: create vectorized stmts to replace the scalar stmts in the loop, and update the loop exit condition. Returns the scalar epilogue loop, if any.
References advance(), loop::any_estimate, loop::any_likely_upper_bound, loop::any_upper_bound, profile_count::apply_probability(), build_int_cst(), build_one_cst(), build_zero_cst(), vec_info_shared::check_datarefs(), conditional_internal_fn_code(), basic_block_def::count, loop::dont_vectorize, DR_GROUP_FIRST_ELEMENT, dump_enabled_p(), dump_printf(), dump_printf_loc(), DUMP_VECT_SCOPE, dyn_cast(), EDGE_COUNT, fold_build2, FOR_EACH_VEC_ELT, loop::force_vectorize, gcc_assert, GET_MODE_NAME, gimple_build_assign(), gimple_call_arg(), gimple_call_builtin_p(), gimple_call_internal_fn(), gimple_call_internal_p(), gimple_call_num_args(), gimple_clobber_p(), gimple_get_lhs(), gimple_seq_empty_p(), gsi_after_labels(), GSI_CONTINUE_LINKING, gsi_end_p(), gsi_insert_seq_before(), gsi_next(), gsi_remove(), gsi_replace(), gsi_start(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), loop::header, i, loop::inner, integer_onep(), poly_int< N, C >::is_constant(), known_eq, vec_info::lookup_stmt(), loop_niters_no_overflow(), loop_preheader_edge(), LOOP_REQUIRES_VERSIONING, LOOP_VINFO_BBS, LOOP_VINFO_COST_MODEL_THRESHOLD, LOOP_VINFO_DRS_ADVANCED_BY, LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_EPILOGUE_P, LOOP_VINFO_INT_NITERS, LOOP_VINFO_INV_PATTERN_DEF_SEQ, LOOP_VINFO_IV_EXIT, LOOP_VINFO_LOOP, LOOP_VINFO_NITERS, LOOP_VINFO_NITERS_KNOWN_P, LOOP_VINFO_NITERS_UNCHANGED, LOOP_VINFO_NITERSM1, LOOP_VINFO_ORIG_LOOP_INFO, LOOP_VINFO_PEELING_FOR_ALIGNMENT, LOOP_VINFO_PEELING_FOR_GAPS, LOOP_VINFO_SCALAR_IV_EXIT, LOOP_VINFO_SCALAR_LOOP, LOOP_VINFO_SCALAR_LOOP_SCALING, LOOP_VINFO_SLP_INSTANCES, LOOP_VINFO_USING_PARTIAL_VECTORS_P, LOOP_VINFO_VECT_FACTOR, LOOP_VINFO_VERSIONING_THRESHOLD, MAY_HAVE_DEBUG_BIND_STMTS, maybe_flat_loop_profile(), maybe_set_vectorized_backedge_value(), move_early_exit_stmts(), MSG_NOTE, loop::nb_iterations_estimate, loop::nb_iterations_likely_upper_bound, loop::nb_iterations_upper_bound, NULL, NULL_TREE, loop::num_nodes, _loop_vec_info::peeling_for_alignment, 
_loop_vec_info::peeling_for_gaps, basic_block_def::preds, PURE_SLP_STMT, release_defs(), vec_info::remove_stmt(), loop::safelen, scale_loop_frequencies(), scale_profile_for_vect_loop(), vec_info::shared, si, loop::simduid, single_pred_p(), vec_info::slp_instances, split_edge(), split_loop_exit_edge(), STMT_VINFO_DEF_TYPE, STMT_VINFO_GROUPED_ACCESS, STMT_VINFO_IN_PATTERN_P, STMT_VINFO_LIVE_P, STMT_VINFO_PATTERN_DEF_SEQ, STMT_VINFO_RELATED_STMT, STMT_VINFO_RELEVANT_P, STMT_VINFO_VECTYPE, TREE_TYPE, TYPE_VECTOR_SUBPARTS(), wi::udiv_ceil(), wi::udiv_floor(), wi::umin(), unlink_stmt_vdef(), loop::unroll, unshare_expr(), update_epilogue_loop_vinfo(), vect_apply_runtime_profitability_check_p(), vect_build_loop_niters(), vect_do_peeling(), vect_double_reduction_def, vect_first_order_recurrence, vect_free_slp_instance(), vect_gen_vector_loop_niters(), vect_induction_def, vect_internal_def, vect_location, vect_loop_kill_debug_uses(), vect_loop_versioning(), vect_nested_cycle, vect_prepare_for_masked_peels(), vect_reduction_def, vect_remove_stores(), vect_schedule_slp(), vect_set_loop_condition(), vect_transform_loop_stmt(), vect_transform_stmt(), vect_use_loop_mask_for_alignment_p(), vect_vf_for_cost(), vec_info::vector_mode, and VECTOR_TYPE_P.
Referenced by vect_transform_loops().
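At the source level, the outcome of vect_transform_loop resembles a strip-mined main loop with a new exit condition, followed by a scalar epilogue for the leftover iterations. The sketch below assumes a fixed vectorization factor of 4 and uses an inner lane loop to stand in for a single vector statement. The function name `scale_in_place` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;   // assumed vectorization factor

// Multiplies every element by k and returns how many iterations were
// left to the scalar epilogue.
std::size_t scale_in_place(std::vector<int>& a, int k) {
    const std::size_t n = a.size();
    const std::size_t vec_niters = n / VF;    // updated loop exit condition
    std::size_t i = 0;
    for (std::size_t v = 0; v < vec_niters; ++v, i += VF)
        for (std::size_t lane = 0; lane < VF; ++lane)   // one vector stmt
            a[i + lane] *= k;
    const std::size_t epilogue_iters = n - i;
    for (; i < n; ++i)                        // scalar epilogue loop
        a[i] *= k;
    return epilogue_iters;
}
```

With ten elements, the main loop runs twice and the epilogue handles the remaining two elements.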
|
static |
Vectorize STMT_INFO if relevant, inserting any new instructions before GSI. When vectorizing STMT_INFO as a store, set *SEEN_STORE to its stmt_vec_info.
References dump_enabled_p(), dump_printf_loc(), gcc_assert, gimple_call_internal_p(), gimple_call_lhs(), is_gimple_call(), LOOP_VINFO_LOOP, LOOP_VINFO_VECT_FACTOR, MAY_HAVE_DEBUG_BIND_STMTS, MSG_NOTE, NULL, PURE_SLP_STMT, STMT_SLP_TYPE, STMT_VINFO_LIVE_P, STMT_VINFO_RELEVANT_P, STMT_VINFO_VECTYPE, TYPE_VECTOR_SUBPARTS(), vect_location, vect_loop_kill_debug_uses(), and vect_transform_stmt().
Referenced by vect_transform_loop().
bool vect_transform_reduction | ( | loop_vec_info | loop_vinfo, |
stmt_vec_info | stmt_info, | ||
gimple_stmt_iterator * | gsi, | ||
gimple ** | vec_stmt, | ||
slp_tree | slp_node ) |
Transform the definition stmt STMT_INFO of a reduction PHI backedge value.
References as_a(), build_vect_cond_expr(), canonicalize_code(), gimple_match_op::code, commutative_binary_op_p(), conditional_internal_fn_code(), count, dump_enabled_p(), dump_printf_loc(), FOLD_LEFT_REDUCTION, gcc_assert, gcc_unreachable, get_conditional_internal_fn(), gimple_build_assign(), gimple_build_call_internal(), gimple_call_set_lhs(), gimple_call_set_nothrow(), gimple_extract_op(), gimple_get_lhs(), gimple_set_lhs(), i, info_for_reduction(), loop::inner, internal_fn_else_index(), code_helper::is_internal_fn(), code_helper::is_tree_code(), lane_reducing_op_p(), LOOP_VINFO_FULLY_MASKED_P, LOOP_VINFO_LENS, LOOP_VINFO_LOOP, LOOP_VINFO_MASKS, make_ssa_name(), MSG_NOTE, nested_in_vect_loop_p(), NULL_TREE, loop::num, gimple_match_op::num_ops, gimple_match_op::ops, _slp_tree::push_vec_def(), SSA_NAME_DEF_STMT, SSA_NAME_IS_DEFAULT_DEF, STMT_VINFO_DEF_TYPE, STMT_VINFO_FORCE_SINGLE_CYCLE, STMT_VINFO_REDUC_DEF, STMT_VINFO_REDUC_FN, STMT_VINFO_REDUC_IDX, STMT_VINFO_REDUC_TYPE, STMT_VINFO_REDUC_VECTYPE_IN, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, truth_type_for(), gimple_match_op::type, use_mask_by_cond_expr_p(), vect_create_destination_var(), vect_double_reduction_def, vect_emulate_mixed_dot_prod(), vect_finish_stmt_generation(), vect_get_loop_mask(), vect_get_num_copies(), vect_get_vec_defs(), vect_get_vec_defs_for_operand(), vect_is_emulated_mixed_dot_prod(), vect_location, vect_orig_stmt(), and vectorize_fold_left_reduction().
Referenced by vect_transform_stmt().
|
static |
Update the vectorized IV with vect_step, where induc_def is the initial value.
References build_vector_type(), gcc_unreachable, gimple_build(), gimple_convert(), TREE_TYPE, TYPE_VECTOR_SUBPARTS(), unsigned_type_for(), vect_step_op_mul, vect_step_op_neg, vect_step_op_shl, and vect_step_op_shr.
Referenced by vectorizable_nonlinear_induction().
|
static |
Scan the loop stmts and, depending on whether there are any (non-)SLP statements, update the vectorization factor.
References dump_dec(), dump_enabled_p(), dump_printf(), dump_printf_loc(), DUMP_VECT_SCOPE, gcc_assert, gsi_end_p(), gsi_next(), gsi_start_bb(), gsi_start_phis(), gsi_stmt(), i, is_gimple_debug(), known_ne, vec_info::lookup_stmt(), LOOP_VINFO_BBS, LOOP_VINFO_LOOP, LOOP_VINFO_SLP_UNROLLING_FACTOR, LOOP_VINFO_VECT_FACTOR, MSG_NOTE, loop::num_nodes, PURE_SLP_STMT, si, STMT_VINFO_DEF_TYPE, STMT_VINFO_RELEVANT_P, vect_location, vect_stmt_to_vectorize(), and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_analyze_loop_2().
|
static |
Each statement in LOOP_VINFO can be masked where necessary. Check whether we can actually generate the masks required. Return true if so, storing the type of the scalar IV in LOOP_VINFO_RGROUP_COMPARE_TYPE.
References build_nonstandard_integer_type(), can_produce_all_loop_masks_p(), rgroup_controls::factor, FOR_EACH_MODE_IN_CLASS, GET_MODE_BITSIZE(), is_empty(), LOOP_VINFO_MASKS, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_RGROUP_COMPARE_TYPE, LOOP_VINFO_RGROUP_IV_TYPE, LOOP_VINFO_VECT_FACTOR, rgroup_controls::max_nscalars_per_iter, wi::min_precision(), NULL_TREE, opt_mode< T >::require(), vec_loop_masks::rgc_vec, targetm, truth_type_for(), rgroup_controls::type, TYPE_PRECISION, TYPE_VECTOR_SUBPARTS(), UINT_MAX, UNSIGNED, vect_get_max_nscalars_per_iter(), vect_iv_limit_for_partial_vectors(), vect_min_prec_for_max_niters(), and vect_partial_vectors_while_ult.
Referenced by vect_analyze_loop_2().
|
static |
Each statement in LOOP_VINFO can be masked where necessary. Check whether we can actually generate AVX512 style masks. Return true if so, storing the type of the scalar IV in LOOP_VINFO_RGROUP_IV_TYPE.
References rgroup_controls::bias_adjusted_ctrl, build_nonstandard_integer_type(), build_vector_type(), rgroup_controls::compare_type, error_mark_node, expand_vec_cmp_expr_p(), rgroup_controls::factor, FOR_EACH_MODE_IN_CLASS, GET_MODE_BITSIZE(), GET_MODE_CLASS, is_empty(), LOOP_VINFO_MASKS, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_RGROUP_COMPARE_TYPE, LOOP_VINFO_RGROUP_IV_TYPE, LOOP_VINFO_VECT_FACTOR, rgroup_controls::max_nscalars_per_iter, wi::min_precision(), NULL_TREE, release_vec_loop_controls(), opt_mode< T >::require(), vec_loop_masks::rgc_vec, targetm, TREE_TYPE, truth_type_for(), rgroup_controls::type, TYPE_MODE, TYPE_PRECISION, TYPE_VECTOR_SUBPARTS(), UINT_MAX, UNSIGNED, vect_iv_limit_for_partial_vectors(), vect_max_vf(), and vect_partial_vectors_avx512.
Referenced by vect_analyze_loop_2().
|
static |
Check whether we can use vector access with length based on precision comparison. So far, to keep it simple, we only allow the case where the precision of the target-supported length is larger than the precision required by the loop niters.
References BITS_PER_WORD, build_nonstandard_integer_type(), dump_enabled_p(), dump_printf_loc(), opt_mode< T >::exists(), rgroup_controls::factor, FOR_EACH_MODE_IN_CLASS, FOR_EACH_VEC_ELT, gcc_assert, get_len_load_store_mode(), GET_MODE_BITSIZE(), i, internal_len_load_store_bias(), is_empty(), LOOP_VINFO_LENS, LOOP_VINFO_NITERS, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, LOOP_VINFO_PARTIAL_VECTORS_STYLE, LOOP_VINFO_RGROUP_COMPARE_TYPE, LOOP_VINFO_RGROUP_IV_TYPE, MAX, rgroup_controls::max_nscalars_per_iter, MSG_MISSED_OPTIMIZATION, NULL_TREE, opt_mode< T >::require(), targetm, TREE_TYPE, TYPE_PRECISION, vect_location, vect_min_prec_for_max_niters(), VECT_PARTIAL_BIAS_UNSUPPORTED, vect_partial_vectors_len, and vec_info::vector_mode.
Referenced by vect_analyze_loop_2().
bool vectorizable_induction | ( | loop_vec_info | loop_vinfo, |
stmt_vec_info | stmt_info, | ||
gimple ** | vec_stmt, | ||
slp_tree | slp_node, | ||
stmt_vector_for_cost * | cost_vec ) |
Function vectorizable_induction Check if STMT_INFO performs an induction computation that can be vectorized. If VEC_STMT is also passed, vectorize the induction PHI: create a vectorized phi to replace it, put it in VEC_STMT, and add it to the same basic block. Return true if STMT_INFO is vectorizable in this way.
References add_phi_arg(), as_a(), build1(), build_index_vector(), build_int_cst(), build_int_cstu(), build_real_from_wide(), build_vector_from_val(), CONSTANT_CLASS_P, create_phi_node(), directly_supported_p(), dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, dyn_cast(), expr, FLOAT_TYPE_P, flow_bb_inside_loop_p(), fold_convert, FOR_EACH_IMM_USE_FAST, FOR_EACH_VEC_ELT, force_gimple_operand(), gcc_assert, get_same_sized_vectype(), gimple_assign_lhs(), gimple_bb(), gimple_build(), gimple_build_assign(), gimple_build_vector(), gimple_build_vector_from_val(), gimple_convert(), gimple_get_lhs(), gimple_phi_arg_def(), gsi_after_labels(), GSI_CONTINUE_LINKING, gsi_for_stmt(), gsi_insert_on_edge_immediate(), gsi_insert_seq_after(), gsi_insert_seq_before(), gsi_insert_seq_on_edge_immediate(), GSI_SAME_STMT, loop::header, i, induc_vec_info_type, init_expr(), loop::inner, integer_type_node, integer_zerop(), INTEGRAL_TYPE_P, poly_int< N, C >::is_constant(), is_gimple_debug(), least_common_multiple(), vec_info::lookup_stmt(), loop_latch_edge(), loop_preheader_edge(), LOOP_VINFO_LENS, LOOP_VINFO_LOOP, LOOP_VINFO_MASK_SKIP_NITERS, LOOP_VINFO_USING_SELECT_VL_P, LOOP_VINFO_VECT_FACTOR, MSG_MISSED_OPTIMIZATION, MSG_NOTE, nested_in_vect_loop_p(), NULL, NULL_TREE, PHI_ARG_DEF_FROM_EDGE, PHI_RESULT, _slp_tree::push_vec_def(), record_stmt_cost(), SCALAR_FLOAT_TYPE_P, scalar_to_vec, si, SLP_TREE_CHILDREN, SLP_TREE_LANES, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_SCALAR_STMTS, SLP_TREE_VEC_DEFS, SLP_TREE_VECTYPE, SSA_NAME_DEF_STMT, STMT_VINFO_DEF_TYPE, STMT_VINFO_LIVE_P, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE, STMT_VINFO_RELEVANT_P, STMT_VINFO_TYPE, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, TREE_TYPE, type_has_mode_precision_p(), TYPE_VECTOR_SUBPARTS(), UNKNOWN_LOCATION, unshare_expr(), UNSIGNED, USE_STMT, useless_type_conversion_p(), vect_body, vect_get_loop_len(), vect_get_new_ssa_name(), vect_get_new_vect_var(), vect_get_num_copies(), 
vect_get_slp_vect_def(), vect_get_vec_defs_for_operand(), vect_induction_def, vect_init_vector(), vect_location, vect_maybe_update_slp_op_vectype(), vect_phi_initial_value(), vect_prologue, vect_simple_var, vect_step_op_add, vector_stmt, and vectorizable_nonlinear_induction().
Referenced by vect_analyze_loop_operations(), vect_analyze_stmt(), and vect_transform_stmt().
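For a linear induction, the vectorized PHI can be pictured as a vector holding VF consecutive values of the scalar induction variable, advanced each iteration by a splat of VF * step. The sketch below assumes VF = 4 and an iteration count that is a multiple of VF. The names `Vec` and `fill_induction` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;   // assumed vectorization factor
using Vec = std::array<int, VF>;

// Produces out[i] = init + i * step using a vector induction variable.
std::vector<int> fill_induction(int init, int step, std::size_t n) {
    assert(n % VF == 0);
    Vec iv{}, vec_step{};
    for (std::size_t j = 0; j < VF; ++j) {
        iv[j] = init + static_cast<int>(j) * step;    // built in the preheader
        vec_step[j] = static_cast<int>(VF) * step;    // same value in each lane
    }
    std::vector<int> out(n);
    for (std::size_t i = 0; i < n; i += VF) {
        for (std::size_t j = 0; j < VF; ++j)
            out[i + j] = iv[j];                       // use of the vector PHI
        for (std::size_t j = 0; j < VF; ++j)
            iv[j] += vec_step[j];                     // value on the backedge
    }
    return out;
}
```

Each lane advances by the same amount, so the lanes stay exactly one scalar step apart throughout the loop.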
bool vectorizable_lane_reducing | ( | loop_vec_info | loop_vinfo, |
stmt_vec_info | stmt_info, | ||
slp_tree | slp_node, | ||
stmt_vector_for_cost * | cost_vec ) |
Check if STMT_INFO is a lane-reducing operation that can be vectorized in the context of LOOP_VINFO. The vector cost is recorded in COST_VEC, and the analysis is for SLP if SLP_NODE is not NULL.

For a lane-reducing operation, the loop reduction path that it lies in may contain normal operations, or other lane-reducing operations with a different input type size, for example:

    int sum = 0;
    for (i)
      {
        ...
        sum += d0[i] * d1[i];       // dot-prod <vector(16) char>
        sum += w[i];                // widen-sum <vector(16) char>
        sum += abs(s0[i] - s1[i]);  // sad <vector(8) short>
        sum += n[i];                // normal <vector(4) int>
        ...
      }

The vectorization factor is essentially determined by the operation whose input vectype has the most lanes ("vector(16) char" in the example), while we need to choose the input vectype with the fewest lanes ("vector(4) int" in the example) to determine the effective number of vector reduction PHIs.
References dump_enabled_p(), dump_printf(), dump_printf_loc(), gcc_assert, get_vectype_for_scalar_type(), gimple_assign_lhs(), gimple_assign_rhs_code(), gimple_num_ops(), i, INTEGRAL_TYPE_P, lane_reducing_stmt_p(), LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, MSG_MISSED_OPTIMIZATION, MSG_NOTE, record_stmt_cost(), reduc_vec_info_type, scalar_to_vec, STMT_VINFO_DEF_TYPE, STMT_VINFO_REDUC_DEF, STMT_VINFO_REDUC_IDX, STMT_VINFO_REDUC_TYPE, STMT_VINFO_REDUC_VECTYPE_IN, STMT_VINFO_TYPE, TREE_CODE_REDUCTION, TREE_TYPE, type_has_mode_precision_p(), vect_body, vect_get_num_copies(), vect_is_emulated_mixed_dot_prod(), vect_is_simple_use(), vect_location, vect_maybe_update_slp_op_vectype(), vect_orig_stmt(), vect_prologue, vect_reduction_def, vect_reduction_update_partial_vector_usage(), vector_stmt, and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_analyze_stmt().
bool vectorizable_lc_phi | ( | loop_vec_info | loop_vinfo, |
stmt_vec_info | stmt_info, | ||
gimple ** | vec_stmt, | ||
slp_tree | slp_node ) |
Vectorizes LC PHIs.
References add_phi_arg(), create_phi_node(), dump_enabled_p(), dump_printf_loc(), gimple_bb(), gimple_phi_arg_def(), gimple_phi_num_args(), gimple_phi_result(), i, is_a(), lc_phi_info_type, MSG_MISSED_OPTIMIZATION, _slp_tree::push_vec_def(), single_pred_edge(), SLP_TREE_CHILDREN, SLP_TREE_VECTYPE, STMT_VINFO_DEF_TYPE, STMT_VINFO_TYPE, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, UNKNOWN_LOCATION, vect_create_destination_var(), vect_double_reduction_def, vect_get_num_copies(), vect_get_vec_defs(), vect_internal_def, vect_location, and vect_maybe_update_slp_op_vectype().
Referenced by vect_analyze_loop_operations(), vect_analyze_stmt(), and vect_transform_stmt().
bool vectorizable_live_operation | ( | vec_info * | vinfo, |
stmt_vec_info | stmt_info, | ||
slp_tree | slp_node, | ||
slp_instance | slp_node_instance, | ||
int | slp_index, | ||
bool | vec_stmt_p, | ||
stmt_vector_for_cost * | cost_vec ) |
Function vectorizable_live_operation. STMT_INFO computes a value that is used outside the loop. Check if it can be supported.
References as_a(), bitsize_int, build3(), build_nonstandard_integer_type(), build_zero_cst(), can_vec_extract_var_idx_p(), direct_internal_fn_supported_p(), dump_enabled_p(), dump_printf_loc(), dyn_cast(), EXTRACT_LAST_REDUCTION, flow_bb_inside_loop_p(), flow_loop_nested_p(), fold_convert, FOLD_LEFT_REDUCTION, FOR_EACH_IMM_USE_ON_STMT, FOR_EACH_IMM_USE_STMT, force_gimple_operand(), gcc_assert, get_loop_exit_edges(), gimple_bb(), gimple_build_assign(), gimple_get_lhs(), gimple_phi_arg_edge(), gimple_phi_result(), gsi_after_labels(), gsi_for_stmt(), gsi_insert_before(), gsi_insert_seq_after(), gsi_insert_seq_before(), GSI_SAME_STMT, info_for_reduction(), int_const_binop(), is_a(), is_gimple_debug(), is_simple_and_all_uses_invariant(), vec_info::lookup_stmt(), loop_exit_edge_p(), LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_EARLY_BREAKS_VECT_PEELED, LOOP_VINFO_FULLY_MASKED_P, LOOP_VINFO_FULLY_WITH_LENGTH_P, LOOP_VINFO_IV_EXIT, LOOP_VINFO_LENS, LOOP_VINFO_LOOP, LOOP_VINFO_MASKS, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, NULL_TREE, OPTIMIZE_FOR_SPEED, phi_arg_index_from_use(), PURE_SLP_STMT, record_stmt_cost(), REDUC_GROUP_FIRST_ELEMENT, remove_phi_node(), SET_USE, si, SLP_TREE_LANES, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_VEC_DEFS, SLP_TREE_VECTYPE, SSA_NAME_DEF_STMT, SSA_NAME_IS_DEFAULT_DEF, SSA_NAME_OCCURS_IN_ABNORMAL_PHI, STMT_VINFO_DEF_TYPE, STMT_VINFO_LIVE_P, STMT_VINFO_REDUC_DEF, STMT_VINFO_REDUC_TYPE, STMT_VINFO_RELEVANT_P, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, tree_to_uhwi(), TREE_TYPE, TYPE_MODE, TYPE_VECTOR_SUBPARTS(), update_stmt(), vec_to_scalar, vect_create_epilog_for_reduction(), vect_epilogue, vect_get_num_copies(), vect_induction_def, vect_location, vect_orig_stmt(), vect_record_loop_len(), vect_record_loop_mask(), vect_stmt_dominates_stmt_p(), vect_stmt_to_vectorize(), VECTOR_BOOLEAN_TYPE_P, vector_element_bits_tree(), and vectorizable_live_operation_1().
Referenced by can_vectorize_live_stmts(), vect_analyze_loop_operations(), vect_bb_slp_mark_live_stmts(), vect_schedule_slp_node(), and vect_slp_analyze_node_operations_1().
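A live operation is one whose result is read after the loop. In the simplest case, the scalar value that survives the loop corresponds to the last lane of the last vector computed. The sketch below assumes VF = 4 and a non-empty input whose size is a multiple of VF. Both function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;   // assumed vectorization factor

// Scalar form: `last` is live after the loop.
int last_doubled_scalar(const std::vector<int>& a) {
    int last = 0;
    for (int v : a)
        last = 2 * v;
    return last;
}

// Vector-style form: the live value is extracted from the final lane.
int last_doubled_vector(const std::vector<int>& a) {
    assert(!a.empty() && a.size() % VF == 0);
    std::array<int, VF> vec{};              // vectorized definition of `last`
    for (std::size_t i = 0; i < a.size(); i += VF)
        for (std::size_t lane = 0; lane < VF; ++lane)
            vec[lane] = 2 * a[i + lane];
    return vec[VF - 1];                     // lane extract after the loop
}
```

Loops that use masks or lengths need a different extraction, since the last active lane is then not necessarily lane VF - 1.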
|
static |
Function vectorizable_live_operation_1. Helper function for vectorizable_live_operation.
References build3(), build_int_cst(), build_nonstandard_integer_type(), build_one_cst(), copy_ssa_name(), create_phi_node(), fold_convert, force_gimple_operand(), gcc_assert, gimple_build(), gimple_convert(), gimple_phi_num_args(), gimple_seq_add_seq(), gsi_after_labels(), gsi_insert_seq_before(), gsi_last(), GSI_SAME_STMT, i, int_const_binop(), integer_zerop(), LOOP_VINFO_EARLY_BREAKS, LOOP_VINFO_FULLY_MASKED_P, LOOP_VINFO_FULLY_WITH_LENGTH_P, LOOP_VINFO_LENS, LOOP_VINFO_MASKS, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, NULL, NULL_TREE, SET_PHI_ARG_DEF, single_pred_p(), SLP_TREE_LANES, STMT_VINFO_VECTYPE, tree_to_uhwi(), TREE_TYPE, vect_get_loop_len(), vect_get_loop_mask(), and VECTOR_BOOLEAN_TYPE_P.
Referenced by vectorizable_live_operation().
|
static |
Function vectorizable_nonlinear_induction. Check if STMT_INFO performs a nonlinear induction computation that can be vectorized. If VEC_STMT is also passed, vectorize the induction PHI: create a vectorized phi to replace it, put it in VEC_STMT, and add it to the same basic block. Return true if STMT_INFO is vectorizable in this way.
References add_phi_arg(), can_vec_perm_const_p(), create_phi_node(), directly_supported_p(), dump_enabled_p(), dump_printf_loc(), DUMP_VECT_SCOPE, dyn_cast(), fold_convert, gcc_assert, gcc_unreachable, gimple_bb(), gsi_after_labels(), gsi_insert_seq_before(), gsi_insert_seq_on_edge_immediate(), GSI_SAME_STMT, loop::header, i, induc_vec_info_type, init_expr(), INTEGRAL_TYPE_P, poly_int< N, C >::is_constant(), loop_latch_edge(), loop_preheader_edge(), LOOP_VINFO_LOOP, LOOP_VINFO_MASK_SKIP_NITERS, LOOP_VINFO_VECT_FACTOR, maybe_ge, MSG_MISSED_OPTIMIZATION, MSG_NOTE, nested_in_vect_loop_p(), NULL, NULL_TREE, optab_vector, PHI_RESULT, _slp_tree::push_vec_def(), record_stmt_cost(), scalar_to_vec, si, SLP_TREE_LANES, SSA_NAME_DEF_STMT, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE, STMT_VINFO_TYPE, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, tree_fits_uhwi_p(), tree_nop_conversion_p(), tree_to_uhwi(), TREE_TYPE, TYPE_MODE, TYPE_PRECISION, TYPE_VECTOR_SUBPARTS(), UNKNOWN_LOCATION, vect_body, vect_create_nonlinear_iv_init(), vect_create_nonlinear_iv_step(), vect_create_nonlinear_iv_vec_step(), vect_get_new_vect_var(), vect_get_num_copies(), vect_location, vect_peel_nonlinear_iv_init(), vect_phi_initial_value(), vect_prologue, vect_simple_var, vect_step_op_add, vect_step_op_mul, vect_step_op_neg, vect_step_op_shl, vect_step_op_shr, vect_update_nonlinear_iv(), and vector_stmt.
Referenced by vectorizable_induction().
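The step kinds in the reference list above (vect_step_op_mul, vect_step_op_shl, and so on) indicate that the induction variable is updated by something other than a constant addition. The following is a minimal standalone sketch of how a multiplicative induction could be computed several lanes at a time. It does not use GCC internals: the names scalar_mul_iv, vector_mul_iv and the fixed VF of 4 are hypothetical and exist only for illustration. The idea shown is that lane l starts at init * step^l and every lane is then multiplied by step^VF per vector iteration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vectorization factor for this illustration only.
constexpr std::size_t VF = 4;

// Scalar reference: record x, then x *= step (a multiplicative induction).
// Unsigned arithmetic keeps wraparound well defined.
std::vector<std::uint32_t> scalar_mul_iv(std::uint32_t init, std::uint32_t step,
                                         std::size_t n)
{
  std::vector<std::uint32_t> out(n);
  std::uint32_t x = init;
  for (std::size_t i = 0; i < n; ++i)
    {
      out[i] = x;
      x *= step;
    }
  return out;
}

// Lane-wise emulation of the same induction; n must be a multiple of VF.
std::vector<std::uint32_t> vector_mul_iv(std::uint32_t init, std::uint32_t step,
                                         std::size_t n)
{
  assert(n % VF == 0);
  std::array<std::uint32_t, VF> iv{};
  std::uint32_t vec_step = 1;
  std::uint32_t x = init;
  for (std::size_t l = 0; l < VF; ++l)
    {
      iv[l] = x;        // lane l holds init * step^l
      x *= step;
      vec_step *= step; // ends up as step^VF
    }

  std::vector<std::uint32_t> out(n);
  for (std::size_t i = 0; i < n; i += VF)
    for (std::size_t l = 0; l < VF; ++l)
      {
        out[i + l] = iv[l]; // store the current vector of induction values
        iv[l] *= vec_step;  // advance each lane by VF scalar iterations
      }
  return out;
}
```

Calling both functions with the same init, step and a trip count that is a multiple of VF should yield identical sequences. Shift and negation steps could follow the same pattern with a different per-lane initial value and vector step.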
bool vectorizable_phi (vec_info *, stmt_vec_info stmt_info, gimple **vec_stmt, slp_tree slp_node, stmt_vector_for_cost *cost_vec)
Vectorizes PHIs.
References add_phi_arg(), as_a(), create_phi_node(), dump_enabled_p(), dump_printf_loc(), FOR_EACH_VEC_ELT, gcc_assert, gimple_bb(), gimple_phi_arg_edge(), gimple_phi_num_args(), gimple_phi_result(), i, is_a(), is_empty(), MSG_MISSED_OPTIMIZATION, phi_info_type, _slp_tree::push_vec_def(), record_stmt_cost(), SLP_TREE_CHILDREN, SLP_TREE_DEF_TYPE, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_VEC_DEFS, SLP_TREE_VECTYPE, STMT_VINFO_DEF_TYPE, STMT_VINFO_TYPE, UNKNOWN_LOCATION, useless_type_conversion_p(), vect_body, vect_create_destination_var(), vect_get_slp_defs(), vect_internal_def, vect_location, vect_maybe_update_slp_op_vectype(), and vector_stmt.
Referenced by vect_analyze_stmt(), and vect_transform_stmt().
bool vectorizable_recurr (loop_vec_info loop_vinfo, stmt_vec_info stmt_info, gimple **vec_stmt, slp_tree slp_node, stmt_vector_for_cost *cost_vec)
Vectorizes first order recurrences. An overview of the transformation is described below. Suppose we have the following loop.

    int t = 0;
    for (int i = 0; i < n; ++i)
      {
        b[i] = a[i] - t;
        t = a[i];
      }

There is a first-order recurrence on 'a'. For this loop, the scalar IR looks (simplified) like:

    scalar.preheader:
      init = 0;

    scalar.body:
      i = PHI <0(scalar.preheader), i+1(scalar.body)>
      _2 = PHI <init(scalar.preheader), _1(scalar.body)>
      _1 = a[i]
      b[i] = _1 - _2
      if (i < n) goto scalar.body

In this example, _2 is a recurrence because its value depends on the previous iteration. We vectorize this as (VF = 4)

    vector.preheader:
      vect_init = vect_cst(..., ..., ..., 0)

    vector.body:
      i = PHI <0(vector.preheader), i+4(vector.body)>
      vect_1 = PHI <vect_init(vector.preheader), vect_2(vector.body)>
      vect_2 = a[i, i+1, i+2, i+3];
      vect_3 = vec_perm (vect_1, vect_2, { 3, 4, 5, 6 })
      b[i, i+1, i+2, i+3] = vect_2 - vect_3
      if (..) goto vector.body

In this function, vectorizable_recurr, we code generate both the vector PHI node and the permute since those together compute the vectorized value of the scalar PHI. We do not yet have the backedge value to fill in there nor into the vec_perm. Those are filled in maybe_set_vectorized_backedge_value and vect_schedule_scc.

TODO: Since the scalar loop does not have a use of the recurrence outside of the loop the natural way to implement peeling via vectorizing the live value doesn't work. For now peeling of loops with a recurrence is not implemented. For SLP the supported cases are restricted to those requiring a single vector recurrence PHI.
References add_phi_arg(), as_a(), build_vector_from_val(), can_vec_perm_const_p(), create_phi_node(), dump_enabled_p(), dump_printf_loc(), FOR_EACH_VEC_ELT, gimple_bb(), gimple_build_assign(), gimple_convert(), gimple_phi_result(), gsi_for_stmt(), gsi_insert_seq_on_edge_immediate(), gsi_next(), i, is_a(), vec_info::lookup_def(), loop_latch_edge(), loop_preheader_edge(), LOOP_VINFO_LOOP, make_ssa_name(), maybe_gt, MSG_MISSED_OPTIMIZATION, MSG_NOTE, NULL, NULL_TREE, PHI_ARG_DEF_FROM_EDGE, _slp_tree::push_vec_def(), record_stmt_cost(), recurr_info_type, scalar_to_vec, SLP_TREE_CHILDREN, SLP_TREE_LANES, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_VECTYPE, SSA_NAME_DEF_STMT, STMT_VINFO_DEF_TYPE, STMT_VINFO_TYPE, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE, TREE_TYPE, TYPE_MODE, TYPE_VECTOR_SUBPARTS(), types_compatible_p(), UNKNOWN_LOCATION, useless_type_conversion_p(), vect_body, vect_finish_stmt_generation(), vect_first_order_recurrence, vect_gen_perm_mask_checked(), vect_get_new_vect_var(), vect_get_num_copies(), vect_init_vector(), vect_location, vect_maybe_update_slp_op_vectype(), vect_prologue, vect_simple_var, vect_stmt_to_vectorize(), and vector_stmt.
Referenced by vect_analyze_loop_operations(), vect_analyze_stmt(), and vect_transform_stmt().
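The recurrence transformation described for vectorizable_recurr can be emulated in standalone C++ to check that the permute with selector { 3, 4, 5, 6 } reproduces the scalar result. This is only a sketch under stated assumptions: Vec, vec_perm_3456, scalar_recurrence and vector_recurrence are hypothetical names, VF is fixed at 4, and the input length is assumed to be a multiple of VF because, as noted above, peeling is not handled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // matches the VF = 4 example in the text
using Vec = std::array<int, VF>;

// Emulates vec_perm (lo, hi, { 3, 4, 5, 6 }): selector values 0..3 pick
// from lo and 4..7 pick from hi, so the result is the last lane of lo
// followed by the first three lanes of hi.
Vec vec_perm_3456(const Vec &lo, const Vec &hi)
{
  return {lo[3], hi[0], hi[1], hi[2]};
}

// Scalar loop from the description: b[i] = a[i] - t; t = a[i].
std::vector<int> scalar_recurrence(const std::vector<int> &a)
{
  std::vector<int> b(a.size());
  int t = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    {
      b[i] = a[i] - t;
      t = a[i];
    }
  return b;
}

// Vector form: vect_1 is the PHI, vect_2 the load, vect_3 the permute.
std::vector<int> vector_recurrence(const std::vector<int> &a)
{
  assert(a.size() % VF == 0);
  std::vector<int> b(a.size());
  Vec vect_1 = {0, 0, 0, 0}; // vect_init; only the last lane is ever read
  for (std::size_t i = 0; i < a.size(); i += VF)
    {
      Vec vect_2{};
      for (std::size_t l = 0; l < VF; ++l)
        vect_2[l] = a[i + l];
      const Vec vect_3 = vec_perm_3456(vect_1, vect_2);
      for (std::size_t l = 0; l < VF; ++l)
        b[i + l] = vect_2[l] - vect_3[l];
      vect_1 = vect_2; // backedge value of the vector PHI
    }
  return b;
}
```

For any input whose length is a multiple of VF, the two functions should return the same vector, which is the property the permute is meant to guarantee.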
bool vectorizable_reduction (loop_vec_info loop_vinfo, stmt_vec_info stmt_info, slp_tree slp_node, slp_instance slp_node_instance, stmt_vector_for_cost *cost_vec)
Function vectorizable_reduction. Check if STMT_INFO performs a reduction operation that can be vectorized. If VEC_STMT is also passed, vectorize STMT_INFO: create a vectorized stmt to replace it, put it in VEC_STMT, and insert it at GSI. Return true if STMT_INFO is vectorizable in this way.

This function also handles reduction idioms (patterns) that have been recognized in advance during vect_pattern_recog. In this case, STMT_INFO may be of this form:

    X = pattern_expr (arg0, arg1, ..., X)

and its STMT_VINFO_RELATED_STMT points to the last stmt in the original sequence that had been detected and replaced by the pattern-stmt (STMT_INFO).

This function also handles reduction of condition expressions, for example:

    for (int i = 0; i < N; i++)
      if (a[i] < value)
        last = a[i];

This is handled by vectorising the loop and creating an additional vector containing the loop indexes for which "a[i] < value" was true. In the function epilogue this is reduced to a single max value and then used to index into the vector of results.

In some cases of reduction patterns, the type of the reduction variable X is different than the type of the other arguments of STMT_INFO. In such cases, the vectype that is used when transforming STMT_INFO into a vector stmt is different than the vectype that is used to determine the vectorization factor, because it consists of a different number of elements than the actual number of elements that are being operated upon in parallel.

For example, consider an accumulation of shorts into an int accumulator. On some targets it's possible to vectorize this pattern operating on 8 shorts at a time (hence, the vectype for purposes of determining the vectorization factor should be V8HI); on the other hand, the vectype that is used to create the vector form is actually V4SI (the type of the result).

Upon entry to this function, STMT_VINFO_VECTYPE records the vectype that indicates what is the actual level of parallelism (V8HI in the example), so that the right vectorization factor would be derived. This vectype corresponds to the type of arguments to the reduction stmt, and should *NOT* be used to create the vectorized stmt. The right vectype for the vectorized stmt is obtained from the type of the result X:

    get_vectype_for_scalar_type (vinfo, TREE_TYPE (X))

This means that, contrary to "regular" reductions (or "regular" stmts in general), the following equation:

    STMT_VINFO_VECTYPE == get_vectype_for_scalar_type (vinfo, TREE_TYPE (X))

does *NOT* necessarily hold for reduction patterns.
References as_a(), associative_binary_op_p(), boolean_type_node, build_int_cst(), can_duplicate_and_interleave_p(), gimple_match_op::code, commutative_binary_op_p(), COMPARISON_CLASS_P, COND_REDUCTION, conditional_internal_fn_code(), CONST_COND_REDUCTION, CONVERT_EXPR_CODE_P, cycle_phi_info_type, direct_internal_fn_supported_p(), directly_supported_p(), dump_enabled_p(), dump_printf(), dump_printf_loc(), EXTRACT_LAST_REDUCTION, fold_binary, FOLD_LEFT_REDUCTION, fold_left_reduction_fn(), FOR_EACH_VEC_ELT, gcc_assert, gcc_unreachable, GET_MODE_PRECISION(), GET_MODE_SIZE(), get_same_sized_vectype(), get_vectype_for_scalar_type(), wi::geu_p(), gimple_bb(), gimple_extract_op(), gimple_phi_result(), loop::header, i, info_for_reduction(), loop::inner, int_const_binop(), INTEGER_INDUC_COND_REDUCTION, integer_one_node, integer_onep(), integer_zerop(), INTEGRAL_TYPE_P, is_a(), poly_int< N, C >::is_constant(), code_helper::is_internal_fn(), is_nonwrapping_integer_induction(), known_eq, lane_reducing_op_p(), vec_info::lookup_def(), vec_info::lookup_stmt(), loop_latch_edge(), LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, LOOP_VINFO_LOOP, LOOP_VINFO_VECT_FACTOR, make_unsigned_type(), max_loop_iterations(), MSG_MISSED_OPTIMIZATION, MSG_NOTE, needs_fold_left_reduction_p(), nested_in_vect_loop_p(), neutral_op_for_reduction(), NULL, NULL_TREE, gimple_match_op::num_ops, gimple_match_op::ops, optab_vector, OPTIMIZE_FOR_SPEED, PHI_ARG_DEF_FROM_EDGE, PHI_RESULT, POINTER_TYPE_P, record_stmt_cost(), REDUC_GROUP_FIRST_ELEMENT, REDUC_GROUP_NEXT_ELEMENT, reduc_vec_info_type, reduction_fn_for_scalar_code(), SCALAR_FLOAT_TYPE_P, SCALAR_TYPE_MODE, single_imm_use(), SLP_TREE_CHILDREN, SLP_TREE_LANES, SLP_TREE_NUMBER_OF_VEC_STMTS, SLP_TREE_SCALAR_STMTS, SLP_TREE_VECTYPE, STMT_VINFO_DEF_TYPE, STMT_VINFO_FORCE_SINGLE_CYCLE, STMT_VINFO_IN_PATTERN_P, STMT_VINFO_LIVE_P, STMT_VINFO_LOOP_PHI_EVOLUTION_BASE_UNCHANGED, STMT_VINFO_LOOP_PHI_EVOLUTION_PART, STMT_VINFO_REDUC_CODE, STMT_VINFO_REDUC_DEF, 
STMT_VINFO_REDUC_FN, STMT_VINFO_REDUC_IDX, STMT_VINFO_REDUC_TYPE, STMT_VINFO_REDUC_VECTYPE, STMT_VINFO_REDUC_VECTYPE_IN, STMT_VINFO_RELATED_STMT, STMT_VINFO_RELEVANT, STMT_VINFO_STMT, STMT_VINFO_TYPE, STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL, STMT_VINFO_VECTYPE, _loop_vec_info::suggested_unroll_factor, wi::to_widest(), TREE_CODE, TREE_CODE_REDUCTION, tree_int_cst_lt(), tree_int_cst_sgn(), tree_nop_conversion_p(), TREE_TYPE, gimple_match_op::type, type_has_mode_precision_p(), TYPE_MAX_VALUE, TYPE_MIN_VALUE, TYPE_MODE, TYPE_VECTOR_SUBPARTS(), types_compatible_p(), vect_body, vect_can_vectorize_without_simd_p(), vect_constant_def, vect_double_reduction_def, vect_emulated_vector_p(), vect_get_num_copies(), vect_induction_def, vect_internal_def, vect_is_simple_use(), vect_location, vect_maybe_update_slp_op_vectype(), vect_model_reduction_cost(), vect_nested_cycle, vect_orig_stmt(), vect_phi_initial_value(), vect_reduction_def, vect_reduction_update_partial_vector_usage(), vect_stmt_to_vectorize(), vect_unknown_def_type, vect_unused_in_scope, vect_used_in_outer, vect_used_only_live, vector_stmt, and VECTORIZABLE_CYCLE_DEF.
Referenced by vect_analyze_loop_operations(), and vect_analyze_stmt().
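The condition-expression reduction described for vectorizable_reduction (keep a vector of results plus a vector of loop indexes, then reduce the indexes to a maximum in the epilogue) can be illustrated with a small standalone emulation. This sketch makes several assumptions that are not taken from GCC: the function names are hypothetical, VF is fixed at 4, the input length is a multiple of VF, and indexes are stored 1-based so that 0 can stand for "no match in this lane".

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor

// Scalar loop from the description: remember the last a[i] below value.
int scalar_last_below(const std::vector<int> &a, int value, int init)
{
  int last = init;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] < value)
      last = a[i];
  return last;
}

// Lane-wise emulation: each lane tracks its own last match and the
// (1-based) loop index at which that match happened.
int vector_last_below(const std::vector<int> &a, int value, int init)
{
  assert(a.size() % VF == 0);
  std::array<int, VF> results{};         // per-lane last matching value
  std::array<std::size_t, VF> indexes{}; // per-lane index, 0 = no match

  for (std::size_t i = 0; i < a.size(); i += VF)
    for (std::size_t l = 0; l < VF; ++l)
      if (a[i + l] < value)
        {
          results[l] = a[i + l];
          indexes[l] = i + l + 1;
        }

  // Epilogue: reduce the index vector to a single maximum, then use it
  // to select the matching entry in the vector of results.
  const std::size_t max_index
    = *std::max_element(indexes.begin(), indexes.end());
  if (max_index == 0)
    return init; // the condition was never true
  for (std::size_t l = 0; l < VF; ++l)
    if (indexes[l] == max_index)
      return results[l];
  return init; // not reached: max_index always belongs to some lane
}
```

Because each loop index is written by exactly one lane, the maximum index identifies a unique lane, so the selected result matches what the scalar loop would have left in last.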
|
static |
Perform an in-order reduction (FOLD_LEFT_REDUCTION). STMT_INFO is the statement that sets the live-out value. REDUC_DEF_STMT is the phi statement. CODE is the operation performed by STMT_INFO and OPS are its scalar operands. REDUC_INDEX is the index of the operand in OPS that is set by REDUC_DEF_STMT. REDUC_FN is the function that implements in-order reduction, or IFN_LAST if we should open-code it. VECTYPE_IN is the type of the vector input. MASKS specifies the masks that should be used to control the operation in a fully-masked loop.
References binary_op, build_int_cst(), build_minus_one_cst(), build_zero_cst(), conditional_internal_fn_code(), const_unop(), FOR_EACH_VEC_ELT, gcc_assert, gcc_checking_assert, get_masked_reduction_fn(), gimple_build_assign(), gimple_build_call_internal(), gimple_get_lhs(), gimple_phi_result(), gimple_set_lhs(), gsi_for_stmt(), gsi_insert_before(), gsi_remove(), GSI_SAME_STMT, HONOR_SIGN_DEPENDENT_ROUNDING(), HONOR_SIGNED_ZEROS(), i, intQI_type_node, code_helper::is_tree_code(), known_eq, LOOP_VINFO_FULLY_MASKED_P, LOOP_VINFO_FULLY_WITH_LENGTH_P, LOOP_VINFO_LOOP, LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS, make_ssa_name(), merge_with_identity(), nested_in_vect_loop_p(), NULL, NULL_TREE, prepare_vec_mask(), _slp_tree::push_vec_def(), SLP_TREE_CHILDREN, SLP_TREE_SCALAR_STMTS, SSA_NAME_DEF_STMT, STMT_VINFO_VEC_STMTS, STMT_VINFO_VECTYPE, TREE_CODE_LENGTH, TREE_TYPE, truth_type_for(), TYPE_VECTOR_SUBPARTS(), useless_type_conversion_p(), vect_create_destination_var(), vect_expand_fold_left(), vect_finish_replace_stmt(), vect_finish_stmt_generation(), vect_get_loop_len(), vect_get_loop_mask(), vect_get_num_copies(), vect_get_slp_defs(), vect_get_vec_defs_for_operand(), and vect_orig_stmt().
Referenced by vect_transform_reduction().
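An in-order (fold-left) reduction matters when the operation is not associative, as with floating-point addition: the vector lanes have to be folded into the accumulator in the original scalar order. The standalone sketch below illustrates that idea, including a per-lane mask for a partial final vector. It is an illustration only and assumes things not stated in the source: the names scalar_sum and fold_left_sum are hypothetical, VF is fixed at 4, and inactive lanes are simply skipped rather than replaced by an identity value.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical vectorization factor

// Scalar reference: strict left-to-right accumulation.
float scalar_sum(const std::vector<float> &a, float init)
{
  float acc = init;
  for (float x : a)
    acc += x;
  return acc;
}

// Emulated in-order reduction over vectors of VF lanes. The mask marks
// which lanes of the current vector hold real elements, mimicking a
// fully-masked loop whose last iteration is only partially active.
float fold_left_sum(const std::vector<float> &a, float init)
{
  float acc = init;
  for (std::size_t i = 0; i < a.size(); i += VF)
    {
      std::array<float, VF> vec{};
      std::array<bool, VF> mask{};
      for (std::size_t l = 0; l < VF; ++l)
        {
          mask[l] = i + l < a.size();
          if (mask[l])
            vec[l] = a[i + l];
        }
      // Fold lanes into the accumulator strictly from lane 0 upwards,
      // so the order of additions matches the scalar loop.
      for (std::size_t l = 0; l < VF; ++l)
        if (mask[l])
          acc += vec[l];
    }
  return acc;
}
```

Since both functions perform the same additions in the same order, they should agree exactly when compiled without value-changing floating-point options such as -ffast-math. A reduction that kept one partial sum per lane would be free to give a different answer on inputs with large cancellations.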