LCOV - code coverage report
Current view: top level - gcc - bb-reorder.cc (source / functions)
Test:        gcc.info
Test Date:   2024-03-23 14:05:01
Coverage:    Lines: 85.1 % (1025 of 1205) | Functions: 90.2 % (37 of 41) | Branches: - (0 of 0)
Legend:      Lines: hit / not hit | Branches: + taken, - not taken, # not executed

             Branch data     Line data    Source code
       1                 :             : /* Basic block reordering routines for the GNU compiler.
       2                 :             :    Copyright (C) 2000-2024 Free Software Foundation, Inc.
       3                 :             : 
       4                 :             :    This file is part of GCC.
       5                 :             : 
       6                 :             :    GCC is free software; you can redistribute it and/or modify it
       7                 :             :    under the terms of the GNU General Public License as published by
       8                 :             :    the Free Software Foundation; either version 3, or (at your option)
       9                 :             :    any later version.
      10                 :             : 
      11                 :             :    GCC is distributed in the hope that it will be useful, but WITHOUT
      12                 :             :    ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
      13                 :             :    or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
      14                 :             :    License for more details.
      15                 :             : 
      16                 :             :    You should have received a copy of the GNU General Public License
      17                 :             :    along with GCC; see the file COPYING3.  If not see
      18                 :             :    <http://www.gnu.org/licenses/>.  */
      19                 :             : 
      20                 :             : /* This file contains the "reorder blocks" pass, which changes the control
      21                 :             :    flow of a function to encounter fewer branches; the "partition blocks"
      22                 :             :    pass, which divides the basic blocks into "hot" and "cold" partitions,
      23                 :             :    which are kept separate; and the "duplicate computed gotos" pass, which
      24                 :             :    duplicates blocks ending in an indirect jump.
      25                 :             : 
      26                 :             :    There are two algorithms for "reorder blocks": the "simple" algorithm,
      27                 :             :    which just rearranges blocks, trying to minimize the number of executed
      28                 :             :    unconditional branches; and the "software trace cache" algorithm, which
      29                 :             :    also copies code, and in general tries a lot harder to have long linear
      30                 :             :    pieces of machine code executed.  This algorithm is described next.  */
      31                 :             : 
      32                 :             : /* This (greedy) algorithm constructs traces in several rounds.
      33                 :             :    The construction starts from "seeds".  The seed for the first round
       34                 :             :    is the entry point of the function.  When there is more than one seed,
      35                 :             :    the one with the lowest key in the heap is selected first (see bb_to_key).
      36                 :             :    Then the algorithm repeatedly adds the most probable successor to the end
      37                 :             :    of a trace.  Finally it connects the traces.
      38                 :             : 
      39                 :             :    There are two parameters: Branch Threshold and Exec Threshold.
      40                 :             :    If the probability of an edge to a successor of the current basic block is
      41                 :             :    lower than Branch Threshold or its count is lower than Exec Threshold,
      42                 :             :    then the successor will be the seed in one of the next rounds.
      43                 :             :    Each round has these parameters lower than the previous one.
      44                 :             :    The last round has to have these parameters set to zero so that the
      45                 :             :    remaining blocks are picked up.
      46                 :             : 
      47                 :             :    The algorithm selects the most probable successor from all unvisited
      48                 :             :    successors and successors that have been added to this trace.
       49                 :             :    The other successors (that have not been "sent" to the next round) will be
      50                 :             :    other seeds for this round and the secondary traces will start from them.
      51                 :             :    If the successor has not been visited in this trace, it is added to the
      52                 :             :    trace (however, there is some heuristic for simple branches).
      53                 :             :    If the successor has been visited in this trace, a loop has been found.
      54                 :             :    If the loop has many iterations, the loop is rotated so that the source
      55                 :             :    block of the most probable edge going out of the loop is the last block
      56                 :             :    of the trace.
      57                 :             :    If the loop has few iterations and there is no edge from the last block of
      58                 :             :    the loop going out of the loop, the loop header is duplicated.
      59                 :             : 
      60                 :             :    When connecting traces, the algorithm first checks whether there is an edge
      61                 :             :    from the last block of a trace to the first block of another trace.
      62                 :             :    When there are still some unconnected traces it checks whether there exists
      63                 :             :    a basic block BB such that BB is a successor of the last block of a trace
      64                 :             :    and BB is a predecessor of the first block of another trace.  In this case,
      65                 :             :    BB is duplicated, added at the end of the first trace and the traces are
      66                 :             :    connected through it.
       67                 :             :    The remaining traces are simply connected one after another, so there will
       68                 :             :    be a jump to the beginning of each of them.
      69                 :             : 
      70                 :             :    The above description is for the full algorithm, which is used when the
      71                 :             :    function is optimized for speed.  When the function is optimized for size,
      72                 :             :    in order to reduce long jumps and connect more fallthru edges, the
      73                 :             :    algorithm is modified as follows:
       74                 :             :    (1) Break long traces into short ones.  A trace is broken at a block that
       75                 :             :    has multiple predecessors/successors during trace discovery.  When connecting
      76                 :             :    traces, only connect Trace n with Trace n + 1.  This change reduces most
      77                 :             :    long jumps compared with the above algorithm.
      78                 :             :    (2) Ignore the edge probability and count for fallthru edges.
      79                 :             :    (3) Keep the original order of blocks when there is no chance to fall
      80                 :             :    through.  We rely on the results of cfg_cleanup.
      81                 :             : 
       82                 :             :    To implement the change for code size optimization, the block's index is
      83                 :             :    selected as the key and all traces are found in one round.
      84                 :             : 
      85                 :             :    References:
      86                 :             : 
      87                 :             :    "Software Trace Cache"
      88                 :             :    A. Ramirez, J. Larriba-Pey, C. Navarro, J. Torrellas and M. Valero; 1999
      89                 :             :    http://citeseer.nj.nec.com/15361.html
      90                 :             : 
      91                 :             : */
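
The round/seed/trace structure described in the comment above can be condensed into a short standalone sketch.  Everything below (CfgBlock, Edge, find_traces_sketch) is invented for illustration and is not a GCC API; thresholds are written as plain fractions rather than per-mille values, and loop rotation, block copying and the extra hot/cold round are omitted.

#include <deque>
#include <vector>

struct CfgBlock;

struct Edge
{
  CfgBlock *dest;
  double probability;           /* probability of taking this edge */
};

struct CfgBlock
{
  int index;
  double count;                 /* profile count of the block */
  std::vector<Edge> succs;
  int visited_in_trace = -1;    /* -1: not yet placed in any trace */
};

/* Build traces round by round; blocks and edges that fail a round's
   thresholds become seeds of a later, more permissive round.  */
int
find_traces_sketch (CfgBlock *entry)
{
  const double branch_th[] = {0.40, 0.20, 0.10, 0.00};
  const double exec_th[]   = {0.50, 0.20, 0.05, 0.00};
  const int n_rounds = 4;

  std::deque<CfgBlock *> seeds = {entry};
  int n_traces = 0;

  for (int round = 0; round < n_rounds; round++)
    {
      std::deque<CfgBlock *> worklist = std::move (seeds);
      seeds.clear ();

      while (!worklist.empty ())
        {
          CfgBlock *bb = worklist.front ();
          worklist.pop_front ();
          if (bb->visited_in_trace >= 0)
            continue;

          /* Not hot enough for this round: retry in a later round.  */
          if (round < n_rounds - 1
              && bb->count < exec_th[round] * entry->count)
            {
              seeds.push_back (bb);
              continue;
            }

          /* Grow a new trace greedily along the most probable edges.  */
          n_traces++;
          while (bb)
            {
              bb->visited_in_trace = n_traces;
              CfgBlock *best = nullptr;
              double best_prob = 0.0;
              for (const Edge &e : bb->succs)
                {
                  if (e.dest->visited_in_trace >= 0)
                    continue;
                  if (e.probability < branch_th[round]
                      || e.dest->count < exec_th[round] * entry->count)
                    seeds.push_back (e.dest);       /* seed of a later round */
                  else if (e.probability > best_prob)
                    {
                      if (best)
                        worklist.push_back (best);  /* seed of this round */
                      best = e.dest;
                      best_prob = e.probability;
                    }
                  else
                    worklist.push_back (e.dest);    /* seed of this round */
                }
              bb = best;
            }
        }
    }
  return n_traces;
}

In the real pass the seeds live in a Fibonacci heap keyed by bb_to_key, the last round is reserved for the cold partition when hot/cold partitioning is enabled, and a successor already visited in the current trace triggers loop rotation or header duplication; see find_traces and find_traces_1_round below.
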
      92                 :             : 
      93                 :             : #include "config.h"
      94                 :             : #include "system.h"
      95                 :             : #include "coretypes.h"
      96                 :             : #include "backend.h"
      97                 :             : #include "target.h"
      98                 :             : #include "rtl.h"
      99                 :             : #include "tree.h"
     100                 :             : #include "cfghooks.h"
     101                 :             : #include "df.h"
     102                 :             : #include "memmodel.h"
     103                 :             : #include "optabs.h"
     104                 :             : #include "regs.h"
     105                 :             : #include "emit-rtl.h"
     106                 :             : #include "output.h"
     107                 :             : #include "expr.h"
     108                 :             : #include "tree-pass.h"
     109                 :             : #include "cfgrtl.h"
     110                 :             : #include "cfganal.h"
     111                 :             : #include "cfgbuild.h"
     112                 :             : #include "cfgcleanup.h"
     113                 :             : #include "bb-reorder.h"
     114                 :             : #include "except.h"
     115                 :             : #include "alloc-pool.h"
     116                 :             : #include "fibonacci_heap.h"
     117                 :             : #include "stringpool.h"
     118                 :             : #include "attribs.h"
     119                 :             : #include "common/common-target.h"
     120                 :             : 
     121                 :             : /* The number of rounds.  In most cases there will only be 4 rounds, but
     122                 :             :    when partitioning hot and cold basic blocks into separate sections of
     123                 :             :    the object file there will be an extra round.  */
     124                 :             : #define N_ROUNDS 5
     125                 :             : 
     126                 :             : struct target_bb_reorder default_target_bb_reorder;
     127                 :             : #if SWITCHABLE_TARGET
     128                 :             : struct target_bb_reorder *this_target_bb_reorder = &default_target_bb_reorder;
     129                 :             : #endif
     130                 :             : 
     131                 :             : #define uncond_jump_length \
     132                 :             :   (this_target_bb_reorder->x_uncond_jump_length)
     133                 :             : 
     134                 :             : /* Branch thresholds in thousandths (per mille) of the REG_BR_PROB_BASE.  */
     135                 :             : static const int branch_threshold[N_ROUNDS] = {400, 200, 100, 0, 0};
     136                 :             : 
     137                 :             : /* Exec thresholds in thousandths (per mille) of the count of bb 0.  */
     138                 :             : static const int exec_threshold[N_ROUNDS] = {500, 200, 50, 0, 0};
     139                 :             : 
      140                 :             : /* If an edge's count is lower than DUPLICATION_THRESHOLD per mille of the
      141                 :             :    entry block's count, the edge destination is not duplicated while connecting traces.  */
     142                 :             : #define DUPLICATION_THRESHOLD 100
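
To make the per-mille convention concrete, the small standalone program below mirrors the per-round scaling performed in find_traces () further down (REG_BR_PROB_BASE * branch_threshold[i] / 1000 and max_entry_count.apply_scale (exec_threshold[i], 1000)).  It is illustrative only: it uses plain integers, a made-up entry count of 1,000,000, and assumes REG_BR_PROB_BASE is 10000.

#include <cstdio>

int main ()
{
  const int reg_br_prob_base = 10000;           /* stand-in for REG_BR_PROB_BASE */
  const int branch_threshold[5] = {400, 200, 100, 0, 0};
  const int exec_threshold[5]   = {500, 200, 50, 0, 0};
  const long long max_entry_count = 1000000;    /* hypothetical entry block count */

  for (int i = 0; i < 5; i++)
    {
      int branch_th = reg_br_prob_base * branch_threshold[i] / 1000;
      long long count_th = max_entry_count * exec_threshold[i] / 1000;
      std::printf ("round %d: branch_th = %d/%d (%d%%), count_th = %lld\n",
                   i + 1, branch_th, reg_br_prob_base,
                   branch_threshold[i] / 10, count_th);
    }
  return 0;
}

So in round 1 an edge needs a branch probability of at least 40% and a count of at least half the entry count to extend a trace; in the final rounds the thresholds drop to zero so every remaining block is picked up.
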
     143                 :             : 
     144                 :             : typedef fibonacci_heap <long, basic_block_def> bb_heap_t;
     145                 :             : typedef fibonacci_node <long, basic_block_def> bb_heap_node_t;
     146                 :             : 
     147                 :             : /* Structure to hold needed information for each basic block.  */
     148                 :             : struct bbro_basic_block_data
     149                 :             : {
     150                 :             :   /* Which trace is the bb start of (-1 means it is not a start of any).  */
     151                 :             :   int start_of_trace;
     152                 :             : 
     153                 :             :   /* Which trace is the bb end of (-1 means it is not an end of any).  */
     154                 :             :   int end_of_trace;
     155                 :             : 
     156                 :             :   /* Which trace is the bb in?  */
     157                 :             :   int in_trace;
     158                 :             : 
     159                 :             :   /* Which trace was this bb visited in?  */
     160                 :             :   int visited;
     161                 :             : 
     162                 :             :   /* Cached maximum frequency of interesting incoming edges.
     163                 :             :      Minus one means not yet computed.  */
     164                 :             :   int priority;
     165                 :             : 
     166                 :             :   /* Which heap is BB in (if any)?  */
     167                 :             :   bb_heap_t *heap;
     168                 :             : 
     169                 :             :   /* Which heap node is BB in (if any)?  */
     170                 :             :   bb_heap_node_t *node;
     171                 :             : };
     172                 :             : 
     173                 :             : /* The current size of the following dynamic array.  */
     174                 :             : static int array_size;
     175                 :             : 
     176                 :             : /* The array which holds needed information for basic blocks.  */
     177                 :             : static bbro_basic_block_data *bbd;
     178                 :             : 
      179                 :             : /* To avoid frequent reallocation, the size of the array is greater than needed;
      180                 :             :    the number of elements is (not less than) 1.25 * size_wanted.  */
     181                 :             : #define GET_ARRAY_SIZE(X) ((((X) / 4) + 1) * 5)
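
A quick arithmetic check of the growth formula; GET_ARRAY_SIZE_EX below merely restates the macro above in a standalone, illustrative program.

#include <cassert>

#define GET_ARRAY_SIZE_EX(X) ((((X) / 4) + 1) * 5)

int main ()
{
  /* ((X / 4) + 1) * 5 == 5 * (X / 4) + 5, which is always at least 1.25 * X.  */
  assert (GET_ARRAY_SIZE_EX (4) == 10);       /* 1.25 * 4    = 5    */
  assert (GET_ARRAY_SIZE_EX (100) == 130);    /* 1.25 * 100  = 125  */
  assert (GET_ARRAY_SIZE_EX (1000) == 1255);  /* 1.25 * 1000 = 1250 */
  return 0;
}
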
     182                 :             : 
     183                 :             : /* Free the memory and set the pointer to NULL.  */
     184                 :             : #define FREE(P) (gcc_assert (P), free (P), P = 0)
     185                 :             : 
     186                 :             : /* Structure for holding information about a trace.  */
     187                 :             : struct trace
     188                 :             : {
     189                 :             :   /* First and last basic block of the trace.  */
     190                 :             :   basic_block first, last;
     191                 :             : 
      192                 :             :   /* The round of STC creation in which this trace was found.  */
     193                 :             :   int round;
     194                 :             : 
     195                 :             :   /* The length (i.e. the number of basic blocks) of the trace.  */
     196                 :             :   int length;
     197                 :             : };
     198                 :             : 
     199                 :             : /* Maximum count of one of the entry blocks.  */
     200                 :             : static profile_count max_entry_count;
     201                 :             : 
     202                 :             : /* Local function prototypes.  */
     203                 :             : static void find_traces_1_round (int, profile_count, struct trace *, int *,
     204                 :             :                                  int, bb_heap_t **, int);
     205                 :             : static basic_block copy_bb (basic_block, edge, basic_block, int);
     206                 :             : static long bb_to_key (basic_block);
     207                 :             : static bool better_edge_p (const_basic_block, const_edge, profile_probability,
     208                 :             :                            profile_count, profile_probability, profile_count,
     209                 :             :                            const_edge);
     210                 :             : static bool copy_bb_p (const_basic_block, int);
     211                 :             : 
     212                 :             : /* Return the trace number in which BB was visited.  */
     213                 :             : 
     214                 :             : static int
     215                 :    33410615 : bb_visited_trace (const_basic_block bb)
     216                 :             : {
     217                 :    33410615 :   gcc_assert (bb->index < array_size);
     218                 :    33410615 :   return bbd[bb->index].visited;
     219                 :             : }
     220                 :             : 
      221                 :             : /* This function marks BB as visited in trace number TRACE.  */
     222                 :             : 
     223                 :             : static void
     224                 :     8942837 : mark_bb_visited (basic_block bb, int trace)
     225                 :             : {
     226                 :     8942837 :   bbd[bb->index].visited = trace;
     227                 :     8942837 :   if (bbd[bb->index].heap)
     228                 :             :     {
     229                 :      356505 :       bbd[bb->index].heap->delete_node (bbd[bb->index].node);
     230                 :      356505 :       bbd[bb->index].heap = NULL;
     231                 :      356505 :       bbd[bb->index].node = NULL;
     232                 :             :     }
     233                 :     8942837 : }
     234                 :             : 
      235                 :             : /* Check to see if BB should be pushed into the next round of trace
      236                 :             :    collection or not.  Reasons for pushing the block forward are:
      237                 :             :    1) the block is cold, we are doing partitioning, and there will be
      238                 :             :    another round (cold partition blocks are not supposed to be
      239                 :             :    collected into traces until the very last round); or 2) there will
      240                 :             :    be another round, and the basic block is not "hot enough" for the
      241                 :             :    current round of trace collection.  */
     242                 :             : 
     243                 :             : static bool
     244                 :     7846028 : push_to_next_round_p (const_basic_block bb, int round, int number_of_rounds,
     245                 :             :                       profile_count count_th)
     246                 :             : {
     247                 :     7846028 :   bool there_exists_another_round;
     248                 :     7846028 :   bool block_not_hot_enough;
     249                 :             : 
     250                 :     7846028 :   there_exists_another_round = round < number_of_rounds - 1;
     251                 :             : 
     252                 :     7846028 :   block_not_hot_enough = (bb->count < count_th
     253                 :     7846028 :                           || probably_never_executed_bb_p (cfun, bb));
     254                 :             : 
     255                 :     7846028 :   if (there_exists_another_round
     256                 :     7846028 :       && block_not_hot_enough)
     257                 :             :     return true;
     258                 :             :   else
     259                 :     4629566 :     return false;
     260                 :             : }
     261                 :             : 
      262                 :             : /* Find the traces for the Software Trace Cache.  Chain the blocks of each
      263                 :             :    trace through their aux fields.  Store the number of traces in *N_TRACES
      264                 :             :    and the description of the traces in TRACES.  */
     265                 :             : 
     266                 :             : static void
     267                 :      570343 : find_traces (int *n_traces, struct trace *traces)
     268                 :             : {
     269                 :      570343 :   int i;
     270                 :      570343 :   int number_of_rounds;
     271                 :      570343 :   edge e;
     272                 :      570343 :   edge_iterator ei;
     273                 :      570343 :   bb_heap_t *heap = new bb_heap_t (LONG_MIN);
     274                 :             : 
     275                 :             :   /* Add one extra round of trace collection when partitioning hot/cold
     276                 :             :      basic blocks into separate sections.  The last round is for all the
     277                 :             :      cold blocks (and ONLY the cold blocks).  */
     278                 :             : 
     279                 :      570343 :   number_of_rounds = N_ROUNDS - 1;
     280                 :             : 
     281                 :             :   /* Insert entry points of function into heap.  */
     282                 :      570343 :   max_entry_count = profile_count::zero ();
     283                 :     1140686 :   FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
     284                 :             :     {
     285                 :      570343 :       bbd[e->dest->index].heap = heap;
     286                 :      570343 :       bbd[e->dest->index].node = heap->insert (bb_to_key (e->dest), e->dest);
     287                 :      570343 :       if (e->dest->count > max_entry_count)
     288                 :      569516 :         max_entry_count = e->dest->count;
     289                 :             :     }
     290                 :             : 
     291                 :             :   /* Find the traces.  */
     292                 :     2851715 :   for (i = 0; i < number_of_rounds; i++)
     293                 :             :     {
     294                 :     2281372 :       profile_count count_threshold;
     295                 :             : 
     296                 :     2281372 :       if (dump_file)
     297                 :          92 :         fprintf (dump_file, "STC - round %d\n", i + 1);
     298                 :             : 
     299                 :     2281372 :       count_threshold = max_entry_count.apply_scale (exec_threshold[i], 1000);
     300                 :             : 
     301                 :     2281372 :       find_traces_1_round (REG_BR_PROB_BASE * branch_threshold[i] / 1000,
     302                 :             :                            count_threshold, traces, n_traces, i, &heap,
     303                 :             :                            number_of_rounds);
     304                 :             :     }
     305                 :      570343 :   delete heap;
     306                 :             : 
     307                 :      570343 :   if (dump_file)
     308                 :             :     {
     309                 :         111 :       for (i = 0; i < *n_traces; i++)
     310                 :             :         {
     311                 :          88 :           basic_block bb;
     312                 :          88 :           fprintf (dump_file, "Trace %d (round %d):  ", i + 1,
     313                 :          88 :                    traces[i].round + 1);
     314                 :          88 :           for (bb = traces[i].first;
     315                 :         184 :                bb != traces[i].last;
     316                 :          96 :                bb = (basic_block) bb->aux)
     317                 :             :             {
     318                 :          96 :               fprintf (dump_file, "%d [", bb->index);
     319                 :          96 :               bb->count.dump (dump_file);
     320                 :          96 :               fprintf (dump_file, "] ");
     321                 :             :             }
     322                 :          88 :           fprintf (dump_file, "%d [", bb->index);
     323                 :          88 :           bb->count.dump (dump_file);
     324                 :          88 :           fprintf (dump_file, "]\n");
     325                 :             :         }
     326                 :          23 :       fflush (dump_file);
     327                 :             :     }
     328                 :      570343 : }
     329                 :             : 
     330                 :             : /* Rotate loop whose back edge is BACK_EDGE in the tail of trace TRACE
     331                 :             :    (with sequential number TRACE_N).  */
     332                 :             : 
     333                 :             : static basic_block
     334                 :      140320 : rotate_loop (edge back_edge, struct trace *trace, int trace_n)
     335                 :             : {
     336                 :      140320 :   basic_block bb;
     337                 :             : 
     338                 :             :   /* Information about the best end (end after rotation) of the loop.  */
     339                 :      140320 :   basic_block best_bb = NULL;
     340                 :      140320 :   edge best_edge = NULL;
     341                 :      140320 :   profile_count best_count = profile_count::uninitialized ();
     342                 :             :   /* The best edge is preferred when its destination is not visited yet
     343                 :             :      or is a start block of some trace.  */
     344                 :      140320 :   bool is_preferred = false;
     345                 :             : 
     346                 :             :   /* Find the most frequent edge that goes out from current trace.  */
     347                 :      140320 :   bb = back_edge->dest;
     348                 :      563617 :   do
     349                 :             :     {
     350                 :      563617 :       edge e;
     351                 :      563617 :       edge_iterator ei;
     352                 :             : 
     353                 :     1582513 :       FOR_EACH_EDGE (e, ei, bb->succs)
     354                 :     1018896 :         if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
     355                 :     1018896 :             && bb_visited_trace (e->dest) != trace_n
     356                 :             :             && (e->flags & EDGE_CAN_FALLTHRU)
     357                 :     1422254 :             && !(e->flags & EDGE_COMPLEX))
     358                 :             :         {
     359                 :      363953 :           if (is_preferred)
     360                 :             :             {
     361                 :             :               /* The best edge is preferred.  */
     362                 :      221583 :               if (!bb_visited_trace (e->dest)
     363                 :      221583 :                   || bbd[e->dest->index].start_of_trace >= 0)
     364                 :             :                 {
     365                 :             :                   /* The current edge E is also preferred.  */
     366                 :      218119 :                   if (e->count () > best_count)
     367                 :             :                     {
     368                 :       53669 :                       best_count = e->count ();
     369                 :       53669 :                       best_edge = e;
     370                 :       53669 :                       best_bb = bb;
     371                 :             :                     }
     372                 :             :                 }
     373                 :             :             }
     374                 :             :           else
     375                 :             :             {
     376                 :      142370 :               if (!bb_visited_trace (e->dest)
     377                 :      142370 :                   || bbd[e->dest->index].start_of_trace >= 0)
     378                 :             :                 {
     379                 :             :                   /* The current edge E is preferred.  */
     380                 :      139156 :                   is_preferred = true;
     381                 :      139156 :                   best_count = e->count ();
     382                 :      139156 :                   best_edge = e;
     383                 :      139156 :                   best_bb = bb;
     384                 :             :                 }
     385                 :             :               else
     386                 :             :                 {
     387                 :        3214 :                   if (!best_edge || e->count () > best_count)
     388                 :             :                     {
     389                 :        2571 :                       best_count = e->count ();
     390                 :        2571 :                       best_edge = e;
     391                 :        2571 :                       best_bb = bb;
     392                 :             :                     }
     393                 :             :                 }
     394                 :             :             }
     395                 :             :         }
     396                 :      563617 :       bb = (basic_block) bb->aux;
     397                 :             :     }
     398                 :      563617 :   while (bb != back_edge->dest);
     399                 :             : 
     400                 :      140320 :   if (best_bb)
     401                 :             :     {
     402                 :             :       /* Rotate the loop so that the BEST_EDGE goes out from the last block of
     403                 :             :          the trace.  */
     404                 :      140248 :       if (back_edge->dest == trace->first)
     405                 :             :         {
     406                 :        5117 :           trace->first = (basic_block) best_bb->aux;
     407                 :             :         }
     408                 :             :       else
     409                 :             :         {
     410                 :             :           basic_block prev_bb;
     411                 :             : 
     412                 :             :           for (prev_bb = trace->first;
     413                 :      430464 :                prev_bb->aux != back_edge->dest;
     414                 :             :                prev_bb = (basic_block) prev_bb->aux)
     415                 :             :             ;
     416                 :      135131 :           prev_bb->aux = best_bb->aux;
     417                 :             : 
     418                 :             :           /* Try to get rid of uncond jump to cond jump.  */
     419                 :      135131 :           if (single_succ_p (prev_bb))
     420                 :             :             {
     421                 :      109842 :               basic_block header = single_succ (prev_bb);
     422                 :             : 
     423                 :             :               /* Duplicate HEADER if it is a small block containing cond jump
     424                 :             :                  in the end.  */
     425                 :      212826 :               if (any_condjump_p (BB_END (header)) && copy_bb_p (header, 0)
     426                 :      109842 :                   && !CROSSING_JUMP_P (BB_END (header)))
     427                 :           0 :                 copy_bb (header, single_succ_edge (prev_bb), prev_bb, trace_n);
     428                 :             :             }
     429                 :             :         }
     430                 :             :     }
     431                 :             :   else
     432                 :             :     {
     433                 :             :       /* We have not found suitable loop tail so do no rotation.  */
     434                 :          72 :       best_bb = back_edge->src;
     435                 :             :     }
     436                 :      140320 :   best_bb->aux = NULL;
     437                 :      140320 :   return best_bb;
     438                 :             : }
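
As a toy illustration of the rotation above (not GCC code): given the loop blocks in their current trace order and the source block of the most frequent exit edge, the loop is rotated so that this block ends up last.

#include <algorithm>
#include <cassert>
#include <vector>

/* Rotate the loop blocks so that BEST_BB, the source of the most frequent
   edge leaving the loop, becomes the last block of the loop part.  */
std::vector<int>
rotate_loop_sketch (std::vector<int> loop_blocks, int best_bb)
{
  auto it = std::find (loop_blocks.begin (), loop_blocks.end (), best_bb);
  assert (it != loop_blocks.end ());
  /* Make the block *after* BEST_BB the new first block of the loop part.  */
  std::rotate (loop_blocks.begin (), it + 1, loop_blocks.end ());
  return loop_blocks;
}

int main ()
{
  /* Loop 4 -> 5 -> 6 -> 4 laid out as {4, 5, 6}; the hottest exit leaves
     from block 5, so the rotated order is {6, 4, 5}.  */
  std::vector<int> rotated = rotate_loop_sketch ({4, 5, 6}, 5);
  assert ((rotated == std::vector<int>{6, 4, 5}));
  return 0;
}
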
     439                 :             : 
      440                 :             : /* One round of finding traces.  Find traces for BRANCH_TH and COUNT_TH, i.e.
      441                 :             :    do not include into traces basic blocks whose probability of being reached
      442                 :             :    is lower than BRANCH_TH or whose count is lower than
      443                 :             :    COUNT_TH.  Store the new traces into TRACES and modify the number of
     444                 :             :    traces *N_TRACES.  Set the round (which the trace belongs to) to ROUND.
     445                 :             :    The function expects starting basic blocks to be in *HEAP and will delete
     446                 :             :    *HEAP and store starting points for the next round into new *HEAP.  */
     447                 :             : 
     448                 :             : static void
     449                 :     2281372 : find_traces_1_round (int branch_th, profile_count count_th,
     450                 :             :                      struct trace *traces, int *n_traces, int round,
     451                 :             :                      bb_heap_t **heap, int number_of_rounds)
     452                 :             : {
     453                 :             :   /* Heap for discarded basic blocks which are possible starting points for
     454                 :             :      the next round.  */
     455                 :     2281372 :   bb_heap_t *new_heap = new bb_heap_t (LONG_MIN);
     456                 :     2281372 :   bool for_size = optimize_function_for_size_p (cfun);
     457                 :             : 
     458                 :     7658834 :   while (!(*heap)->empty ())
     459                 :             :     {
     460                 :     5377462 :       basic_block bb;
     461                 :     5377462 :       struct trace *trace;
     462                 :     5377462 :       edge best_edge, e;
     463                 :     5377462 :       long key;
     464                 :     5377462 :       edge_iterator ei;
     465                 :             : 
     466                 :     5377462 :       bb = (*heap)->extract_min ();
     467                 :     5377462 :       bbd[bb->index].heap = NULL;
     468                 :     5377462 :       bbd[bb->index].node = NULL;
     469                 :             : 
     470                 :     5377462 :       if (dump_file)
     471                 :         102 :         fprintf (dump_file, "Getting bb %d\n", bb->index);
     472                 :             : 
     473                 :             :       /* If the BB's count is too low, send BB to the next round.  When
     474                 :             :          partitioning hot/cold blocks into separate sections, make sure all
     475                 :             :          the cold blocks (and ONLY the cold blocks) go into the (extra) final
     476                 :             :          round.  When optimizing for size, do not push to next round.  */
     477                 :             : 
     478                 :     5377462 :       if (!for_size
     479                 :     5377462 :           && push_to_next_round_p (bb, round, number_of_rounds,
     480                 :             :                                    count_th))
     481                 :             :         {
     482                 :     1446749 :           int key = bb_to_key (bb);
     483                 :     1446749 :           bbd[bb->index].heap = new_heap;
     484                 :     1446749 :           bbd[bb->index].node = new_heap->insert (key, bb);
     485                 :             : 
     486                 :     1446749 :           if (dump_file)
     487                 :          14 :             fprintf (dump_file,
     488                 :             :                      "  Possible start point of next round: %d (key: %d)\n",
     489                 :             :                      bb->index, key);
     490                 :     1446749 :           continue;
     491                 :     1446749 :         }
     492                 :             : 
     493                 :     3930713 :       trace = traces + *n_traces;
     494                 :     3930713 :       trace->first = bb;
     495                 :     3930713 :       trace->round = round;
     496                 :     3930713 :       trace->length = 0;
     497                 :     3930713 :       bbd[bb->index].in_trace = *n_traces;
     498                 :     3930713 :       (*n_traces)++;
     499                 :             : 
     500                 :     8691339 :       do
     501                 :             :         {
     502                 :     8691339 :           bool ends_in_call;
     503                 :             : 
     504                 :             :           /* The probability and count of the best edge.  */
     505                 :     8691339 :           profile_probability best_prob = profile_probability::uninitialized ();
     506                 :     8691339 :           profile_count best_count = profile_count::uninitialized ();
     507                 :             : 
     508                 :     8691339 :           best_edge = NULL;
     509                 :     8691339 :           mark_bb_visited (bb, *n_traces);
     510                 :     8691339 :           trace->length++;
     511                 :             : 
     512                 :     8691339 :           if (dump_file)
     513                 :         184 :             fprintf (dump_file, "Basic block %d was visited in trace %d\n",
     514                 :             :                      bb->index, *n_traces);
     515                 :             : 
     516                 :     8691339 :           ends_in_call = block_ends_with_call_p (bb);
     517                 :             : 
     518                 :             :           /* Select the successor that will be placed after BB.  */
     519                 :    21781552 :           FOR_EACH_EDGE (e, ei, bb->succs)
     520                 :             :             {
     521                 :    13090213 :               gcc_assert (!(e->flags & EDGE_FAKE));
     522                 :             : 
     523                 :    13090213 :               if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
     524                 :     7922062 :                 continue;
     525                 :             : 
     526                 :    12443127 :               if (bb_visited_trace (e->dest)
     527                 :    12443127 :                   && bb_visited_trace (e->dest) != *n_traces)
     528                 :     2319661 :                 continue;
     529                 :             : 
     530                 :             :               /* If partitioning hot/cold basic blocks, don't consider edges
     531                 :             :                  that cross section boundaries.  */
     532                 :    10123466 :               if (BB_PARTITION (e->dest) != BB_PARTITION (bb))
     533                 :      383471 :                 continue;
     534                 :             : 
     535                 :     9739995 :               profile_probability prob = e->probability;
     536                 :     9739995 :               profile_count count = e->dest->count;
     537                 :             : 
     538                 :             :               /* The only sensible preference for a call instruction is the
     539                 :             :                  fallthru edge.  Don't bother selecting anything else.  */
     540                 :     9739995 :               if (ends_in_call)
     541                 :             :                 {
     542                 :      938523 :                   if (e->flags & EDGE_CAN_FALLTHRU)
     543                 :             :                     {
     544                 :      490723 :                       best_edge = e;
     545                 :      490723 :                       best_prob = prob;
     546                 :      490723 :                       best_count = count;
     547                 :             :                     }
     548                 :      938523 :                   continue;
     549                 :             :                 }
     550                 :             : 
     551                 :             :               /* Edge that cannot be fallthru or improbable or infrequent
     552                 :             :                  successor (i.e. it is unsuitable successor).  When optimizing
     553                 :             :                  for size, ignore the probability and count.  */
     554                 :    11787707 :               if (!(e->flags & EDGE_CAN_FALLTHRU) || (e->flags & EDGE_COMPLEX)
     555                 :     8588304 :                   || !prob.initialized_p ()
     556                 :    17388456 :                   || ((prob.to_reg_br_prob_base () < branch_th
     557                 :     8586984 :                       || e->count () < count_th) && (!for_size)))
     558                 :     2986235 :                 continue;
     559                 :             : 
     560                 :     5815237 :               if (better_edge_p (bb, e, prob, count, best_prob, best_count,
     561                 :             :                                  best_edge))
     562                 :             :                 {
     563                 :     5330640 :                   best_edge = e;
     564                 :     5330640 :                   best_prob = prob;
     565                 :     5330640 :                   best_count = count;
     566                 :             :                 }
     567                 :             :             }
     568                 :             : 
     569                 :             :           /* If the best destination has multiple predecessors and can be
     570                 :             :              duplicated cheaper than a jump, don't allow it to be added to
     571                 :             :              a trace; we'll duplicate it when connecting the traces later.
     572                 :             :              However, we need to check that this duplication wouldn't leave
     573                 :             :              the best destination with only crossing predecessors, because
     574                 :             :              this would change its effective partition from hot to cold.  */
     575                 :     8691339 :           if (best_edge
     576                 :     5366704 :               && EDGE_COUNT (best_edge->dest->preds) >= 2
     577                 :    10982497 :               && copy_bb_p (best_edge->dest, 0))
     578                 :             :             {
     579                 :       62095 :               bool only_crossing_preds = true;
     580                 :       62095 :               edge e;
     581                 :       62095 :               edge_iterator ei;
     582                 :       87236 :               FOR_EACH_EDGE (e, ei, best_edge->dest->preds)
     583                 :       87201 :                 if (e != best_edge && !(e->flags & EDGE_CROSSING))
     584                 :             :                   {
     585                 :             :                     only_crossing_preds = false;
     586                 :             :                     break;
     587                 :             :                   }
     588                 :       62095 :               if (!only_crossing_preds)
     589                 :       62060 :                 best_edge = NULL;
     590                 :             :             }
     591                 :             : 
     592                 :             :           /* If the best destination has multiple successors or predecessors,
     593                 :             :              don't allow it to be added when optimizing for size.  This makes
     594                 :             :              sure predecessors with smaller index are handled before the best
     595                 :             :              destination.  It breaks long trace and reduces long jumps.
     596                 :             : 
     597                 :             :              Take if-then-else as an example.
     598                 :             :                 A
     599                 :             :                / \
     600                 :             :               B   C
     601                 :             :                \ /
     602                 :             :                 D
     603                 :             :              If we do not remove the best edge B->D/C->D, the final order might
     604                 :             :              be A B D ... C.  C is at the end of the program.  If D's successors
     605                 :             :              and D are complicated, might need long jumps for A->C and C->D.
     606                 :             :              Similar issue for order: A C D ... B.
     607                 :             : 
     608                 :             :              After removing the best edge, the final result will be ABCD/ ACBD.
     609                 :             :              It does not add jump compared with the previous order.  But it
     610                 :             :              reduces the possibility of long jumps.  */
     611                 :     8691339 :           if (best_edge && for_size
     612                 :     8691339 :               && (EDGE_COUNT (best_edge->dest->succs) > 1
     613                 :      123725 :                  || EDGE_COUNT (best_edge->dest->preds) > 1))
     614                 :             :             best_edge = NULL;
     615                 :             : 
     616                 :             :           /* Add all non-selected successors to the heaps.  */
     617                 :    21781552 :           FOR_EACH_EDGE (e, ei, bb->succs)
     618                 :             :             {
     619                 :    21213931 :               if (e == best_edge
     620                 :     7965679 :                   || e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
     621                 :    20408806 :                   || bb_visited_trace (e->dest))
     622                 :     8123718 :                 continue;
     623                 :             : 
     624                 :     4966495 :               key = bb_to_key (e->dest);
     625                 :             : 
     626                 :     4966495 :               if (bbd[e->dest->index].heap)
     627                 :             :                 {
     628                 :             :                   /* E->DEST is already in some heap.  */
     629                 :     1249620 :                   if (key != bbd[e->dest->index].node->get_key ())
     630                 :             :                     {
     631                 :           0 :                       if (dump_file)
     632                 :             :                         {
     633                 :           0 :                           fprintf (dump_file,
     634                 :             :                                    "Changing key for bb %d from %ld to %ld.\n",
     635                 :             :                                    e->dest->index,
     636                 :           0 :                                    (long) bbd[e->dest->index].node->get_key (),
     637                 :             :                                    key);
     638                 :             :                         }
     639                 :           0 :                       bbd[e->dest->index].heap->replace_key
     640                 :           0 :                         (bbd[e->dest->index].node, key);
     641                 :             :                     }
     642                 :             :                 }
     643                 :             :               else
     644                 :             :                 {
     645                 :     3716875 :                   bb_heap_t *which_heap = *heap;
     646                 :             : 
     647                 :     3716875 :                   profile_probability prob = e->probability;
     648                 :             : 
     649                 :     3716875 :                   if (!(e->flags & EDGE_CAN_FALLTHRU)
     650                 :     3716875 :                       || (e->flags & EDGE_COMPLEX)
     651                 :     3395411 :                       || !prob.initialized_p ()
     652                 :     3394228 :                       || prob.to_reg_br_prob_base () < branch_th
     653                 :     5485178 :                       || e->count () < count_th)
     654                 :             :                     {
     655                 :             :                       /* When partitioning hot/cold basic blocks, make sure
     656                 :             :                          the cold blocks (and only the cold blocks) all get
     657                 :             :                          pushed to the last round of trace collection.  When
     658                 :             :                          optimizing for size, do not push to next round.  */
     659                 :             : 
     660                 :     2785219 :                       if (!for_size && push_to_next_round_p (e->dest, round,
     661                 :             :                                                              number_of_rounds,
     662                 :             :                                                              count_th))
     663                 :             :                         which_heap = new_heap;
     664                 :             :                     }
     665                 :             : 
     666                 :     3716875 :                   bbd[e->dest->index].heap = which_heap;
     667                 :     3716875 :                   bbd[e->dest->index].node = which_heap->insert (key, e->dest);
     668                 :             : 
     669                 :     3716875 :                   if (dump_file)
     670                 :             :                     {
     671                 :          83 :                       fprintf (dump_file,
     672                 :             :                                "  Possible start of %s round: %d (key: %ld)\n",
     673                 :             :                                (which_heap == new_heap) ? "next" : "this",
     674                 :          83 :                                e->dest->index, (long) key);
     675                 :             :                     }
     676                 :             : 
     677                 :             :                 }
     678                 :             :             }
     679                 :             : 
     680                 :     8691339 :           if (best_edge) /* Suitable successor was found.  */
     681                 :             :             {
     682                 :     5124534 :               if (bb_visited_trace (best_edge->dest) == *n_traces)
     683                 :             :                 {
     684                 :             :                   /* We do nothing with one basic block loops.  */
     685                 :      363908 :                   if (best_edge->dest != bb)
     686                 :             :                     {
     687                 :      515724 :                       if (best_edge->count ()
     688                 :      171908 :                           > best_edge->dest->count.apply_scale (4, 5))
     689                 :             :                         {
     690                 :             :                           /* The loop has at least 4 iterations.  If the loop
     691                 :             :                              header is not the first block of the function
     692                 :             :                              we can rotate the loop.  */
     693                 :             : 
     694                 :      140384 :                           if (best_edge->dest
     695                 :      140384 :                               != ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb)
     696                 :             :                             {
     697                 :      140320 :                               if (dump_file)
     698                 :             :                                 {
     699                 :           0 :                                   fprintf (dump_file,
     700                 :             :                                            "Rotating loop %d - %d\n",
     701                 :             :                                            best_edge->dest->index, bb->index);
     702                 :             :                                 }
     703                 :      140320 :                               bb->aux = best_edge->dest;
     704                 :      140320 :                               bbd[best_edge->dest->index].in_trace =
     705                 :      140320 :                                                              (*n_traces) - 1;
     706                 :      140320 :                               bb = rotate_loop (best_edge, trace, *n_traces);
     707                 :             :                             }
     708                 :             :                         }
     709                 :             :                       else
     710                 :             :                         {
     711                 :             :                           /* The loop has less than 4 iterations.  */
     712                 :             : 
     713                 :       31524 :                           if (single_succ_p (bb)
     714                 :       42650 :                               && copy_bb_p (best_edge->dest,
     715                 :             :                                             optimize_edge_for_speed_p
     716                 :       11126 :                                             (best_edge)))
     717                 :             :                             {
     718                 :        6267 :                               bb = copy_bb (best_edge->dest, best_edge, bb,
     719                 :             :                                             *n_traces);
     720                 :        6267 :                               trace->length++;
     721                 :             :                             }
     722                 :             :                         }
     723                 :             :                     }
     724                 :             : 
     725                 :             :                   /* Terminate the trace.  */
     726                 :      363908 :                   break;
     727                 :             :                 }
     728                 :             :               else
     729                 :             :                 {
     730                 :             :                   /* Check for a situation
     731                 :             : 
     732                 :             :                     A
     733                 :             :                    /|
     734                 :             :                   B |
     735                 :             :                    \|
     736                 :             :                     C
     737                 :             : 
     738                 :             :                   where
     739                 :             :                   AB->count () + BC->count () >= AC->count ().
     740                 :             :                   (i.e. 2 * B->count >= AC->count)
     741                 :             :                   Best ordering is then A B C.
     742                 :             : 
     743                 :             :                   When optimizing for size, A B C is always the best order.
     744                 :             : 
     745                 :             :                   This situation is created for example by:
     746                 :             : 
     747                 :             :                   if (A) B;
     748                 :             :                   C;
     749                 :             : 
     750                 :             :                   */
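                  /* Editor's note (not in the original bb-reorder.cc): AB->count () and
                     BC->count () are both equal to B's count here, because B is reached
                     only through A and leaves only to C; that is how the condition
                     collapses to 2 * B->count >= AC->count.  Laying out A B C makes only
                     the A->C edge a taken jump (cost AC->count), whereas laying out A C
                     and placing B elsewhere makes both A->B and B->C taken jumps (cost
                     AB->count + BC->count).  For example, with A executed 100 times and
                     B 60 times, A B C costs 40 taken jumps against 120.  */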
     751                 :             : 
     752                 :    13129358 :                   FOR_EACH_EDGE (e, ei, bb->succs)
     753                 :     8383114 :                     if (e != best_edge
     754                 :             :                         && (e->flags & EDGE_CAN_FALLTHRU)
     755                 :     3634249 :                         && !(e->flags & EDGE_COMPLEX)
     756                 :     3169917 :                         && !bb_visited_trace (e->dest)
     757                 :     2828655 :                         && single_pred_p (e->dest)
     758                 :     1496236 :                         && !(e->flags & EDGE_CROSSING)
     759                 :     9275038 :                         && single_succ_p (e->dest)
     760                 :      906306 :                         && (single_succ_edge (e->dest)->flags
     761                 :      906306 :                             & EDGE_CAN_FALLTHRU)
     762                 :      808162 :                         && !(single_succ_edge (e->dest)->flags & EDGE_COMPLEX)
     763                 :      808162 :                         && single_succ (e->dest) == best_edge->dest
     764                 :     8680965 :                         && (e->dest->count * 2
     765                 :     8680965 :                             >= best_edge->count () || for_size))
     766                 :             :                       {
     767                 :       14382 :                         best_edge = e;
     768                 :       14382 :                         if (dump_file)
     769                 :           0 :                           fprintf (dump_file, "Selecting BB %d\n",
     770                 :           0 :                                    best_edge->dest->index);
     771                 :             :                         break;
     772                 :             :                       }
     773                 :             : 
     774                 :     4760626 :                   bb->aux = best_edge->dest;
     775                 :     4760626 :                   bbd[best_edge->dest->index].in_trace = (*n_traces) - 1;
     776                 :     4760626 :                   bb = best_edge->dest;
     777                 :             :                 }
     778                 :             :             }
     779                 :             :         }
     780                 :     8327431 :       while (best_edge);
     781                 :     3930713 :       trace->last = bb;
     782                 :     3930713 :       bbd[trace->first->index].start_of_trace = *n_traces - 1;
     783                 :     3930713 :       if (bbd[trace->last->index].end_of_trace != *n_traces - 1)
     784                 :             :         {
     785                 :     3930713 :           bbd[trace->last->index].end_of_trace = *n_traces - 1;
     786                 :             :           /* Update the cached maximum frequency for interesting predecessor
     787                 :             :              edges for successors of the new trace end.  */
     788                 :     8661849 :           FOR_EACH_EDGE (e, ei, trace->last->succs)
     789                 :     4731136 :             if (EDGE_FREQUENCY (e) > bbd[e->dest->index].priority)
     790                 :     3624211 :               bbd[e->dest->index].priority = EDGE_FREQUENCY (e);
     791                 :             :         }
     792                 :             : 
     793                 :             :       /* The trace is terminated, so we have to recount the keys in the heap
     794                 :             :          (some blocks can have lower keys because one of their predecessors
     795                 :             :          is now an end of a trace).  */
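      /* Editor's note (not in the original bb-reorder.cc): recording the end of the
         trace here is what feeds the priority term in bb_to_key below -- successors
         whose predecessor now ends a trace get a strongly negative key, so they move
         towards the top of the heap and are preferred as seeds for later traces.  */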
     796                 :     8661849 :       FOR_EACH_EDGE (e, ei, bb->succs)
     797                 :             :         {
     798                 :     7753795 :           if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
     799                 :     4731136 :               || bb_visited_trace (e->dest))
     800                 :     3022659 :             continue;
     801                 :             : 
     802                 :     1708477 :           if (bbd[e->dest->index].heap)
     803                 :             :             {
     804                 :     1708477 :               key = bb_to_key (e->dest);
     805                 :     1708477 :               if (key != bbd[e->dest->index].node->get_key ())
     806                 :             :                 {
     807                 :     1109682 :                   if (dump_file)
     808                 :             :                     {
     809                 :          41 :                       fprintf (dump_file,
     810                 :             :                                "Changing key for bb %d from %ld to %ld.\n",
     811                 :             :                                e->dest->index,
     812                 :          41 :                                (long) bbd[e->dest->index].node->get_key (), key);
     813                 :             :                     }
     814                 :     1109682 :                   bbd[e->dest->index].heap->replace_key
     815                 :     1109682 :                     (bbd[e->dest->index].node, key);
     816                 :             :                 }
     817                 :             :             }
     818                 :             :         }
     819                 :             :     }
     820                 :             : 
     821                 :     2281372 :   delete (*heap);
     822                 :             : 
     823                 :             :   /* "Return" the new heap.  */
     824                 :     2281372 :   *heap = new_heap;
     825                 :     2281372 : }
     826                 :             : 
     827                 :             : /* Create a duplicate of the basic block OLD_BB and redirect edge E to it, add
     828                 :             :    it to the trace after BB, mark OLD_BB visited and update the pass's data
     829                 :             :    structures (TRACE is the number of the trace to which OLD_BB is duplicated).  */
     830                 :             : 
     831                 :             : static basic_block
     832                 :      251498 : copy_bb (basic_block old_bb, edge e, basic_block bb, int trace)
     833                 :             : {
     834                 :      251498 :   basic_block new_bb;
     835                 :             : 
     836                 :      251498 :   new_bb = duplicate_block (old_bb, e, bb);
     837                 :      251498 :   BB_COPY_PARTITION (new_bb, old_bb);
     838                 :             : 
     839                 :      251498 :   gcc_assert (e->dest == new_bb);
     840                 :             : 
     841                 :      251498 :   if (dump_file)
     842                 :           0 :     fprintf (dump_file,
     843                 :             :              "Duplicated bb %d (created bb %d)\n",
     844                 :             :              old_bb->index, new_bb->index);
     845                 :             : 
     846                 :      251498 :   if (new_bb->index >= array_size
     847                 :      251159 :       || last_basic_block_for_fn (cfun) > array_size)
     848                 :             :     {
     849                 :         339 :       int i;
     850                 :         339 :       int new_size;
     851                 :             : 
     852                 :         339 :       new_size = MAX (last_basic_block_for_fn (cfun), new_bb->index + 1);
     853                 :         339 :       new_size = GET_ARRAY_SIZE (new_size);
     854                 :         339 :       bbd = XRESIZEVEC (bbro_basic_block_data, bbd, new_size);
     855                 :       31304 :       for (i = array_size; i < new_size; i++)
     856                 :             :         {
     857                 :       30965 :           bbd[i].start_of_trace = -1;
     858                 :       30965 :           bbd[i].end_of_trace = -1;
     859                 :       30965 :           bbd[i].in_trace = -1;
     860                 :       30965 :           bbd[i].visited = 0;
     861                 :       30965 :           bbd[i].priority = -1;
     862                 :       30965 :           bbd[i].heap = NULL;
     863                 :       30965 :           bbd[i].node = NULL;
     864                 :             :         }
     865                 :         339 :       array_size = new_size;
     866                 :             : 
     867                 :         339 :       if (dump_file)
     868                 :             :         {
     869                 :           0 :           fprintf (dump_file,
     870                 :             :                    "Growing the dynamic array to %d elements.\n",
     871                 :             :                    array_size);
     872                 :             :         }
     873                 :             :     }
     874                 :             : 
     875                 :      251498 :   gcc_assert (!bb_visited_trace (e->dest));
     876                 :      251498 :   mark_bb_visited (new_bb, trace);
     877                 :      251498 :   new_bb->aux = bb->aux;
     878                 :      251498 :   bb->aux = new_bb;
     879                 :             : 
     880                 :      251498 :   bbd[new_bb->index].in_trace = trace;
     881                 :             : 
     882                 :      251498 :   return new_bb;
     883                 :             : }
     884                 :             : 
     885                 :             : /* Compute and return the key (for the heap) of the basic block BB.  */
     886                 :             : 
     887                 :             : static long
     888                 :     8692064 : bb_to_key (basic_block bb)
     889                 :             : {
     890                 :     8692064 :   edge e;
     891                 :     8692064 :   edge_iterator ei;
     892                 :             : 
     893                 :             :   /* Use index as key to align with its original order.  */
     894                 :     8692064 :   if (optimize_function_for_size_p (cfun))
     895                 :      554566 :     return bb->index;
     896                 :             : 
     897                 :             :   /* Do not start in probably never executed blocks.  */
     898                 :             : 
     899                 :     8137498 :   if (BB_PARTITION (bb) == BB_COLD_PARTITION
     900                 :     8137498 :       || probably_never_executed_bb_p (cfun, bb))
     901                 :     1777926 :     return BB_FREQ_MAX;
     902                 :             : 
     903                 :             :   /* Prefer blocks whose predecessor is an end of some trace
     904                 :             :      or whose predecessor edge is EDGE_DFS_BACK.  */
     905                 :     6359572 :   int priority = bbd[bb->index].priority;
     906                 :     6359572 :   if (priority == -1)
     907                 :             :     {
     908                 :     3603661 :       priority = 0;
     909                 :     8768478 :       FOR_EACH_EDGE (e, ei, bb->preds)
     910                 :             :         {
     911                 :     5164817 :           if ((e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
     912                 :     4627789 :                && bbd[e->src->index].end_of_trace >= 0)
     913                 :     5164817 :               || (e->flags & EDGE_DFS_BACK))
     914                 :             :             {
     915                 :       37583 :               int edge_freq = EDGE_FREQUENCY (e);
     916                 :             : 
     917                 :       37583 :               if (edge_freq > priority)
     918                 :     5164817 :                 priority = edge_freq;
     919                 :             :             }
     920                 :             :         }
     921                 :     3603661 :       bbd[bb->index].priority = priority;
     922                 :             :     }
     923                 :             : 
     924                 :     6359572 :   if (priority)
     925                 :             :     /* The block with priority should have significantly lower key.  */
     926                 :     1344530 :     return -(100 * BB_FREQ_MAX + 100 * priority + bb->count.to_frequency (cfun));
     927                 :             : 
     928                 :     5015042 :   return -bb->count.to_frequency (cfun);
     929                 :             : }
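/* Editor's sketch (not part of bb-reorder.cc): the key formula above in isolation,
   using plain integers and a hypothetical SKETCH_BB_FREQ_MAX standing in for GCC's
   BB_FREQ_MAX.  It only illustrates the ordering argument: a block's frequency
   never exceeds BB_FREQ_MAX, so the -100 * BB_FREQ_MAX term alone guarantees that
   any block with a nonzero priority sorts before every block without one.  */

#include <stdio.h>

#define SKETCH_BB_FREQ_MAX 10000

static long
sketch_bb_to_key (int priority, int freq)
{
  if (priority)
    /* Priority blocks get a significantly lower (more negative) key.  */
    return -(100L * SKETCH_BB_FREQ_MAX + 100L * priority + freq);

  return -(long) freq;
}

int
main (void)
{
  /* Even a modest priority beats the hottest possible ordinary block.  */
  printf ("priority block: %ld, hottest plain block: %ld\n",
          sketch_bb_to_key (5, 42), sketch_bb_to_key (0, SKETCH_BB_FREQ_MAX));
  return 0;
}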
     930                 :             : 
     931                 :             : /* Return true when the edge E from basic block BB is better than the temporary
     932                 :             :    best edge (details are in the function body).  The probability of edge E is PROB.  The
     933                 :             :    count of the successor is COUNT.  The current best probability is
     934                 :             :    BEST_PROB, the best count is BEST_COUNT.
     935                 :             :    The edge is considered to be equivalent when PROB does not differ much from
     936                 :             :    BEST_PROB; similarly for count.  */
     937                 :             : 
     938                 :             : static bool
     939                 :     5815237 : better_edge_p (const_basic_block bb, const_edge e, profile_probability prob,
     940                 :             :                profile_count count, profile_probability best_prob,
     941                 :             :                profile_count best_count, const_edge cur_best_edge)
     942                 :             : {
     943                 :     5815237 :   bool is_better_edge;
     944                 :             : 
     945                 :             :   /* The BEST_* values do not have to be best, but can be a bit smaller than
     946                 :             :      maximum values.  */
     947                 :     5815237 :   profile_probability diff_prob = best_prob / 10;
     948                 :             : 
     949                 :             :   /* The smaller one is better to keep the original order.  */
     950                 :     5815237 :   if (optimize_function_for_size_p (cfun))
     951                 :      362133 :     return !cur_best_edge
     952                 :      678457 :            || cur_best_edge->dest->index > e->dest->index;
     953                 :             : 
     954                 :             :   /* Those edges are so expensive that continuing a trace is not useful
     955                 :             :      performance wise.  */
     956                 :     5453104 :   if (e->flags & (EDGE_ABNORMAL | EDGE_EH))
     957                 :             :     return false;
     958                 :             : 
     959                 :    10611411 :   if (prob > best_prob + diff_prob
     960                 :     5158307 :       || (!best_prob.initialized_p ()
     961                 :     9550226 :           && prob > profile_probability::guessed_never ()))
     962                 :             :     /* The edge has higher probability than the temporary best edge.  */
     963                 :             :     is_better_edge = true;
     964                 :      811903 :   else if (prob < best_prob - diff_prob)
     965                 :             :     /* The edge has lower probability than the temporary best edge.  */
     966                 :             :     is_better_edge = false;
     967                 :             :   else
     968                 :             :     {
     969                 :      254219 :       profile_count diff_count = best_count / 10;
     970                 :      254219 :       if (count < best_count - diff_count
     971                 :      254219 :           || (!best_count.initialized_p ()
     972                 :       64481 :               && count.nonzero_p ()))
     973                 :             :         /* The edge and the temporary best edge have almost equivalent
     974                 :             :            probabilities.  A higher count of the successor would mean
     975                 :             :            that there is another edge going into that successor.
     976                 :             :            This successor has a lower count, so it is better.  */
     977                 :             :         is_better_edge = true;
     978                 :      189738 :       else if (count > best_count + diff_count)
     979                 :             :         /* This successor has a higher count, so it is worse.  */
     980                 :             :         is_better_edge = false;
     981                 :       98517 :       else if (e->dest->prev_bb == bb)
     982                 :             :         /* The edges have equivalent probabilities and the successors
     983                 :             :            have equivalent frequencies.  Select the previous successor.  */
     984                 :             :         is_better_edge = true;
     985                 :             :       else
     986                 :      254219 :         is_better_edge = false;
     987                 :             :     }
     988                 :             : 
     989                 :             :   return is_better_edge;
     990                 :             : }
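/* Editor's note (not in the original bb-reorder.cc): with DIFF_PROB equal to
   BEST_PROB / 10, a candidate only wins outright when its probability exceeds the
   current best by more than a tenth of the best; anything inside that band is
   treated as a tie and decided first by the successor counts (again with a 10%
   tolerance) and finally by layout adjacency (e->dest->prev_bb == bb).  For
   example, with BEST_PROB at 50%, a 54% edge falls inside the band and the count
   comparison decides.  */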
     991                 :             : 
     992                 :             : /* Return true when the edge E is better than the temporary best edge
     993                 :             :    CUR_BEST_EDGE.  If SRC_INDEX_P is true, the function compares the src bb of
     994                 :             :    E and CUR_BEST_EDGE; otherwise it will compare the dest bb.
     995                 :             :    BEST_LEN is the trace length of src (or dest) bb in CUR_BEST_EDGE.
     996                 :             :    TRACES record the information about traces.
     997                 :             :    When optimizing for size, the edge with smaller index is better.
     998                 :             :    When optimizing for speed, the edge with bigger probability or longer trace
     999                 :             :    is better.  */
    1000                 :             : 
    1001                 :             : static bool
    1002                 :     1604019 : connect_better_edge_p (const_edge e, bool src_index_p, int best_len,
    1003                 :             :                        const_edge cur_best_edge, struct trace *traces)
    1004                 :             : {
    1005                 :     1604019 :   int e_index;
    1006                 :     1604019 :   int b_index;
    1007                 :     1604019 :   bool is_better_edge;
    1008                 :             : 
    1009                 :     1604019 :   if (!cur_best_edge)
    1010                 :             :     return true;
    1011                 :             : 
    1012                 :      406446 :   if (optimize_function_for_size_p (cfun))
    1013                 :             :     {
    1014                 :       43985 :       e_index = src_index_p ? e->src->index : e->dest->index;
    1015                 :       43985 :       b_index = src_index_p ? cur_best_edge->src->index
    1016                 :       41830 :                               : cur_best_edge->dest->index;
    1017                 :             :       /* The smaller one is better to keep the original order.  */
    1018                 :       43985 :       return b_index > e_index;
    1019                 :             :     }
    1020                 :             : 
    1021                 :      362461 :   if (src_index_p)
    1022                 :             :     {
    1023                 :       33765 :       e_index = e->src->index;
    1024                 :             : 
    1025                 :             :       /* We are looking for a predecessor, so probabilities are not that
    1026                 :             :          informative.  We do not want to connect A to B because A has
    1027                 :             :          only one successor (probability is 100%) while there is an edge
    1028                 :             :          A' to B whose probability is 90% but which is much more frequent.  */
    1029                 :       33765 :       if (e->count () > cur_best_edge->count ())
    1030                 :             :         /* The edge has higher probability than the temporary best edge.  */
    1031                 :             :         is_better_edge = true;
    1032                 :       24097 :       else if (e->count () < cur_best_edge->count ())
    1033                 :             :         /* The edge has lower probability than the temporary best edge.  */
    1034                 :             :         is_better_edge = false;
    1035                 :        9696 :       else if (e->probability > cur_best_edge->probability)
    1036                 :             :         /* The edge has higher probability than the temporary best edge.  */
    1037                 :             :         is_better_edge = true;
    1038                 :        9084 :       else if (e->probability < cur_best_edge->probability)
    1039                 :             :         /* The edge has lower probability than the temporary best edge.  */
    1040                 :             :         is_better_edge = false;
    1041                 :        8040 :       else if (traces[bbd[e_index].end_of_trace].length > best_len)
    1042                 :             :         /* The edge and the temporary best edge have equivalent probabilities.
    1043                 :             :            The edge with longer trace is better.  */
    1044                 :             :         is_better_edge = true;
    1045                 :             :       else
    1046                 :             :         is_better_edge = false;
    1047                 :             :     }
    1048                 :             :   else
    1049                 :             :     {
    1050                 :      328696 :       e_index = e->dest->index;
    1051                 :             : 
    1052                 :      328696 :       if (e->probability > cur_best_edge->probability)
    1053                 :             :         /* The edge has higher probability than the temporary best edge.  */
    1054                 :             :         is_better_edge = true;
    1055                 :      207417 :       else if (e->probability < cur_best_edge->probability)
    1056                 :             :         /* The edge has lower probability than the temporary best edge.  */
    1057                 :             :         is_better_edge = false;
    1058                 :       87980 :       else if (traces[bbd[e_index].start_of_trace].length > best_len)
    1059                 :             :         /* The edge and the temporary best edge have equivalent probabilities.
    1060                 :             :            The edge with longer trace is better.  */
    1061                 :             :         is_better_edge = true;
    1062                 :             :       else
    1063                 :             :         is_better_edge = false;
    1064                 :             :     }
    1065                 :             : 
    1066                 :             :   return is_better_edge;
    1067                 :             : }
    1068                 :             : 
    1069                 :             : /* Connect the traces in the array TRACES; N_TRACES is the number of traces.  */
    1070                 :             : 
    1071                 :             : static void
    1072                 :      570343 : connect_traces (int n_traces, struct trace *traces)
    1073                 :             : {
    1074                 :      570343 :   int i;
    1075                 :      570343 :   bool *connected;
    1076                 :      570343 :   bool two_passes;
    1077                 :      570343 :   int last_trace;
    1078                 :      570343 :   int current_pass;
    1079                 :      570343 :   int current_partition;
    1080                 :      570343 :   profile_count count_threshold;
    1081                 :      570343 :   bool for_size = optimize_function_for_size_p (cfun);
    1082                 :             : 
    1083                 :      570343 :   count_threshold = max_entry_count.apply_scale (DUPLICATION_THRESHOLD, 1000);
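  /* Editor's note (not in the original bb-reorder.cc): apply_scale with a
     denominator of 1000 makes COUNT_THRESHOLD a per-mille fraction of
     MAX_ENTRY_COUNT; edges colder than this are treated below as too rare to
     justify duplicating a larger block when connecting traces.  */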
    1084                 :             : 
    1085                 :      570343 :   connected = XCNEWVEC (bool, n_traces);
    1086                 :      570343 :   last_trace = -1;
    1087                 :      570343 :   current_pass = 1;
    1088                 :      570343 :   current_partition = BB_PARTITION (traces[0].first);
    1089                 :      570343 :   two_passes = false;
    1090                 :             : 
    1091                 :      570343 :   if (crtl->has_bb_partition)
    1092                 :      469361 :     for (i = 0; i < n_traces && !two_passes; i++)
    1093                 :      409557 :       if (BB_PARTITION (traces[0].first)
    1094                 :      409557 :           != BB_PARTITION (traces[i].first))
    1095                 :       59802 :         two_passes = true;
    1096                 :             : 
    1097                 :     5051460 :   for (i = 0; i < n_traces || (two_passes && current_pass == 1) ; i++)
    1098                 :             :     {
    1099                 :     4481117 :       int t = i;
    1100                 :     4481117 :       int t2;
    1101                 :     4481117 :       edge e, best;
    1102                 :     4481117 :       int best_len;
    1103                 :             : 
    1104                 :     4481117 :       if (i >= n_traces)
    1105                 :             :         {
    1106                 :       59802 :           gcc_assert (two_passes && current_pass == 1);
    1107                 :       59802 :           i = 0;
    1108                 :       59802 :           t = i;
    1109                 :       59802 :           current_pass = 2;
    1110                 :       59802 :           if (current_partition == BB_HOT_PARTITION)
    1111                 :             :             current_partition = BB_COLD_PARTITION;
    1112                 :             :           else
    1113                 :           0 :             current_partition = BB_HOT_PARTITION;
    1114                 :             :         }
    1115                 :             : 
    1116                 :     4481117 :       if (connected[t])
    1117                 :     1772808 :         continue;
    1118                 :             : 
    1119                 :     2863490 :       if (two_passes
    1120                 :      555281 :           && BB_PARTITION (traces[t].first) != current_partition)
    1121                 :      155181 :         continue;
    1122                 :             : 
    1123                 :     2708309 :       connected[t] = true;
    1124                 :             : 
    1125                 :             :       /* Find the predecessor traces.  */
    1126                 :     2794434 :       for (t2 = t; t2 > 0;)
    1127                 :             :         {
    1128                 :     2224091 :           edge_iterator ei;
    1129                 :     2224091 :           best = NULL;
    1130                 :     2224091 :           best_len = 0;
    1131                 :     5593114 :           FOR_EACH_EDGE (e, ei, traces[t2].first->preds)
    1132                 :             :             {
    1133                 :     3369023 :               int si = e->src->index;
    1134                 :             : 
    1135                 :     3369023 :               if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
    1136                 :             :                   && (e->flags & EDGE_CAN_FALLTHRU)
    1137                 :     3369023 :                   && !(e->flags & EDGE_COMPLEX)
    1138                 :     2645173 :                   && bbd[si].end_of_trace >= 0
    1139                 :      440502 :                   && !connected[bbd[si].end_of_trace]
    1140                 :      122060 :                   && (BB_PARTITION (e->src) == current_partition)
    1141                 :     3491068 :                   && connect_better_edge_p (e, true, best_len, best, traces))
    1142                 :             :                 {
    1143                 :       97247 :                   best = e;
    1144                 :       97247 :                   best_len = traces[bbd[si].end_of_trace].length;
    1145                 :             :                 }
    1146                 :             :             }
    1147                 :     2224091 :           if (best)
    1148                 :             :             {
    1149                 :       86125 :               best->src->aux = best->dest;
    1150                 :       86125 :               t2 = bbd[best->src->index].end_of_trace;
    1151                 :       86125 :               connected[t2] = true;
    1152                 :             : 
    1153                 :       86125 :               if (dump_file)
    1154                 :             :                 {
    1155                 :           0 :                   fprintf (dump_file, "Connection: %d %d\n",
    1156                 :             :                            best->src->index, best->dest->index);
    1157                 :             :                 }
    1158                 :             :             }
    1159                 :             :           else
    1160                 :             :             break;
    1161                 :             :         }
    1162                 :             : 
    1163                 :     2708309 :       if (last_trace >= 0)
    1164                 :     2137966 :         traces[last_trace].last->aux = traces[t2].first;
    1165                 :             :       last_trace = t;
    1166                 :             : 
    1167                 :             :       /* Find the successor traces.  */
    1168                 :     4980867 :       while (1)
    1169                 :             :         {
    1170                 :             :           /* Find the continuation of the chain.  */
    1171                 :     3844588 :           edge_iterator ei;
    1172                 :     3844588 :           best = NULL;
    1173                 :     3844588 :           best_len = 0;
    1174                 :     8455250 :           FOR_EACH_EDGE (e, ei, traces[t].last->succs)
    1175                 :             :             {
    1176                 :     4610662 :               int di = e->dest->index;
    1177                 :             : 
    1178                 :     4610662 :               if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1179                 :             :                   && (e->flags & EDGE_CAN_FALLTHRU)
    1180                 :     3963576 :                   && !(e->flags & EDGE_COMPLEX)
    1181                 :     3675624 :                   && bbd[di].start_of_trace >= 0
    1182                 :     1989230 :                   && !connected[bbd[di].start_of_trace]
    1183                 :     1497502 :                   && (BB_PARTITION (e->dest) == current_partition)
    1184                 :     6092636 :                   && connect_better_edge_p (e, false, best_len, best, traces))
    1185                 :             :                 {
    1186                 :     1290076 :                   best = e;
    1187                 :     1290076 :                   best_len = traces[bbd[di].start_of_trace].length;
    1188                 :             :                 }
    1189                 :             :             }
    1190                 :             : 
    1191                 :     3844588 :           if (for_size)
    1192                 :             :             {
    1193                 :      247095 :               if (!best)
    1194                 :             :                 /* Stop finding the successor traces.  */
    1195                 :             :                 break;
    1196                 :             : 
    1197                 :             :               /* It is OK to connect block n with block n + 1 or a block
    1198                 :             :                  before n.  For others, only connect to the loop header.  */
    1199                 :      176684 :               if (best->dest->index > (traces[t].last->index + 1))
    1200                 :             :                 {
    1201                 :       18932 :                   int count = EDGE_COUNT (best->dest->preds);
    1202                 :             : 
    1203                 :      177659 :                   FOR_EACH_EDGE (e, ei, best->dest->preds)
    1204                 :      158727 :                     if (e->flags & EDGE_DFS_BACK)
    1205                 :        4483 :                       count--;
    1206                 :             : 
    1207                 :             :                   /* If dest has multiple predecessors, skip it.  We expect
    1208                 :             :                      that a predecessor with a smaller index will connect with
    1209                 :             :                      it later.  */
    1210                 :       18932 :                   if (count != 1) 
    1211                 :             :                     break;
    1212                 :             :                 }
    1213                 :             : 
    1214                 :             :               /* Only connect Trace n with Trace n + 1.  It is conservative
    1215                 :             :                  to keep the order as close as possible to the original order.
    1216                 :             :                  It also helps to reduce long jumps.  */
    1217                 :      163643 :               if (last_trace != bbd[best->dest->index].start_of_trace - 1)
    1218                 :             :                 break;
    1219                 :             : 
    1220                 :      161565 :               if (dump_file)
    1221                 :           5 :                 fprintf (dump_file, "Connection: %d %d\n",
    1222                 :           5 :                          best->src->index, best->dest->index);
    1223                 :             : 
    1224                 :      161565 :               t = bbd[best->dest->index].start_of_trace;
    1225                 :      161565 :               traces[last_trace].last->aux = traces[t].first;
    1226                 :      161565 :               connected[t] = true;
    1227                 :      161565 :               last_trace = t;
    1228                 :             :             }
    1229                 :     3597493 :           else if (best)
    1230                 :             :             {
    1231                 :      934764 :               if (dump_file)
    1232                 :             :                 {
    1233                 :          45 :                   fprintf (dump_file, "Connection: %d %d\n",
    1234                 :          45 :                            best->src->index, best->dest->index);
    1235                 :             :                 }
    1236                 :      934764 :               t = bbd[best->dest->index].start_of_trace;
    1237                 :      934764 :               traces[last_trace].last->aux = traces[t].first;
    1238                 :      934764 :               connected[t] = true;
    1239                 :      934764 :               last_trace = t;
    1240                 :             :             }
    1241                 :             :           else
    1242                 :             :             {
    1243                 :             :               /* Try to connect the traces by duplication of 1 block.  */
    1244                 :     2662729 :               edge e2;
    1245                 :     2662729 :               basic_block next_bb = NULL;
    1246                 :     2662729 :               bool try_copy = false;
    1247                 :             : 
    1248                 :     5196108 :               FOR_EACH_EDGE (e, ei, traces[t].last->succs)
    1249                 :     2533379 :                 if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1250                 :             :                     && (e->flags & EDGE_CAN_FALLTHRU)
    1251                 :     1920602 :                     && !(e->flags & EDGE_COMPLEX)
    1252                 :     6092726 :                     && (!best || e->probability > best->probability))
    1253                 :             :                   {
    1254                 :     1624408 :                     edge_iterator ei;
    1255                 :     1624408 :                     edge best2 = NULL;
    1256                 :     1624408 :                     int best2_len = 0;
    1257                 :             : 
    1258                 :             :                     /* If the destination is the start of a trace which is only
    1259                 :             :                        one block long, there is no need to search the successor
    1260                 :             :                        blocks of the trace.  Accept it.  */
    1261                 :     1624408 :                     if (bbd[e->dest->index].start_of_trace >= 0
    1262                 :      366250 :                         && traces[bbd[e->dest->index].start_of_trace].length
    1263                 :             :                            == 1)
    1264                 :             :                       {
    1265                 :      268120 :                         best = e;
    1266                 :      268120 :                         try_copy = true;
    1267                 :      268120 :                         continue;
    1268                 :             :                       }
    1269                 :             : 
    1270                 :     3476895 :                     FOR_EACH_EDGE (e2, ei, e->dest->succs)
    1271                 :             :                       {
    1272                 :     2120607 :                         int di = e2->dest->index;
    1273                 :             : 
    1274                 :     2120607 :                         if (e2->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
    1275                 :     2120607 :                             || ((e2->flags & EDGE_CAN_FALLTHRU)
    1276                 :     1862870 :                                 && !(e2->flags & EDGE_COMPLEX)
    1277                 :     1745713 :                                 && bbd[di].start_of_trace >= 0
    1278                 :      634182 :                                 && !connected[bbd[di].start_of_trace]
    1279                 :      213381 :                                 && BB_PARTITION (e2->dest) == current_partition
    1280                 :      201018 :                                 && e2->count () >= count_threshold
    1281                 :       97034 :                                 && (!best2
    1282                 :         763 :                                     || e2->probability > best2->probability
    1283                 :     1766523 :                                     || (e2->probability == best2->probability
    1284                 :         125 :                                         && traces[bbd[di].start_of_trace].length
    1285                 :             :                                            > best2_len))))
    1286                 :             :                           {
    1287                 :      354209 :                             best = e;
    1288                 :      354209 :                             best2 = e2;
    1289                 :      354209 :                             if (e2->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
    1290                 :       96472 :                               best2_len = traces[bbd[di].start_of_trace].length;
    1291                 :             :                             else
    1292                 :             :                               best2_len = INT_MAX;
    1293                 :             :                             next_bb = e2->dest;
    1294                 :             :                             try_copy = true;
    1295                 :             :                           }
    1296                 :             :                       }
    1297                 :             :                   }
    1298                 :             : 
    1299                 :             :               /* Copy tiny blocks always; copy larger blocks only when the
    1300                 :             :                  edge is traversed frequently enough.  */
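              /* Editor's note (not in the original bb-reorder.cc): the whole expression
                 passed as copy_bb_p's second argument below doubles as its CODE_MAY_GROW
                 flag, so the larger size budget is only granted on edges that are
                 optimized for speed and are either unprofiled or at least as hot as
                 COUNT_THRESHOLD.  */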
    1301                 :     2662729 :               if (try_copy
    1302                 :      615905 :                   && BB_PARTITION (best->src) == BB_PARTITION (best->dest)
    1303                 :     3276841 :                   && copy_bb_p (best->dest,
    1304                 :      614112 :                                 optimize_edge_for_speed_p (best)
    1305                 :      976877 :                                 && (!best->count ().initialized_p ()
    1306                 :     2780185 :                                     || best->count () >= count_threshold)))
    1307                 :             :                 {
    1308                 :      245231 :                   basic_block new_bb;
    1309                 :             : 
    1310                 :      245231 :                   if (dump_file)
    1311                 :             :                     {
    1312                 :           0 :                       fprintf (dump_file, "Connection: %d %d ",
    1313                 :           0 :                                traces[t].last->index, best->dest->index);
    1314                 :           0 :                       if (!next_bb)
    1315                 :           0 :                         fputc ('\n', dump_file);
    1316                 :           0 :                       else if (next_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
    1317                 :           0 :                         fprintf (dump_file, "exit\n");
    1318                 :             :                       else
    1319                 :           0 :                         fprintf (dump_file, "%d\n", next_bb->index);
    1320                 :             :                     }
    1321                 :             : 
    1322                 :      245231 :                   new_bb = copy_bb (best->dest, best, traces[t].last, t);
    1323                 :      245231 :                   traces[t].last = new_bb;
    1324                 :      245231 :                   if (next_bb && next_bb != EXIT_BLOCK_PTR_FOR_FN (cfun))
    1325                 :             :                     {
    1326                 :       39950 :                       t = bbd[next_bb->index].start_of_trace;
    1327                 :       39950 :                       traces[last_trace].last->aux = traces[t].first;
    1328                 :       39950 :                       connected[t] = true;
    1329                 :       39950 :                       last_trace = t;
    1330                 :             :                     }
    1331                 :             :                   else
    1332                 :             :                     break;      /* Stop finding the successor traces.  */
    1333                 :             :                 }
    1334                 :             :               else
    1335                 :             :                 break;  /* Stop finding the successor traces.  */
    1336                 :             :             }
    1337                 :     1136279 :         }
    1338                 :             :     }
    1339                 :             : 
    1340                 :      570343 :   if (dump_file)
    1341                 :             :     {
    1342                 :          23 :       basic_block bb;
    1343                 :             : 
    1344                 :          23 :       fprintf (dump_file, "Final order:\n");
    1345                 :         207 :       for (bb = traces[0].first; bb; bb = (basic_block) bb->aux)
    1346                 :         184 :         fprintf (dump_file, "%d ", bb->index);
    1347                 :          23 :       fprintf (dump_file, "\n");
    1348                 :          23 :       fflush (dump_file);
    1349                 :             :     }
    1350                 :             : 
    1351                 :      570343 :   FREE (connected);
    1352                 :      570343 : }
    1353                 :             : 
    1354                 :             : /* Return true when BB can and should be copied. CODE_MAY_GROW is true
    1355                 :             :    when code size is allowed to grow by duplication.  */
    1356                 :             : 
    1357                 :             : static bool
    1358                 :     3019380 : copy_bb_p (const_basic_block bb, int code_may_grow)
    1359                 :             : {
    1360                 :     3019380 :   unsigned int size = 0;
    1361                 :     3019380 :   unsigned int max_size = uncond_jump_length;
    1362                 :     3019380 :   rtx_insn *insn;
    1363                 :             : 
    1364                 :     5722677 :   if (EDGE_COUNT (bb->preds) < 2)
    1365                 :             :     return false;
    1366                 :     3016890 :   if (!can_duplicate_block_p (bb))
    1367                 :             :     return false;
    1368                 :             : 
    1369                 :             :   /* Avoid duplicating blocks which have many successors (PR/13430).  */
    1370                 :     3015644 :   if (EDGE_COUNT (bb->succs) > 8)
    1371                 :             :     return false;
    1372                 :             : 
    1373                 :     3015640 :   if (code_may_grow && optimize_bb_for_speed_p (bb))
    1374                 :      260399 :     max_size *= param_max_grow_copy_bb_insns;
    1375                 :             : 
    1376                 :    19596447 :   FOR_BB_INSNS (bb, insn)
    1377                 :             :     {
    1378                 :    19282854 :       if (INSN_P (insn))
    1379                 :             :         {
    1380                 :    12388272 :           size += get_attr_min_length (insn);
    1381                 :    12388272 :           if (size > max_size)
    1382                 :             :             break;
    1383                 :             :         }
    1384                 :             :     }
    1385                 :             : 
    1386                 :     3015640 :   if (size <= max_size)
    1387                 :             :     return true;
    1388                 :             : 
    1389                 :     2702047 :   if (dump_file)
    1390                 :             :     {
    1391                 :          57 :       fprintf (dump_file,
    1392                 :             :                "Block %d can't be copied because its size = %u.\n",
    1393                 :          57 :                bb->index, size);
    1394                 :             :     }
    1395                 :             : 
    1396                 :             :   return false;
    1397                 :             : }
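/* Editor's sketch (not part of bb-reorder.cc): the size budget checked by copy_bb_p
   above, with made-up numbers.  The base budget is the length of one unconditional
   jump (roughly what duplicating the block saves); when CODE_MAY_GROW holds (and, in
   the function above, the block is also optimized for speed), it is multiplied by
   the max-grow-copy-bb-insns parameter, taken here to be 8 purely for illustration.
   The other copy_bb_p checks (predecessor count, duplicability, successor count)
   are left out.  */

#include <stdbool.h>
#include <stdio.h>

static bool
sketch_copy_bb_p (unsigned block_size, unsigned uncond_jump_len,
                  bool code_may_grow, unsigned max_grow_insns)
{
  unsigned max_size = uncond_jump_len;

  if (code_may_grow)
    max_size *= max_grow_insns;

  return block_size <= max_size;
}

int
main (void)
{
  /* A 20-byte block with a 5-byte jump: copied only if growth is allowed.  */
  printf ("cold edge: %d, hot edge: %d\n",
          sketch_copy_bb_p (20, 5, false, 8),
          sketch_copy_bb_p (20, 5, true, 8));
  return 0;
}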
    1398                 :             : 
    1399                 :             : /* Return the length of unconditional jump instruction.  */
    1400                 :             : 
    1401                 :             : int
    1402                 :      730151 : get_uncond_jump_length (void)
    1403                 :             : {
    1404                 :      730151 :   unsigned int length;
    1405                 :             : 
    1406                 :      730151 :   start_sequence ();
    1407                 :      730151 :   rtx_code_label *label = emit_label (gen_label_rtx ());
    1408                 :      730151 :   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (label));
    1409                 :      730151 :   length = get_attr_min_length (jump);
    1410                 :      730151 :   end_sequence ();
    1411                 :             : 
    1412                 :      730151 :   gcc_assert (length < INT_MAX);
    1413                 :      730151 :   return length;
    1414                 :             : }
    1415                 :             : 
    1416                 :             : /* Create a forwarder block to OLD_BB starting with NEW_LABEL and in the
    1417                 :             :    other partition wrt OLD_BB.  */
    1418                 :             : 
    1419                 :             : static basic_block
    1420                 :        5117 : create_eh_forwarder_block (rtx_code_label *new_label, basic_block old_bb)
    1421                 :             : {
    1422                 :             :   /* Split OLD_BB, so that EH pads always have only incoming EH edges;
    1423                 :             :      bb_has_eh_pred bbs are treated specially by the DF infrastructure.  */
    1424                 :        5117 :   old_bb = split_block_after_labels (old_bb)->dest;
    1425                 :             : 
    1426                 :             :   /* Put the new label and a jump in the new basic block.  */
    1427                 :        5117 :   rtx_insn *label = emit_label (new_label);
    1428                 :        5117 :   rtx_code_label *old_label = block_label (old_bb);
    1429                 :        5117 :   rtx_insn *jump = emit_jump_insn (targetm.gen_jump (old_label));
    1430                 :        5117 :   JUMP_LABEL (jump) = old_label;
    1431                 :             : 
    1432                 :             :   /* Create the new basic block and put it in last position.  */
    1433                 :        5117 :   basic_block last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
    1434                 :        5117 :   basic_block new_bb = create_basic_block (label, jump, last_bb);
    1435                 :        5117 :   new_bb->aux = last_bb->aux;
    1436                 :        5117 :   new_bb->count = old_bb->count;
    1437                 :        5117 :   last_bb->aux = new_bb;
    1438                 :             : 
    1439                 :        5117 :   emit_barrier_after_bb (new_bb);
    1440                 :             : 
    1441                 :        5117 :   make_single_succ_edge (new_bb, old_bb, 0);
    1442                 :             : 
    1443                 :             :   /* Make sure the new basic block is in the other partition.  */
    1444                 :        5117 :   unsigned new_partition = BB_PARTITION (old_bb);
    1445                 :        5117 :   new_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    1446                 :        5117 :   BB_SET_PARTITION (new_bb, new_partition);
    1447                 :             : 
    1448                 :        5117 :   return new_bb;
    1449                 :             : }
    1450                 :             : 
    1451                 :             : /* The common landing pad in block OLD_BB has edges from both partitions.
    1452                 :             :    Add a new landing pad that will just jump to the old one and split the
    1453                 :             :    edges so that no EH edge crosses partitions.  */
    1454                 :             : 
    1455                 :             : static void
    1456                 :           0 : sjlj_fix_up_crossing_landing_pad (basic_block old_bb)
    1457                 :             : {
    1458                 :           0 :   const unsigned lp_len = cfun->eh->lp_array->length ();
    1459                 :           0 :   edge_iterator ei;
    1460                 :           0 :   edge e;
    1461                 :             : 
    1462                 :             :   /* Generate the new common landing-pad label.  */
    1463                 :           0 :   rtx_code_label *new_label = gen_label_rtx ();
    1464                 :           0 :   LABEL_PRESERVE_P (new_label) = 1;
    1465                 :             : 
    1466                 :             :   /* Create the forwarder block.  */
    1467                 :           0 :   basic_block new_bb = create_eh_forwarder_block (new_label, old_bb);
    1468                 :             : 
    1469                 :             :   /* Create the map from old to new lp index and initialize it.  */
    1470                 :           0 :   unsigned *index_map = (unsigned *) alloca (lp_len * sizeof (unsigned));
    1471                 :           0 :   memset (index_map, 0, lp_len * sizeof (unsigned));
    1472                 :             : 
    1473                 :             :   /* Fix up the edges.  */
    1474                 :           0 :   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
    1475                 :           0 :     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
    1476                 :             :       {
    1477                 :           0 :         rtx_insn *insn = BB_END (e->src);
    1478                 :           0 :         rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
    1479                 :             : 
    1480                 :           0 :         gcc_assert (note != NULL);
    1481                 :           0 :         const unsigned old_index = INTVAL (XEXP (note, 0));
    1482                 :             : 
    1483                 :             :         /* Generate the new landing-pad structure.  */
    1484                 :           0 :         if (index_map[old_index] == 0)
    1485                 :             :           {
    1486                 :           0 :             eh_landing_pad old_lp = (*cfun->eh->lp_array)[old_index];
    1487                 :           0 :             eh_landing_pad new_lp = gen_eh_landing_pad (old_lp->region);
    1488                 :           0 :             new_lp->post_landing_pad = old_lp->post_landing_pad;
    1489                 :           0 :             new_lp->landing_pad = new_label;
    1490                 :           0 :             index_map[old_index] = new_lp->index;
    1491                 :             :           }
    1492                 :           0 :         XEXP (note, 0) = GEN_INT (index_map[old_index]);
    1493                 :             : 
    1494                 :             :         /* Adjust the edge to the new destination.  */
    1495                 :           0 :         redirect_edge_succ (e, new_bb);
    1496                 :           0 :       }
    1497                 :             :     else
    1498                 :           0 :       ei_next (&ei);
    1499                 :           0 : }
    1500                 :             : 
    1501                 :             : /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
    1502                 :             :    Add a new landing pad that will just jump to the old one and split the
    1503                 :             :    edges so that no EH edge crosses partitions.  */
    1504                 :             : 
    1505                 :             : static void
    1506                 :        5117 : dw2_fix_up_crossing_landing_pad (eh_landing_pad old_lp, basic_block old_bb)
    1507                 :             : {
    1508                 :        5117 :   eh_landing_pad new_lp;
    1509                 :        5117 :   edge_iterator ei;
    1510                 :        5117 :   edge e;
    1511                 :             : 
    1512                 :             :   /* Generate the new landing-pad structure.  */
    1513                 :        5117 :   new_lp = gen_eh_landing_pad (old_lp->region);
    1514                 :        5117 :   new_lp->post_landing_pad = old_lp->post_landing_pad;
    1515                 :        5117 :   new_lp->landing_pad = gen_label_rtx ();
    1516                 :        5117 :   LABEL_PRESERVE_P (new_lp->landing_pad) = 1;
    1517                 :             : 
    1518                 :             :   /* Create the forwarder block.  */
    1519                 :        5117 :   basic_block new_bb = create_eh_forwarder_block (new_lp->landing_pad, old_bb);
    1520                 :             : 
    1521                 :             :   /* Fix up the edges.  */
    1522                 :       43662 :   for (ei = ei_start (old_bb->preds); (e = ei_safe_edge (ei)) != NULL; )
    1523                 :       38545 :     if (e->src != new_bb && BB_PARTITION (e->src) == BB_PARTITION (new_bb))
    1524                 :             :       {
    1525                 :       29769 :         rtx_insn *insn = BB_END (e->src);
    1526                 :       29769 :         rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
    1527                 :             : 
    1528                 :       29769 :         gcc_assert (note != NULL);
    1529                 :       29769 :         gcc_checking_assert (INTVAL (XEXP (note, 0)) == old_lp->index);
    1530                 :       29769 :         XEXP (note, 0) = GEN_INT (new_lp->index);
    1531                 :             : 
    1532                 :             :         /* Adjust the edge to the new destination.  */
    1533                 :       29769 :         redirect_edge_succ (e, new_bb);
    1534                 :       29769 :       }
    1535                 :             :     else
    1536                 :        8776 :       ei_next (&ei);
    1537                 :        5117 : }
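
/* Illustrative sketch, not from GCC: the retargeting step shared by the two
   landing-pad fixup routines above.  The REG_EH_REGION note on the throwing
   insn names the landing-pad index, so moving an EH edge onto a forwarder
   block means rewriting that index and redirecting the CFG edge.  The
   parameters are hypothetical.  */

static void
retarget_eh_edge_sketch (edge e, basic_block forwarder, int new_lp_index)
{
  rtx_insn *insn = BB_END (e->src);
  rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);

  gcc_assert (note != NULL);
  XEXP (note, 0) = GEN_INT (new_lp_index);
  redirect_edge_succ (e, forwarder);
}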
    1538                 :             : 
    1539                 :             : 
    1540                 :             : /* Ensure that all hot bbs are included in a hot path through the
    1541                 :             :    procedure. This is done by calling this function twice, once
    1542                 :             :    with WALK_UP true (to look for paths from the entry to hot bbs) and
    1543                 :             :    once with WALK_UP false (to look for paths from hot bbs to the exit).
    1544                 :             :    Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
    1545                 :             :    to BBS_IN_HOT_PARTITION.  */
    1546                 :             : 
    1547                 :             : static unsigned int
    1548                 :      119980 : sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
    1549                 :             :                     vec<basic_block> *bbs_in_hot_partition)
    1550                 :             : {
    1551                 :             :   /* Callers check this.  */
    1552                 :      119980 :   gcc_checking_assert (cold_bb_count);
    1553                 :             : 
    1554                 :             :   /* Keep examining hot bbs while we still have some left to check
    1555                 :             :      and there are remaining cold bbs.  */
    1556                 :      119980 :   vec<basic_block> hot_bbs_to_check = bbs_in_hot_partition->copy ();
    1557                 :      119980 :   while (! hot_bbs_to_check.is_empty ()
    1558                 :     4818959 :          && cold_bb_count)
    1559                 :             :     {
    1560                 :     2349404 :       basic_block bb = hot_bbs_to_check.pop ();
    1561                 :     2349404 :       vec<edge, va_gc> *edges = walk_up ? bb->preds : bb->succs;
    1562                 :     2349404 :       edge e;
    1563                 :     2349404 :       edge_iterator ei;
    1564                 :     2349404 :       profile_probability highest_probability
    1565                 :     2349404 :                                  = profile_probability::uninitialized ();
    1566                 :     2349404 :       profile_count highest_count = profile_count::uninitialized ();
    1567                 :     2349404 :       bool found = false;
    1568                 :             : 
    1569                 :             :       /* Walk the preds/succs and check if there is at least one already
    1570                 :             :          marked hot. Keep track of the most frequent pred/succ so that we
    1571                 :             :          can mark it hot if we don't find one.  */
    1572                 :     2816666 :       FOR_EACH_EDGE (e, ei, edges)
    1573                 :             :         {
    1574                 :     2779989 :           basic_block reach_bb = walk_up ? e->src : e->dest;
    1575                 :             : 
    1576                 :     2779989 :           if (e->flags & EDGE_DFS_BACK)
    1577                 :       90875 :             continue;
    1578                 :             : 
    1579                 :             :           /* Do not expect profile insanities when profile was not adjusted.  */
    1580                 :     2700901 :           if (e->probability == profile_probability::never ()
    1581                 :     2331673 :               || e->count () == profile_count::zero ())
    1582                 :      358816 :             continue;
    1583                 :             : 
    1584                 :     2330298 :           if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
    1585                 :             :           {
    1586                 :             :             found = true;
    1587                 :             :             break;
    1588                 :             :           }
    1589                 :             :           /* Track the hottest edge here so that the following loop can
    1590                 :             :              select it: prefer the edge count when it is non-zero and
    1591                 :             :              fall back to the edge probability otherwise.  */
    1592                 :       17571 :           if (!(e->count () > highest_count))
    1593                 :       17565 :             highest_count = e->count ();
    1594                 :       17571 :           if (!highest_probability.initialized_p ()
    1595                 :      484833 :               || e->probability > highest_probability)
    1596                 :       17561 :             highest_probability = e->probability;
    1597                 :             :         }
    1598                 :             : 
    1599                 :             :       /* If bb is reached by (or reaches, in the case of !WALK_UP) another hot
    1600                 :             :          block (or unpartitioned, e.g. the entry block) then it is ok. If not,
    1601                 :             :          then the most frequent pred (or succ) needs to be adjusted.  In the
    1602                 :             :          case where multiple preds/succs have the same frequency (e.g. a
    1603                 :             :          50-50 branch), then both will be adjusted.  */
    1604                 :     2349404 :       if (found)
    1605                 :     2312727 :         continue;
    1606                 :             : 
    1607                 :       70923 :       FOR_EACH_EDGE (e, ei, edges)
    1608                 :             :         {
    1609                 :       34246 :           if (e->flags & EDGE_DFS_BACK)
    1610                 :       29980 :             continue;
    1611                 :             :           /* Do not expect profile insanities when profile was not adjusted.  */
    1612                 :       21512 :           if (e->probability == profile_probability::never ()
    1613                 :        4344 :               || e->count () == profile_count::zero ())
    1614                 :       17207 :             continue;
    1615                 :             :           /* Select the hottest edge using the edge count; if none was
    1616                 :             :              recorded, fall back to the edge probability.  */
    1617                 :        4266 :           if (highest_count.initialized_p ())
    1618                 :             :             {
    1619                 :        4266 :               if (!(e->count () >= highest_count))
    1620                 :           0 :                 continue;
    1621                 :             :             }
    1622                 :           0 :           else if (!(e->probability >= highest_probability))
    1623                 :           0 :             continue;
    1624                 :             : 
    1625                 :        4266 :           basic_block reach_bb = walk_up ? e->src : e->dest;
    1626                 :             : 
    1627                 :             :           /* We have a hot bb with an immediate dominator that is cold.
    1628                 :             :              The dominator needs to be re-marked hot.  */
    1629                 :        4266 :           BB_SET_PARTITION (reach_bb, BB_HOT_PARTITION);
    1630                 :        4266 :           if (dump_file)
    1631                 :           0 :             fprintf (dump_file, "Promoting bb %i to hot partition to sanitize "
    1632                 :             :                      "profile of bb %i in %s walk\n", reach_bb->index,
    1633                 :             :                      bb->index, walk_up ? "backward" : "forward");
    1634                 :        4266 :           cold_bb_count--;
    1635                 :             : 
    1636                 :             :           /* Now we need to examine newly-hot reach_bb to see if it is also
    1637                 :             :              dominated by a cold bb.  */
    1638                 :        4266 :           bbs_in_hot_partition->safe_push (reach_bb);
    1639                 :        4266 :           hot_bbs_to_check.safe_push (reach_bb);
    1640                 :             :         }
    1641                 :             :     }
    1642                 :      119980 :   hot_bbs_to_check.release ();
    1643                 :             : 
    1644                 :      119980 :   return cold_bb_count;
    1645                 :             : }
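
/* Illustrative sketch, not from GCC: how the two passes compose, mirroring
   the caller further down.  The first call (WALK_UP) connects every hot bb
   to the entry through hot blocks, the second connects it to the exit; the
   cold-bb count returned by the first call decides whether the second call
   is still needed.  Callers start with a non-zero cold-bb count.  */

static void
sanitize_hot_paths_both_ways_sketch (unsigned int cold_bb_count,
                                     vec<basic_block> *hot_bbs)
{
  mark_dfs_back_edges ();
  cold_bb_count = sanitize_hot_paths (true, cold_bb_count, hot_bbs);
  if (cold_bb_count)
    sanitize_hot_paths (false, cold_bb_count, hot_bbs);
}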
    1646                 :             : 
    1647                 :             : 
    1648                 :             : /* Find the basic blocks that are rarely executed and need to be moved to
    1649                 :             :    a separate section of the .o file (to cut down on paging and improve
    1650                 :             :    cache locality).  Return a vector of all edges that cross.  */
    1651                 :             : 
    1652                 :             : static vec<edge>
    1653                 :      251771 : find_rarely_executed_basic_blocks_and_crossing_edges (void)
    1654                 :             : {
    1655                 :      251771 :   vec<edge> crossing_edges = vNULL;
    1656                 :      251771 :   basic_block bb;
    1657                 :      251771 :   edge e;
    1658                 :      251771 :   edge_iterator ei;
    1659                 :      251771 :   unsigned int cold_bb_count = 0;
    1660                 :      251771 :   auto_vec<basic_block> bbs_in_hot_partition;
    1661                 :             : 
    1662                 :      251771 :   propagate_unlikely_bbs_forward ();
    1663                 :             : 
    1664                 :             :   /* Mark which partition (hot/cold) each basic block belongs in.  */
    1665                 :     4306032 :   FOR_EACH_BB_FN (bb, cfun)
    1666                 :             :     {
    1667                 :     4054261 :       bool cold_bb = false;
    1668                 :             : 
    1669                 :     4054261 :       if (probably_never_executed_bb_p (cfun, bb))
    1670                 :             :         {
    1671                 :      251039 :           cold_bb = true;
    1672                 :             : 
    1673                 :             :           /* Handle profile insanities created by upstream optimizations
    1674                 :             :              by also checking the incoming edge weights. If there is a non-cold
    1675                 :             :              incoming edge, conservatively prevent this block from being split
    1676                 :             :              into the cold section.  */
    1677                 :      251039 :           if (!bb->count.precise_p ())
    1678                 :         129 :             FOR_EACH_EDGE (e, ei, bb->preds)
    1679                 :          73 :               if (!probably_never_executed_edge_p (cfun, e))
    1680                 :             :                 {
    1681                 :             :                   cold_bb = false;
    1682                 :             :                   break;
    1683                 :             :                 }
    1684                 :             :         }
    1685                 :          56 :       if (cold_bb)
    1686                 :             :         {
    1687                 :      251039 :           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
    1688                 :      251039 :           cold_bb_count++;
    1689                 :             :         }
    1690                 :             :       else
    1691                 :             :         {
    1692                 :     3803222 :           BB_SET_PARTITION (bb, BB_HOT_PARTITION);
    1693                 :     3803222 :           bbs_in_hot_partition.safe_push (bb);
    1694                 :             :         }
    1695                 :             :     }
    1696                 :             : 
    1697                 :             :   /* Ensure that hot bbs are included along a hot path from the entry to exit.
    1698                 :             :      There are several ways in which every path to/from a hot bb can end up
    1699                 :             :      going through cold bbs.  One is that there are edge weight insanities
    1700                 :             :      due to optimization phases that do not properly update basic block profile
    1701                 :             :      counts. The second is that the entry of the function may not be hot, because
    1702                 :             :      it is entered fewer times than the number of profile training runs, but there
    1703                 :             :      is a loop inside the function that causes blocks within the function to be
    1704                 :             :      above the threshold for hotness. This is fixed by walking up from hot bbs
    1705                 :             :      to the entry block, and then down from hot bbs to the exit, performing
    1706                 :             :      partitioning fixups as necessary.  */
    1707                 :      251771 :   if (cold_bb_count)
    1708                 :             :     {
    1709                 :       59990 :       mark_dfs_back_edges ();
    1710                 :       59990 :       cold_bb_count = sanitize_hot_paths (true, cold_bb_count,
    1711                 :             :                                           &bbs_in_hot_partition);
    1712                 :       59990 :       if (cold_bb_count)
    1713                 :       59990 :         sanitize_hot_paths (false, cold_bb_count, &bbs_in_hot_partition);
    1714                 :             : 
    1715                 :       59990 :       hash_set <basic_block> set;
    1716                 :       59990 :       find_bbs_reachable_by_hot_paths (&set);
    1717                 :     1486454 :       FOR_EACH_BB_FN (bb, cfun)
    1718                 :     1426464 :         if (!set.contains (bb))
    1719                 :      246773 :           BB_SET_PARTITION (bb, BB_COLD_PARTITION);
    1720                 :       59990 :     }
    1721                 :             : 
    1722                 :             :   /* The format of .gcc_except_table does not allow landing pads to
    1723                 :             :      be in a different partition from the throw.  Fix this by either
    1724                 :             :      moving the landing pads or inserting forwarder landing pads.  */
    1725                 :      251771 :   if (cfun->eh->lp_array)
    1726                 :             :     {
    1727                 :      251771 :       const bool sjlj
    1728                 :      251771 :         = (targetm_common.except_unwind_info (&global_options) == UI_SJLJ);
    1729                 :      251771 :       unsigned i;
    1730                 :      251771 :       eh_landing_pad lp;
    1731                 :             : 
    1732                 :      818602 :       FOR_EACH_VEC_ELT (*cfun->eh->lp_array, i, lp)
    1733                 :             :         {
    1734                 :      566831 :           bool all_same, all_diff;
    1735                 :             : 
    1736                 :      566831 :           if (lp == NULL
    1737                 :       77604 :               || lp->landing_pad == NULL_RTX
    1738                 :       77604 :               || !LABEL_P (lp->landing_pad))
    1739                 :      489280 :             continue;
    1740                 :             : 
    1741                 :       77551 :           all_same = all_diff = true;
    1742                 :       77551 :           bb = BLOCK_FOR_INSN (lp->landing_pad);
    1743                 :      263603 :           FOR_EACH_EDGE (e, ei, bb->preds)
    1744                 :             :             {
    1745                 :      186052 :               gcc_assert (e->flags & EDGE_EH);
    1746                 :      186052 :               if (BB_PARTITION (bb) == BB_PARTITION (e->src))
    1747                 :             :                 all_diff = false;
    1748                 :             :               else
    1749                 :      134318 :                 all_same = false;
    1750                 :             :             }
    1751                 :             : 
    1752                 :       77551 :           if (all_same)
    1753                 :             :             ;
    1754                 :       61210 :           else if (all_diff)
    1755                 :             :             {
    1756                 :       56093 :               int which = BB_PARTITION (bb);
    1757                 :       56093 :               which ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    1758                 :       56093 :               BB_SET_PARTITION (bb, which);
    1759                 :             :             }
    1760                 :        5117 :           else if (sjlj)
    1761                 :           0 :             sjlj_fix_up_crossing_landing_pad (bb);
    1762                 :             :           else
    1763                 :        5117 :             dw2_fix_up_crossing_landing_pad (lp, bb);
    1764                 :             : 
    1765                 :             :           /* There is a single, common landing pad in SJLJ mode.  */
    1766                 :       77551 :           if (sjlj)
    1767                 :             :             break;
    1768                 :             :         }
    1769                 :             :     }
    1770                 :             : 
    1771                 :             :   /* Mark every edge that crosses between sections.  */
    1772                 :     4316266 :   FOR_EACH_BB_FN (bb, cfun)
    1773                 :    10238228 :     FOR_EACH_EDGE (e, ei, bb->succs)
    1774                 :             :       {
    1775                 :     6173733 :         unsigned int flags = e->flags;
    1776                 :             : 
    1777                 :             :         /* We should never have EDGE_CROSSING set yet.  */
    1778                 :     6173733 :         gcc_checking_assert ((flags & EDGE_CROSSING) == 0);
    1779                 :             : 
    1780                 :     6173733 :         if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
    1781                 :     6173733 :             && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
    1782                 :     5888425 :             && BB_PARTITION (e->src) != BB_PARTITION (e->dest))
    1783                 :             :           {
    1784                 :      403760 :             crossing_edges.safe_push (e);
    1785                 :      403760 :             flags |= EDGE_CROSSING;
    1786                 :             :           }
    1787                 :             : 
    1788                 :             :         /* Now that we've split eh edges as appropriate, allow landing pads
    1789                 :             :            to be merged with the post-landing pads.  */
    1790                 :     6173733 :         flags &= ~EDGE_PRESERVE;
    1791                 :             : 
    1792                 :     6173733 :         e->flags = flags;
    1793                 :             :       }
    1794                 :             : 
    1795                 :      251771 :   return crossing_edges;
    1796                 :      251771 : }
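
/* Illustrative sketch, not from GCC: the test applied by the final loop
   above.  An edge is a crossing edge when both of its ends are ordinary
   blocks and the two blocks were assigned different partitions.  */

static bool
edge_crosses_partitions_sketch (edge e)
{
  return (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
          && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
          && BB_PARTITION (e->src) != BB_PARTITION (e->dest));
}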
    1797                 :             : 
    1798                 :             : /* Set the flag EDGE_CAN_FALLTHRU for edges that can be fallthru.  */
    1799                 :             : 
    1800                 :             : static void
    1801                 :      602520 : set_edge_can_fallthru_flag (void)
    1802                 :             : {
    1803                 :      602520 :   basic_block bb;
    1804                 :             : 
    1805                 :     9764063 :   FOR_EACH_BB_FN (bb, cfun)
    1806                 :             :     {
    1807                 :     9161543 :       edge e;
    1808                 :     9161543 :       edge_iterator ei;
    1809                 :             : 
    1810                 :    22885639 :       FOR_EACH_EDGE (e, ei, bb->succs)
    1811                 :             :         {
    1812                 :    13724096 :           e->flags &= ~EDGE_CAN_FALLTHRU;
    1813                 :             : 
    1814                 :             :           /* A FALLTHRU edge is also a CAN_FALLTHRU edge.  */
    1815                 :    13724096 :           if (e->flags & EDGE_FALLTHRU)
    1816                 :     7877260 :             e->flags |= EDGE_CAN_FALLTHRU;
    1817                 :             :         }
    1818                 :             : 
    1819                 :             :       /* If the BB ends with an invertible condjump all (2) edges are
    1820                 :             :          CAN_FALLTHRU edges.  */
    1821                 :     9161543 :       if (EDGE_COUNT (bb->succs) != 2)
    1822                 :     4719708 :         continue;
    1823                 :     4963637 :       if (!any_condjump_p (BB_END (bb)))
    1824                 :      521802 :         continue;
    1825                 :             : 
    1826                 :     4441835 :       rtx_jump_insn *bb_end_jump = as_a <rtx_jump_insn *> (BB_END (bb));
    1827                 :     4441835 :       if (!invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0))
    1828                 :           0 :         continue;
    1829                 :     4441835 :       invert_jump (bb_end_jump, JUMP_LABEL (bb_end_jump), 0);
    1830                 :     4441835 :       EDGE_SUCC (bb, 0)->flags |= EDGE_CAN_FALLTHRU;
    1831                 :     4441835 :       EDGE_SUCC (bb, 1)->flags |= EDGE_CAN_FALLTHRU;
    1832                 :             :     }
    1833                 :      602520 : }
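
/* Illustrative sketch, not from GCC: the invert_jump pair above is only a
   test.  The first call tries to invert the condition in place; when it
   succeeds, the second call inverts it back, leaving the insn unchanged but
   proving that either successor could be made the fallthrough.  Assumes the
   caller already checked any_condjump_p (BB_END (bb)), as above.  */

static bool
condjump_invertible_p_sketch (basic_block bb)
{
  rtx_jump_insn *jump = as_a <rtx_jump_insn *> (BB_END (bb));
  if (!invert_jump (jump, JUMP_LABEL (jump), 0))
    return false;

  /* Undo the successful inversion so the insn is left as it was.  */
  invert_jump (jump, JUMP_LABEL (jump), 0);
  return true;
}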
    1834                 :             : 
    1835                 :             : /* If any destination of a crossing edge does not have a label, add one;
    1836                 :             :    convert any easy fall-through crossing edges to unconditional jumps.  */
    1837                 :             : 
    1838                 :             : static void
    1839                 :       59813 : add_labels_and_missing_jumps (vec<edge> crossing_edges)
    1840                 :             : {
    1841                 :       59813 :   size_t i;
    1842                 :       59813 :   edge e;
    1843                 :             : 
    1844                 :      463573 :   FOR_EACH_VEC_ELT (crossing_edges, i, e)
    1845                 :             :     {
    1846                 :      403760 :       basic_block src = e->src;
    1847                 :      403760 :       basic_block dest = e->dest;
    1848                 :      403760 :       rtx_jump_insn *new_jump;
    1849                 :             : 
    1850                 :      403760 :       if (dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
    1851                 :           0 :         continue;
    1852                 :             : 
    1853                 :             :       /* Make sure dest has a label.  */
    1854                 :      403760 :       rtx_code_label *label = block_label (dest);
    1855                 :             : 
    1856                 :             :       /* Nothing to do for non-fallthru edges.  */
    1857                 :      403760 :       if (src == ENTRY_BLOCK_PTR_FOR_FN (cfun))
    1858                 :           0 :         continue;
    1859                 :      403760 :       if ((e->flags & EDGE_FALLTHRU) == 0)
    1860                 :      238506 :         continue;
    1861                 :             : 
    1862                 :             :       /* If the block does not end with a control flow insn, then we
    1863                 :             :          can trivially add a jump to the end to fix up the crossing.
    1864                 :             :          Otherwise the jump will have to go in a new bb, which will
    1865                 :             :          be handled by the fix_up_fall_thru_edges function.  */
    1866                 :      165254 :       if (control_flow_insn_p (BB_END (src)))
    1867                 :      105290 :         continue;
    1868                 :             : 
    1869                 :             :       /* Make sure there's only one successor.  */
    1870                 :      119928 :       gcc_assert (single_succ_p (src));
    1871                 :             : 
    1872                 :       59964 :       new_jump = emit_jump_insn_after (targetm.gen_jump (label), BB_END (src));
    1873                 :       59964 :       BB_END (src) = new_jump;
    1874                 :       59964 :       JUMP_LABEL (new_jump) = label;
    1875                 :       59964 :       LABEL_NUSES (label) += 1;
    1876                 :             : 
    1877                 :       59964 :       emit_barrier_after_bb (src);
    1878                 :             : 
    1879                 :             :       /* Mark edge as non-fallthru.  */
    1880                 :       59964 :       e->flags &= ~EDGE_FALLTHRU;
    1881                 :             :     }
    1882                 :       59813 : }
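
/* Illustrative sketch, not from GCC: the fixup applied above when a simple
   block falls through across partitions.  An explicit jump to LABEL is
   appended, the block gets a trailing barrier, and the edge stops being a
   fallthrough edge.  SRC, E and LABEL are hypothetical parameters.  */

static void
materialize_fallthru_jump_sketch (basic_block src, edge e,
                                  rtx_code_label *label)
{
  rtx_jump_insn *jump
    = emit_jump_insn_after (targetm.gen_jump (label), BB_END (src));
  BB_END (src) = jump;
  JUMP_LABEL (jump) = label;
  LABEL_NUSES (label) += 1;

  emit_barrier_after_bb (src);
  e->flags &= ~EDGE_FALLTHRU;
}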
    1883                 :             : 
    1884                 :             : /* Find any bb's where the fall-through edge is a crossing edge (note that
    1885                 :             :    these bb's must also contain a conditional jump or end with a call
    1886                 :             :    instruction; we've already dealt with fall-through edges for blocks
    1887                 :             :    that didn't have a conditional jump or didn't end with a call instruction
    1888                 :             :    in the call to add_labels_and_missing_jumps).  Convert the fall-through
    1889                 :             :    edge to non-crossing edge by inserting a new bb to fall-through into.
    1890                 :             :    The new bb will contain an unconditional jump (crossing edge) to the
    1891                 :             :    original fall through destination.  */
    1892                 :             : 
    1893                 :             : static void
    1894                 :       59813 : fix_up_fall_thru_edges (void)
    1895                 :             : {
    1896                 :       59813 :   basic_block cur_bb;
    1897                 :             : 
    1898                 :     1492028 :   FOR_EACH_BB_FN (cur_bb, cfun)
    1899                 :             :     {
    1900                 :     1432215 :       edge succ1;
    1901                 :     1432215 :       edge succ2;
    1902                 :     1432215 :       edge fall_thru = NULL;
    1903                 :     1432215 :       edge cond_jump = NULL;
    1904                 :             : 
    1905                 :     1432215 :       fall_thru = NULL;
    1906                 :     1432215 :       if (EDGE_COUNT (cur_bb->succs) > 0)
    1907                 :     1340227 :         succ1 = EDGE_SUCC (cur_bb, 0);
    1908                 :             :       else
    1909                 :             :         succ1 = NULL;
    1910                 :             : 
    1911                 :     1432215 :       if (EDGE_COUNT (cur_bb->succs) > 1)
    1912                 :      850818 :         succ2 = EDGE_SUCC (cur_bb, 1);
    1913                 :             :       else
    1914                 :             :         succ2 = NULL;
    1915                 :             : 
    1916                 :             :       /* Find the fall-through edge.  */
    1917                 :             : 
    1918                 :     1432215 :       if (succ1
    1919                 :     1340227 :           && (succ1->flags & EDGE_FALLTHRU))
    1920                 :             :         {
    1921                 :             :           fall_thru = succ1;
    1922                 :             :           cond_jump = succ2;
    1923                 :             :         }
    1924                 :      741822 :       else if (succ2
    1925                 :      562744 :                && (succ2->flags & EDGE_FALLTHRU))
    1926                 :             :         {
    1927                 :             :           fall_thru = succ2;
    1928                 :             :           cond_jump = succ1;
    1929                 :             :         }
    1930                 :     1433514 :       else if (succ2 && EDGE_COUNT (cur_bb->succs) > 2)
    1931                 :        1299 :         fall_thru = find_fallthru_edge (cur_bb->succs);
    1932                 :             : 
    1933                 :     1253103 :       if (fall_thru && (fall_thru->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)))
    1934                 :             :         {
    1935                 :             :           /* Check to see if the fall-thru edge is a crossing edge.  */
    1936                 :             : 
    1937                 :     1196894 :           if (fall_thru->flags & EDGE_CROSSING)
    1938                 :             :             {
    1939                 :             :               /* The fall_thru edge crosses; now check the cond jump edge, if
    1940                 :             :                  it exists.  */
    1941                 :             : 
    1942                 :      105290 :               bool cond_jump_crosses = true;
    1943                 :      105290 :               int invert_worked = 0;
    1944                 :      105290 :               rtx_insn *old_jump = BB_END (cur_bb);
    1945                 :             : 
    1946                 :             :               /* Find the jump instruction, if there is one.  */
    1947                 :             : 
    1948                 :      105290 :               if (cond_jump)
    1949                 :             :                 {
    1950                 :      105137 :                   if (!(cond_jump->flags & EDGE_CROSSING))
    1951                 :      104918 :                     cond_jump_crosses = false;
    1952                 :             : 
    1953                 :             :                   /* We know the fall-thru edge crosses; if the cond
    1954                 :             :                      jump edge does NOT cross, and its destination is the
    1955                 :             :                      next block in the bb order, invert the jump
    1956                 :             :                      (i.e. fix it so the fall through does not cross and
    1957                 :             :                      the cond jump does).  */
    1958                 :             : 
    1959                 :      104918 :                   if (!cond_jump_crosses)
    1960                 :             :                     {
    1961                 :             :                       /* Find label in fall_thru block. We've already added
    1962                 :             :                          any missing labels, so there must be one.  */
    1963                 :             : 
    1964                 :      104918 :                       rtx_code_label *fall_thru_label
    1965                 :      104918 :                         = block_label (fall_thru->dest);
    1966                 :             : 
    1967                 :      104918 :                       if (old_jump && fall_thru_label)
    1968                 :             :                         {
    1969                 :      104918 :                           rtx_jump_insn *old_jump_insn
    1970                 :      208179 :                             = dyn_cast <rtx_jump_insn *> (old_jump);
    1971                 :      102889 :                           if (old_jump_insn)
    1972                 :      102889 :                             invert_worked = invert_jump (old_jump_insn,
    1973                 :             :                                                          fall_thru_label, 0);
    1974                 :             :                         }
    1975                 :             : 
    1976                 :      102889 :                       if (invert_worked)
    1977                 :             :                         {
    1978                 :      102879 :                           fall_thru->flags &= ~EDGE_FALLTHRU;
    1979                 :      102879 :                           cond_jump->flags |= EDGE_FALLTHRU;
    1980                 :      102879 :                           update_br_prob_note (cur_bb);
    1981                 :      102879 :                           std::swap (fall_thru, cond_jump);
    1982                 :      102879 :                           cond_jump->flags |= EDGE_CROSSING;
    1983                 :      102879 :                           fall_thru->flags &= ~EDGE_CROSSING;
    1984                 :             :                         }
    1985                 :             :                     }
    1986                 :             :                 }
    1987                 :             : 
    1988                 :      105290 :               if (cond_jump_crosses || !invert_worked)
    1989                 :             :                 {
    1990                 :             :                   /* This is the case where both edges out of the basic
    1991                 :             :                      block are crossing edges. Here we will fix up the
    1992                 :             :                      fall through edge. The jump edge will be taken care
    1993                 :             :                      of later.  The EDGE_CROSSING flag of the fall_thru
    1994                 :             :                      edge is cleared before the call to force_nonfallthru
    1995                 :             :                      because, if a new basic block is created, this edge
    1996                 :             :                      stays within the current section while the edge
    1997                 :             :                      between new_bb and fall_thru->dest becomes
    1998                 :             :                      EDGE_CROSSING.  */
    1999                 :             : 
    2000                 :        2411 :                   fall_thru->flags &= ~EDGE_CROSSING;
    2001                 :        2411 :                   unsigned old_count = EDGE_COUNT (cur_bb->succs);
    2002                 :        2411 :                   basic_block new_bb = force_nonfallthru (fall_thru);
    2003                 :             : 
    2004                 :        2411 :                   if (new_bb)
    2005                 :             :                     {
    2006                 :        2411 :                       new_bb->aux = cur_bb->aux;
    2007                 :        2411 :                       cur_bb->aux = new_bb;
    2008                 :             : 
    2009                 :             :                       /* This is done by force_nonfallthru_and_redirect.  */
    2010                 :        2411 :                       gcc_assert (BB_PARTITION (new_bb)
    2011                 :             :                                   == BB_PARTITION (cur_bb));
    2012                 :             : 
    2013                 :        2411 :                       edge e = single_succ_edge (new_bb);
    2014                 :        2411 :                       e->flags |= EDGE_CROSSING;
    2015                 :        4822 :                       if (EDGE_COUNT (cur_bb->succs) > old_count)
    2016                 :             :                         {
    2017                 :             :                           /* If asm goto has a crossing fallthrough edge
    2018                 :             :                              and at least one of the labels to the same bb,
    2019                 :             :                              force_nonfallthru can result in the fallthrough
    2020                 :             :                              edge being redirected and a new edge added for the
    2021                 :             :                              label or more labels to e->dest.  As we've
    2022                 :             :                              temporarily cleared EDGE_CROSSING flag on the
    2023                 :             :                              fallthrough edge, we need to restore it again.
    2024                 :             :                              See PR108596.  */
    2025                 :           3 :                           rtx_insn *j = BB_END (cur_bb);
    2026                 :           3 :                           gcc_checking_assert (JUMP_P (j)
    2027                 :             :                                                && (asm_noperands (PATTERN (j))
    2028                 :             :                                                    > 0));
    2029                 :           3 :                           edge e2 = find_edge (cur_bb, e->dest);
    2030                 :           3 :                           if (e2)
    2031                 :           3 :                             e2->flags |= EDGE_CROSSING;
    2032                 :             :                         }
    2033                 :             :                     }
    2034                 :             :                   else
    2035                 :             :                     {
    2036                 :             :                       /* If a new basic-block was not created; restore
    2037                 :             :                          the EDGE_CROSSING flag.  */
    2038                 :           0 :                       fall_thru->flags |= EDGE_CROSSING;
    2039                 :             :                     }
    2040                 :             : 
    2041                 :             :                   /* Add a barrier after the new jump.  */
    2042                 :        2411 :                   emit_barrier_after_bb (new_bb ? new_bb : cur_bb);
    2043                 :             :                 }
    2044                 :             :             }
    2045                 :             :         }
    2046                 :             :     }
    2047                 :       59813 : }
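
/* Illustrative sketch, not from GCC: the edge-swapping step above in
   isolation.  When the conditional jump can be retargeted at the old
   fallthrough destination, the two successor edges simply trade the
   EDGE_FALLTHRU and EDGE_CROSSING roles and no new block is needed.
   The parameters are hypothetical.  */

static bool
swap_fallthru_and_cond_jump_sketch (basic_block bb, edge *fall_thru,
                                    edge *cond_jump)
{
  rtx_jump_insn *jump = dyn_cast <rtx_jump_insn *> (BB_END (bb));
  rtx_code_label *label = block_label ((*fall_thru)->dest);

  if (!jump || !invert_jump (jump, label, 0))
    return false;

  (*fall_thru)->flags &= ~EDGE_FALLTHRU;
  (*cond_jump)->flags |= EDGE_FALLTHRU;
  update_br_prob_note (bb);
  std::swap (*fall_thru, *cond_jump);
  (*cond_jump)->flags |= EDGE_CROSSING;
  (*fall_thru)->flags &= ~EDGE_CROSSING;
  return true;
}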
    2048                 :             : 
    2049                 :             : /* This function checks the destination block of a "crossing jump" to
    2050                 :             :    see if it has any crossing predecessors that begin with a code label
    2051                 :             :    and end with an unconditional jump.  If so, it returns that predecessor
    2052                 :             :    block.  (This is to avoid creating lots of new basic blocks that all
    2053                 :             :    contain unconditional jumps to the same destination).  */
    2054                 :             : 
    2055                 :             : static basic_block
    2056                 :           0 : find_jump_block (basic_block jump_dest)
    2057                 :             : {
    2058                 :           0 :   basic_block source_bb = NULL;
    2059                 :           0 :   edge e;
    2060                 :           0 :   rtx_insn *insn;
    2061                 :           0 :   edge_iterator ei;
    2062                 :             : 
    2063                 :           0 :   FOR_EACH_EDGE (e, ei, jump_dest->preds)
    2064                 :           0 :     if (e->flags & EDGE_CROSSING)
    2065                 :             :       {
    2066                 :           0 :         basic_block src = e->src;
    2067                 :             : 
    2068                 :             :         /* Check each predecessor to see if it has a label, and contains
    2069                 :             :            only one executable instruction, which is an unconditional jump.
    2070                 :             :            If so, we can use it.  */
    2071                 :             : 
    2072                 :           0 :         if (LABEL_P (BB_HEAD (src)))
    2073                 :           0 :           for (insn = BB_HEAD (src);
    2074                 :           0 :                !INSN_P (insn) && insn != NEXT_INSN (BB_END (src));
    2075                 :           0 :                insn = NEXT_INSN (insn))
    2076                 :             :             {
    2077                 :           0 :               if (INSN_P (insn)
    2078                 :             :                   && insn == BB_END (src)
    2079                 :             :                   && JUMP_P (insn)
    2080                 :             :                   && !any_condjump_p (insn))
    2081                 :             :                 {
    2082                 :             :                   source_bb = src;
    2083                 :             :                   break;
    2084                 :             :                 }
    2085                 :             :             }
    2086                 :             : 
    2087                 :             :         if (source_bb)
    2088                 :             :           break;
    2089                 :             :       }
    2090                 :             : 
    2091                 :           0 :   return source_bb;
    2092                 :             : }
    2093                 :             : 
    2094                 :             : /* Find all BB's with conditional jumps that are crossing edges;
    2095                 :             :    insert a new bb and make the conditional jump branch to the new
    2096                 :             :    bb instead (make the new bb the same color so the conditional branch won't
    2097                 :             :    be a 'crossing' edge).  Insert an unconditional jump from the
    2098                 :             :    new bb to the original destination of the conditional jump.  */
    2099                 :             : 
    2100                 :             : static void
    2101                 :           0 : fix_crossing_conditional_branches (void)
    2102                 :             : {
    2103                 :           0 :   basic_block cur_bb;
    2104                 :           0 :   basic_block new_bb;
    2105                 :           0 :   basic_block dest;
    2106                 :           0 :   edge succ1;
    2107                 :           0 :   edge succ2;
    2108                 :           0 :   edge crossing_edge;
    2109                 :           0 :   edge new_edge;
    2110                 :           0 :   rtx set_src;
    2111                 :           0 :   rtx old_label = NULL_RTX;
    2112                 :           0 :   rtx_code_label *new_label;
    2113                 :             : 
    2114                 :           0 :   FOR_EACH_BB_FN (cur_bb, cfun)
    2115                 :             :     {
    2116                 :           0 :       crossing_edge = NULL;
    2117                 :           0 :       if (EDGE_COUNT (cur_bb->succs) > 0)
    2118                 :           0 :         succ1 = EDGE_SUCC (cur_bb, 0);
    2119                 :             :       else
    2120                 :             :         succ1 = NULL;
    2121                 :             : 
    2122                 :           0 :       if (EDGE_COUNT (cur_bb->succs) > 1)
    2123                 :           0 :         succ2 = EDGE_SUCC (cur_bb, 1);
    2124                 :             :       else
    2125                 :             :         succ2 = NULL;
    2126                 :             : 
    2127                 :             :       /* We already took care of fall-through edges, so only one successor
    2128                 :             :          can be a crossing edge.  */
    2129                 :             : 
    2130                 :           0 :       if (succ1 && (succ1->flags & EDGE_CROSSING))
    2131                 :             :         crossing_edge = succ1;
    2132                 :           0 :       else if (succ2 && (succ2->flags & EDGE_CROSSING))
    2133                 :             :         crossing_edge = succ2;
    2134                 :             : 
    2135                 :             :       if (crossing_edge)
    2136                 :             :         {
    2137                 :           0 :           rtx_insn *old_jump = BB_END (cur_bb);
    2138                 :             : 
    2139                 :             :           /* Check to make sure the jump instruction is a
    2140                 :             :              conditional jump.  */
    2141                 :             : 
    2142                 :           0 :           set_src = NULL_RTX;
    2143                 :             : 
    2144                 :           0 :           if (any_condjump_p (old_jump))
    2145                 :             :             {
    2146                 :           0 :               if (GET_CODE (PATTERN (old_jump)) == SET)
    2147                 :           0 :                 set_src = SET_SRC (PATTERN (old_jump));
    2148                 :           0 :               else if (GET_CODE (PATTERN (old_jump)) == PARALLEL)
    2149                 :             :                 {
    2150                 :           0 :                   set_src = XVECEXP (PATTERN (old_jump), 0,0);
    2151                 :           0 :                   if (GET_CODE (set_src) == SET)
    2152                 :           0 :                     set_src = SET_SRC (set_src);
    2153                 :             :                   else
    2154                 :             :                     set_src = NULL_RTX;
    2155                 :             :                 }
    2156                 :             :             }
    2157                 :             : 
    2158                 :           0 :           if (set_src && (GET_CODE (set_src) == IF_THEN_ELSE))
    2159                 :             :             {
    2160                 :           0 :               rtx_jump_insn *old_jump_insn =
    2161                 :           0 :                         as_a <rtx_jump_insn *> (old_jump);
    2162                 :             : 
    2163                 :           0 :               if (GET_CODE (XEXP (set_src, 1)) == PC)
    2164                 :           0 :                 old_label = XEXP (set_src, 2);
    2165                 :           0 :               else if (GET_CODE (XEXP (set_src, 2)) == PC)
    2166                 :           0 :                 old_label = XEXP (set_src, 1);
    2167                 :             : 
    2168                 :             :               /* Check to see if new bb for jumping to that dest has
    2169                 :             :                  already been created; if so, use it; if not, create
    2170                 :             :                  a new one.  */
    2171                 :             : 
    2172                 :           0 :               new_bb = find_jump_block (crossing_edge->dest);
    2173                 :             : 
    2174                 :           0 :               if (new_bb)
    2175                 :           0 :                 new_label = block_label (new_bb);
    2176                 :             :               else
    2177                 :             :                 {
    2178                 :           0 :                   basic_block last_bb;
    2179                 :           0 :                   rtx_code_label *old_jump_target;
    2180                 :           0 :                   rtx_jump_insn *new_jump;
    2181                 :             : 
    2182                 :             :                   /* Create new basic block to be dest for
    2183                 :             :                      conditional jump.  */
    2184                 :             : 
    2185                 :             :                   /* Put appropriate instructions in new bb.  */
    2186                 :             : 
    2187                 :           0 :                   new_label = gen_label_rtx ();
    2188                 :           0 :                   emit_label (new_label);
    2189                 :             : 
    2190                 :           0 :                   gcc_assert (GET_CODE (old_label) == LABEL_REF);
    2191                 :           0 :                   old_jump_target = old_jump_insn->jump_target ();
    2192                 :           0 :                   new_jump = as_a <rtx_jump_insn *>
    2193                 :           0 :                     (emit_jump_insn (targetm.gen_jump (old_jump_target)));
    2194                 :           0 :                   new_jump->set_jump_target (old_jump_target);
    2195                 :             : 
    2196                 :           0 :                   last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
    2197                 :           0 :                   new_bb = create_basic_block (new_label, new_jump, last_bb);
    2198                 :           0 :                   new_bb->aux = last_bb->aux;
    2199                 :           0 :                   last_bb->aux = new_bb;
    2200                 :             : 
    2201                 :           0 :                   emit_barrier_after_bb (new_bb);
    2202                 :             : 
    2203                 :             :                   /* Make sure new bb is in same partition as source
    2204                 :             :                      of conditional branch.  */
    2205                 :           0 :                   BB_COPY_PARTITION (new_bb, cur_bb);
    2206                 :             :                 }
    2207                 :             : 
    2208                 :             :               /* Make old jump branch to new bb.  */
    2209                 :             : 
    2210                 :           0 :               redirect_jump (old_jump_insn, new_label, 0);
    2211                 :             : 
    2212                 :             :               /* Remove crossing_edge as predecessor of 'dest'.  */
    2213                 :             : 
    2214                 :           0 :               dest = crossing_edge->dest;
    2215                 :             : 
    2216                 :           0 :               redirect_edge_succ (crossing_edge, new_bb);
    2217                 :             : 
    2218                 :             :               /* Make a new edge from new_bb to old dest; new edge
    2219                 :             :                  will be a successor for new_bb and a predecessor
    2220                 :             :                  for 'dest'.  */
    2221                 :             : 
    2222                 :           0 :               if (EDGE_COUNT (new_bb->succs) == 0)
    2223                 :           0 :                 new_edge = make_single_succ_edge (new_bb, dest, 0);
    2224                 :             :               else
    2225                 :           0 :                 new_edge = EDGE_SUCC (new_bb, 0);
    2226                 :             : 
    2227                 :           0 :               crossing_edge->flags &= ~EDGE_CROSSING;
    2228                 :           0 :               new_edge->flags |= EDGE_CROSSING;
    2229                 :             :             }
    2230                 :             :         }
    2231                 :             :     }
    2232                 :           0 : }
    2233                 :             : 
    2234                 :             : /* Find any unconditional branches that cross between hot and cold
    2235                 :             :    sections.  Convert them into indirect jumps instead.  */
    2236                 :             : 
    2237                 :             : static void
    2238                 :           0 : fix_crossing_unconditional_branches (void)
    2239                 :             : {
    2240                 :           0 :   basic_block cur_bb;
    2241                 :           0 :   rtx_insn *last_insn;
    2242                 :           0 :   rtx label;
    2243                 :           0 :   rtx label_addr;
    2244                 :           0 :   rtx_insn *indirect_jump_sequence;
    2245                 :           0 :   rtx_insn *jump_insn = NULL;
    2246                 :           0 :   rtx new_reg;
    2247                 :           0 :   rtx_insn *cur_insn;
    2248                 :           0 :   edge succ;
    2249                 :             : 
    2250                 :           0 :   FOR_EACH_BB_FN (cur_bb, cfun)
    2251                 :             :     {
    2252                 :           0 :       last_insn = BB_END (cur_bb);
    2253                 :             : 
    2254                 :           0 :       if (EDGE_COUNT (cur_bb->succs) < 1)
    2255                 :           0 :         continue;
    2256                 :             : 
    2257                 :           0 :       succ = EDGE_SUCC (cur_bb, 0);
    2258                 :             : 
    2259                 :             :       /* Check to see if bb ends in a crossing (unconditional) jump.  At
    2260                 :             :          this point, no crossing jumps should be conditional.  */
    2261                 :             : 
    2262                 :           0 :       if (JUMP_P (last_insn)
    2263                 :           0 :           && (succ->flags & EDGE_CROSSING))
    2264                 :             :         {
    2265                 :           0 :           gcc_assert (!any_condjump_p (last_insn));
    2266                 :             : 
    2267                 :             :           /* Make sure the jump is not already an indirect or table jump.  */
    2268                 :             : 
    2269                 :           0 :           if (!computed_jump_p (last_insn)
    2270                 :           0 :               && !tablejump_p (last_insn, NULL, NULL)
    2271                 :           0 :               && asm_noperands (PATTERN (last_insn)) < 0)
    2272                 :             :             {
    2273                 :             :               /* We have found a "crossing" unconditional branch.  Now
    2274                 :             :                  we must convert it to an indirect jump.  First create a
    2275                 :             :                  reference to the label, as the target for the jump.  */
    2276                 :             : 
    2277                 :           0 :               label = JUMP_LABEL (last_insn);
    2278                 :           0 :               label_addr = gen_rtx_LABEL_REF (Pmode, label);
    2279                 :           0 :               LABEL_NUSES (label) += 1;
    2280                 :             : 
    2281                 :             :               /* Get a register to use for the indirect jump.  */
    2282                 :             : 
    2283                 :           0 :               new_reg = gen_reg_rtx (Pmode);
    2284                 :             : 
    2285                 :             :               /* Generate the indirect jump sequence.  */
    2286                 :             : 
    2287                 :           0 :               start_sequence ();
    2288                 :           0 :               emit_move_insn (new_reg, label_addr);
    2289                 :           0 :               emit_indirect_jump (new_reg);
    2290                 :           0 :               indirect_jump_sequence = get_insns ();
    2291                 :           0 :               end_sequence ();
    2292                 :             : 
    2293                 :             :               /* Make sure every instruction in the new jump sequence has
    2294                 :             :                  its basic block set to be cur_bb.  */
    2295                 :             : 
    2296                 :           0 :               for (cur_insn = indirect_jump_sequence; cur_insn;
    2297                 :           0 :                    cur_insn = NEXT_INSN (cur_insn))
    2298                 :             :                 {
    2299                 :           0 :                   if (!BARRIER_P (cur_insn))
    2300                 :           0 :                     BLOCK_FOR_INSN (cur_insn) = cur_bb;
    2301                 :           0 :                   if (JUMP_P (cur_insn))
    2302                 :           0 :                     jump_insn = cur_insn;
    2303                 :             :                 }
    2304                 :             : 
    2305                 :             :               /* Insert the new (indirect) jump sequence immediately before
    2306                 :             :                  the unconditional jump, then delete the unconditional jump.  */
    2307                 :             : 
    2308                 :           0 :               emit_insn_before (indirect_jump_sequence, last_insn);
    2309                 :           0 :               delete_insn (last_insn);
    2310                 :             : 
    2311                 :           0 :               JUMP_LABEL (jump_insn) = label;
    2312                 :           0 :               LABEL_NUSES (label)++;
    2313                 :             : 
    2314                 :             :               /* Make BB_END for cur_bb be the jump instruction (NOT the
    2315                 :             :                  barrier instruction at the end of the sequence...).  */
    2316                 :             : 
    2317                 :           0 :               BB_END (cur_bb) = jump_insn;
    2318                 :             :             }
    2319                 :             :         }
    2320                 :             :     }
    2321                 :           0 : }
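                         :             :
                         :             : /* Illustration (editor's sketch, in target-neutral pseudo-assembly) of
                         :             :    the conversion performed above.  A crossing unconditional branch such
                         :             :    as
                         :             :
                         :             :        jmp   .Lcold_label         ; direct jump, possibly out of range
                         :             :
                         :             :    becomes a move of the label address into a fresh pseudo register
                         :             :    followed by an indirect jump through it, which can reach any address:
                         :             :
                         :             :        mov   rN, .Lcold_label
                         :             :        jmp   *rN                  ; indirect; register rN is made up  */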
    2322                 :             : 
    2323                 :             : /* Update CROSSING_JUMP_P flags on all jump insns.  */
    2324                 :             : 
    2325                 :             : static void
    2326                 :       59813 : update_crossing_jump_flags (void)
    2327                 :             : {
    2328                 :       59813 :   basic_block bb;
    2329                 :       59813 :   edge e;
    2330                 :       59813 :   edge_iterator ei;
    2331                 :             : 
    2332                 :     1492028 :   FOR_EACH_BB_FN (bb, cfun)
    2333                 :     2952718 :     FOR_EACH_EDGE (e, ei, bb->succs)
    2334                 :     1923767 :       if (e->flags & EDGE_CROSSING)
    2335                 :             :         {
    2336                 :      403264 :           if (JUMP_P (BB_END (bb)))
    2337                 :      403129 :             CROSSING_JUMP_P (BB_END (bb)) = 1;
    2338                 :             :           break;
    2339                 :             :         }
    2340                 :       59813 : }
    2341                 :             : 
    2342                 :             : /* Reorder basic blocks using the software trace cache (STC) algorithm.  */
    2343                 :             : 
    2344                 :             : static void
    2345                 :      570343 : reorder_basic_blocks_software_trace_cache (void)
    2346                 :             : {
    2347                 :      570343 :   if (dump_file)
    2348                 :          23 :     fprintf (dump_file, "\nReordering with the STC algorithm.\n\n");
    2349                 :             : 
    2350                 :      570343 :   int n_traces;
    2351                 :      570343 :   int i;
    2352                 :      570343 :   struct trace *traces;
    2353                 :             : 
    2354                 :             :   /* We estimate the length of the uncond jump insn only once, since the code
    2355                 :             :      for getting the insn length always returns the minimal length now.  */
    2356                 :      570343 :   if (uncond_jump_length == 0)
    2357                 :        8687 :     uncond_jump_length = get_uncond_jump_length ();
    2358                 :             : 
    2359                 :             :   /* We need to know some information for each basic block.  */
    2360                 :      570343 :   array_size = GET_ARRAY_SIZE (last_basic_block_for_fn (cfun));
    2361                 :      570343 :   bbd = XNEWVEC (bbro_basic_block_data, array_size);
    2362                 :    15426408 :   for (i = 0; i < array_size; i++)
    2363                 :             :     {
    2364                 :    14856065 :       bbd[i].start_of_trace = -1;
    2365                 :    14856065 :       bbd[i].end_of_trace = -1;
    2366                 :    14856065 :       bbd[i].in_trace = -1;
    2367                 :    14856065 :       bbd[i].visited = 0;
    2368                 :    14856065 :       bbd[i].priority = -1;
    2369                 :    14856065 :       bbd[i].heap = NULL;
    2370                 :    14856065 :       bbd[i].node = NULL;
    2371                 :             :     }
    2372                 :             : 
    2373                 :      570343 :   traces = XNEWVEC (struct trace, n_basic_blocks_for_fn (cfun));
    2374                 :      570343 :   n_traces = 0;
    2375                 :      570343 :   find_traces (&n_traces, traces);
    2376                 :      570343 :   connect_traces (n_traces, traces);
    2377                 :      570343 :   FREE (traces);
    2378                 :      570343 :   FREE (bbd);
    2379                 :      570343 : }
    2380                 :             : 
    2381                 :             : /* Order edges by execution frequency, higher first.  */
    2382                 :             : 
    2383                 :             : static int
    2384                 :    17543058 : edge_order (const void *ve1, const void *ve2)
    2385                 :             : {
    2386                 :    17543058 :   edge e1 = *(const edge *) ve1;
    2387                 :    17543058 :   edge e2 = *(const edge *) ve2;
    2388                 :    17543058 :   profile_count c1 = e1->count ();
    2389                 :    17543058 :   profile_count c2 = e2->count ();
    2390                 :             :   /* Since profile_count::operator< does not establish a strict weak order
    2391                 :             :      in presence of uninitialized counts, use 'max': this makes them appear
    2392                 :             :      as if having execution frequency less than any initialized count.  */
    2393                 :    17543058 :   profile_count m = c1.max (c2);
    2394                 :    35086116 :   return (m == c2) - (m == c1);
    2395                 :             : }
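                         :             :
                         :             : /* Editor's illustration (standalone sketch, not GCC API): the same
                         :             :    "compare via max" trick used by edge_order above, modelled with
                         :             :    std::optional<long>, where an empty optional plays the role of an
                         :             :    uninitialized profile count.  The hypothetical max_count helper treats
                         :             :    "uninitialized" as smaller than any real count, so the comparator
                         :             :    orders real counts descending and moves uninitialized ones to the end.
                         :             :    Build with a C++17 compiler, e.g. g++ -std=c++17.  */
                         :             :
                         :             : #include <algorithm>
                         :             : #include <cstdio>
                         :             : #include <optional>
                         :             : #include <vector>
                         :             :
                         :             : typedef std::optional<long> count_t;
                         :             :
                         :             : /* Larger of two counts; an empty ("uninitialized") count loses to any
                         :             :    initialized one.  */
                         :             : static count_t
                         :             : max_count (count_t a, count_t b)
                         :             : {
                         :             :   if (!a.has_value ())
                         :             :     return b;
                         :             :   if (!b.has_value ())
                         :             :     return a;
                         :             :   return std::max (*a, *b);
                         :             : }
                         :             :
                         :             : /* Same shape as edge_order: negative if C1 sorts first, positive if C2
                         :             :    does, zero if they are tied.  */
                         :             : static int
                         :             : count_order (const count_t &c1, const count_t &c2)
                         :             : {
                         :             :   count_t m = max_count (c1, c2);
                         :             :   return (m == c2) - (m == c1);
                         :             : }
                         :             :
                         :             : int
                         :             : main ()
                         :             : {
                         :             :   std::vector<count_t> v = { 3L, std::nullopt, 7L, 5L };
                         :             :   std::stable_sort (v.begin (), v.end (),
                         :             :                     [] (const count_t &a, const count_t &b)
                         :             :                     { return count_order (a, b) < 0; });
                         :             :   for (const count_t &c : v)
                         :             :     if (c.has_value ())
                         :             :       std::printf ("%ld ", *c);
                         :             :     else
                         :             :       std::printf ("? ");
                         :             :   std::printf ("\n");   /* prints: 7 5 3 ?  */
                         :             : }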
    2396                 :             : 
    2397                 :             : /* Reorder basic blocks using the "simple" algorithm.  This tries to
    2398                 :             :    maximize the dynamic number of branches that are fallthrough, without
    2399                 :             :    copying instructions.  The algorithm is greedy, looking at the most
    2400                 :             :    frequently executed branch first.  */
    2401                 :             : 
    2402                 :             : static void
    2403                 :       32177 : reorder_basic_blocks_simple (void)
    2404                 :             : {
    2405                 :       32177 :   if (dump_file)
    2406                 :           8 :     fprintf (dump_file, "\nReordering with the \"simple\" algorithm.\n\n");
    2407                 :             : 
    2408                 :       32177 :   edge *edges = new edge[2 * n_basic_blocks_for_fn (cfun)];
    2409                 :             : 
    2410                 :             :   /* First, collect all edges that can be optimized by reordering blocks:
    2411                 :             :      simple jumps and conditional jumps, as well as the function entry edge.  */
    2412                 :             : 
    2413                 :       32177 :   int n = 0;
    2414                 :       32177 :   edges[n++] = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
    2415                 :             : 
    2416                 :       32177 :   basic_block bb;
    2417                 :      502381 :   FOR_EACH_BB_FN (bb, cfun)
    2418                 :             :     {
    2419                 :      470204 :       rtx_insn *end = BB_END (bb);
    2420                 :             : 
    2421                 :      470204 :       if (computed_jump_p (end) || tablejump_p (end, NULL, NULL))
    2422                 :         372 :         continue;
    2423                 :             : 
    2424                 :             :       /* We cannot optimize asm goto.  */
    2425                 :      469832 :       if (JUMP_P (end) && extract_asm_operands (end))
    2426                 :           0 :         continue;
    2427                 :             : 
    2428                 :      469832 :       if (single_succ_p (bb))
    2429                 :      138197 :         edges[n++] = EDGE_SUCC (bb, 0);
    2430                 :      331635 :       else if (any_condjump_p (end))
    2431                 :             :         {
    2432                 :      241390 :           edge e0 = EDGE_SUCC (bb, 0);
    2433                 :      241390 :           edge e1 = EDGE_SUCC (bb, 1);
    2434                 :             :           /* When optimizing for size it is best to keep the original
    2435                 :             :              fallthrough edges.  */
    2436                 :      241390 :           if (e1->flags & EDGE_FALLTHRU)
    2437                 :      134108 :             std::swap (e0, e1);
    2438                 :      241390 :           edges[n++] = e0;
    2439                 :      241390 :           edges[n++] = e1;
    2440                 :             :         }
    2441                 :             :     }
    2442                 :             : 
    2443                 :             :   /* Sort the edges, the most desirable first.  When optimizing for size
    2444                 :             :      all edges are equally desirable.  */
    2445                 :             : 
    2446                 :       32177 :   if (optimize_function_for_speed_p (cfun))
    2447                 :       31993 :     gcc_stablesort (edges, n, sizeof *edges, edge_order);
    2448                 :             : 
    2449                 :             :   /* Now decide which of those edges to make fallthrough edges.  We set
    2450                 :             :      BB_VISITED if a block already has a fallthrough successor assigned
    2451                 :             :      to it.  We make ->AUX of an endpoint point to the opposite endpoint
    2452                 :             :      of a sequence of blocks that fall through, and ->AUX will be NULL
    2453                 :             :      for a block that is in such a sequence but not an endpoint anymore.
    2454                 :             : 
    2455                 :             :      To start with, everything points to itself, nothing is assigned yet.  */
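                         :             :
                         :             :   /* A worked example of the invariant above: suppose the edge A->B has
                         :             :      already been chosen, so A and B form a sequence with A->aux == B,
                         :             :      B->aux == A, and BB_VISITED set on A, while a lone block C still has
                         :             :      C->aux == C.  Choosing an edge B->C then makes B an interior block
                         :             :      (B->aux = 0), links the new endpoints to each other (A->aux = C,
                         :             :      C->aux = A), and sets BB_VISITED on B, since B now has a fallthrough
                         :             :      successor.  */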
    2456                 :             : 
    2457                 :      566735 :   FOR_ALL_BB_FN (bb, cfun)
    2458                 :             :     {
    2459                 :      534558 :       bb->aux = bb;
    2460                 :      534558 :       bb->flags &= ~BB_VISITED;
    2461                 :             :     }
    2462                 :             : 
    2463                 :       32177 :   EXIT_BLOCK_PTR_FOR_FN (cfun)->aux = 0;
    2464                 :             : 
    2465                 :             :   /* Now for all edges, the most desirable first, see if that edge can
    2466                 :             :      connect two sequences.  If it can, update AUX and BB_VISITED; if it
    2467                 :             :      cannot, zero out the edge in the table.  */
    2468                 :             : 
    2469                 :      685331 :   for (int j = 0; j < n; j++)
    2470                 :             :     {
    2471                 :      653154 :       edge e = edges[j];
    2472                 :             : 
    2473                 :      653154 :       basic_block tail_a = e->src;
    2474                 :      653154 :       basic_block head_b = e->dest;
    2475                 :      653154 :       basic_block head_a = (basic_block) tail_a->aux;
    2476                 :      653154 :       basic_block tail_b = (basic_block) head_b->aux;
    2477                 :             : 
    2478                 :             :       /* An edge cannot connect two sequences if:
    2479                 :             :          - it crosses partitions;
    2480                 :             :          - its src is not a current endpoint;
    2481                 :             :          - its dest is not a current endpoint;
    2482                 :             :          - or, it would create a loop.  */
    2483                 :             : 
    2484                 :      653154 :       if (e->flags & EDGE_CROSSING
    2485                 :      653139 :           || tail_a->flags & BB_VISITED
    2486                 :      443098 :           || !tail_b
    2487                 :      378825 :           || (!(head_b->flags & BB_VISITED) && head_b != tail_b)
    2488                 :      352090 :           || tail_a == tail_b)
    2489                 :             :         {
    2490                 :      331921 :           edges[j] = 0;
    2491                 :      331921 :           continue;
    2492                 :             :         }
    2493                 :             : 
    2494                 :      321233 :       tail_a->aux = 0;
    2495                 :      321233 :       head_b->aux = 0;
    2496                 :      321233 :       head_a->aux = tail_b;
    2497                 :      321233 :       tail_b->aux = head_a;
    2498                 :      321233 :       tail_a->flags |= BB_VISITED;
    2499                 :             :     }
    2500                 :             : 
    2501                 :             :   /* Put the pieces together, in the same order that the start blocks of
    2502                 :             :      the sequences already had.  The hot/cold partitioning gives a little
    2503                 :             :      complication: as a first pass only do this for blocks in the same
    2504                 :             :      partition as the start block, and (if there is anything left to do)
    2505                 :             :      in a second pass handle the other partition.  */
    2506                 :             : 
    2507                 :       32177 :   basic_block last_tail = (basic_block) ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
    2508                 :             : 
    2509                 :       32177 :   int current_partition
    2510                 :       32177 :     = BB_PARTITION (last_tail == ENTRY_BLOCK_PTR_FOR_FN (cfun)
    2511                 :             :                     ? EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0)->dest
    2512                 :             :                     : last_tail);
    2513                 :       32177 :   bool need_another_pass = true;
    2514                 :             : 
    2515                 :       64363 :   for (int pass = 0; pass < 2 && need_another_pass; pass++)
    2516                 :             :     {
    2517                 :       32186 :       need_another_pass = false;
    2518                 :             : 
    2519                 :      502449 :       FOR_EACH_BB_FN (bb, cfun)
    2520                 :      470263 :         if ((bb->flags & BB_VISITED && bb->aux) || bb->aux == bb)
    2521                 :             :           {
    2522                 :      148991 :             if (BB_PARTITION (bb) != current_partition)
    2523                 :             :               {
    2524                 :          20 :                 need_another_pass = true;
    2525                 :          20 :                 continue;
    2526                 :             :               }
    2527                 :             : 
    2528                 :      148971 :             last_tail->aux = bb;
    2529                 :      148971 :             last_tail = (basic_block) bb->aux;
    2530                 :             :           }
    2531                 :             : 
    2532                 :       32186 :       current_partition ^= BB_HOT_PARTITION | BB_COLD_PARTITION;
    2533                 :             :     }
    2534                 :             : 
    2535                 :       32177 :   last_tail->aux = 0;
    2536                 :             : 
    2537                 :             :   /* Finally, link all the chosen fallthrough edges.  */
    2538                 :             : 
    2539                 :      685331 :   for (int j = 0; j < n; j++)
    2540                 :      653154 :     if (edges[j])
    2541                 :      321233 :       edges[j]->src->aux = edges[j]->dest;
    2542                 :             : 
    2543                 :       32177 :   delete[] edges;
    2544                 :             : 
    2545                 :             :   /* If the entry edge no longer falls through we have to make a new
    2546                 :             :      block so it can do so again.  */
    2547                 :             : 
    2548                 :       32177 :   edge e = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
    2549                 :       32177 :   if (e->dest != ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux)
    2550                 :             :     {
    2551                 :           7 :       force_nonfallthru (e);
    2552                 :           7 :       e->src->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
    2553                 :             :     }
    2554                 :       32177 : }
    2555                 :             : 
    2556                 :             : /* Reorder basic blocks.  The main entry point to this file.  */
    2557                 :             : 
    2558                 :             : static void
    2559                 :      977861 : reorder_basic_blocks (void)
    2560                 :             : {
    2561                 :      977861 :   gcc_assert (current_ir_type () == IR_RTL_CFGLAYOUT);
    2562                 :             : 
    2563                 :      977861 :   if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1)
    2564                 :             :     return;
    2565                 :             : 
    2566                 :      602520 :   set_edge_can_fallthru_flag ();
    2567                 :      602520 :   mark_dfs_back_edges ();
    2568                 :             : 
    2569                 :      602520 :   switch (flag_reorder_blocks_algorithm)
    2570                 :             :     {
    2571                 :       32177 :     case REORDER_BLOCKS_ALGORITHM_SIMPLE:
    2572                 :       32177 :       reorder_basic_blocks_simple ();
    2573                 :       32177 :       break;
    2574                 :             : 
    2575                 :      570343 :     case REORDER_BLOCKS_ALGORITHM_STC:
    2576                 :      570343 :       reorder_basic_blocks_software_trace_cache ();
    2577                 :      570343 :       break;
    2578                 :             : 
    2579                 :           0 :     default:
    2580                 :           0 :       gcc_unreachable ();
    2581                 :             :     }
    2582                 :             : 
    2583                 :      602520 :   relink_block_chain (/*stay_in_cfglayout_mode=*/true);
    2584                 :             : 
    2585                 :      602520 :   if (dump_file)
    2586                 :             :     {
    2587                 :          31 :       if (dump_flags & TDF_DETAILS)
    2588                 :           0 :         dump_reg_info (dump_file);
    2589                 :          31 :       dump_flow_info (dump_file, dump_flags);
    2590                 :             :     }
    2591                 :             : 
    2592                 :             :   /* Signal that rtl_verify_flow_info_1 can now verify that there
    2593                 :             :      is at most one switch between hot/cold sections.  */
    2594                 :      602520 :   crtl->bb_reorder_complete = true;
    2595                 :             : }
    2596                 :             : 
    2597                 :             : /* Determine which partition the first basic block in the function
    2598                 :             :    belongs to, then find the first basic block in the current function
    2599                 :             :    that belongs to a different section, and insert a
    2600                 :             :    NOTE_INSN_SWITCH_TEXT_SECTIONS note immediately before it in the
    2601                 :             :    instruction stream.  When writing out the assembly code,
    2602                 :             :    encountering this note will make the compiler switch between the
    2603                 :             :    hot and cold text sections.  */
    2604                 :             : 
    2605                 :             : void
    2606                 :       59813 : insert_section_boundary_note (void)
    2607                 :             : {
    2608                 :       59813 :   basic_block bb;
    2609                 :       59813 :   bool switched_sections = false;
    2610                 :       59813 :   int current_partition = 0;
    2611                 :             : 
    2612                 :       59813 :   if (!crtl->has_bb_partition)
    2613                 :             :     return;
    2614                 :             : 
    2615                 :     1524651 :   FOR_EACH_BB_FN (bb, cfun)
    2616                 :             :     {
    2617                 :     1464838 :       if (!current_partition)
    2618                 :       59813 :         current_partition = BB_PARTITION (bb);
    2619                 :     1464838 :       if (BB_PARTITION (bb) != current_partition)
    2620                 :             :         {
    2621                 :       59811 :           gcc_assert (!switched_sections);
    2622                 :       59811 :           switched_sections = true;
    2623                 :       59811 :           emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
    2624                 :       59811 :           current_partition = BB_PARTITION (bb);
    2625                 :             :         }
    2626                 :             :     }
    2627                 :             : 
    2628                 :             :   /* Make sure crtl->has_bb_partition matches reality even if bbpart finds
    2629                 :             :      some hot and some cold basic blocks, but later one of those kinds is
    2630                 :             :      optimized away.  */
    2631                 :       59813 :   crtl->has_bb_partition = switched_sections;
    2632                 :             : }
    2633                 :             : 
    2634                 :             : namespace {
    2635                 :             : 
    2636                 :             : const pass_data pass_data_reorder_blocks =
    2637                 :             : {
    2638                 :             :   RTL_PASS, /* type */
    2639                 :             :   "bbro", /* name */
    2640                 :             :   OPTGROUP_NONE, /* optinfo_flags */
    2641                 :             :   TV_REORDER_BLOCKS, /* tv_id */
    2642                 :             :   0, /* properties_required */
    2643                 :             :   0, /* properties_provided */
    2644                 :             :   0, /* properties_destroyed */
    2645                 :             :   0, /* todo_flags_start */
    2646                 :             :   0, /* todo_flags_finish */
    2647                 :             : };
    2648                 :             : 
    2649                 :             : class pass_reorder_blocks : public rtl_opt_pass
    2650                 :             : {
    2651                 :             : public:
    2652                 :      285617 :   pass_reorder_blocks (gcc::context *ctxt)
    2653                 :      571234 :     : rtl_opt_pass (pass_data_reorder_blocks, ctxt)
    2654                 :             :   {}
    2655                 :             : 
    2656                 :             :   /* opt_pass methods: */
    2657                 :     1419745 :   bool gate (function *) final override
    2658                 :             :     {
    2659                 :     1419745 :       if (targetm.cannot_modify_jumps_p ())
    2660                 :             :         return false;
    2661                 :     1419745 :       return (optimize > 0
    2662                 :     1419745 :               && (flag_reorder_blocks || flag_reorder_blocks_and_partition));
    2663                 :             :     }
    2664                 :             : 
    2665                 :             :   unsigned int execute (function *) final override;
    2666                 :             : 
    2667                 :             : }; // class pass_reorder_blocks
    2668                 :             : 
    2669                 :             : unsigned int
    2670                 :      977861 : pass_reorder_blocks::execute (function *fun)
    2671                 :             : {
    2672                 :      977861 :   basic_block bb;
    2673                 :             : 
    2674                 :             :   /* Last attempt to optimize CFG, as scheduling, peepholing and insn
    2675                 :             :      splitting possibly introduced more crossjumping opportunities.  */
    2676                 :      977861 :   cfg_layout_initialize (CLEANUP_EXPENSIVE);
    2677                 :             : 
    2678                 :      977861 :   reorder_basic_blocks ();
    2679                 :      977861 :   cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_NO_PARTITIONING);
    2680                 :             : 
    2681                 :    10509842 :   FOR_EACH_BB_FN (bb, fun)
    2682                 :     9531981 :     if (bb->next_bb != EXIT_BLOCK_PTR_FOR_FN (fun))
    2683                 :     8554120 :       bb->aux = bb->next_bb;
    2684                 :      977861 :   cfg_layout_finalize ();
    2685                 :             : 
    2686                 :    10669557 :   FOR_EACH_BB_FN (bb, fun)
    2687                 :     9691696 :     df_recompute_luids (bb);
    2688                 :      977861 :   return 0;
    2689                 :             : }
    2690                 :             : 
    2691                 :             : } // anon namespace
    2692                 :             : 
    2693                 :             : rtl_opt_pass *
    2694                 :      285617 : make_pass_reorder_blocks (gcc::context *ctxt)
    2695                 :             : {
    2696                 :      285617 :   return new pass_reorder_blocks (ctxt);
    2697                 :             : }
    2698                 :             : 
    2699                 :             : /* Duplicate a block (that we already know ends in a computed jump) into its
    2700                 :             :    predecessors, where possible.  Return whether anything is changed.  */
    2701                 :             : static bool
    2702                 :         885 : maybe_duplicate_computed_goto (basic_block bb, int max_size)
    2703                 :             : {
    2704                 :             :   /* Make sure that the block is small enough.  */
    2705                 :         885 :   rtx_insn *insn;
    2706                 :        8562 :   FOR_BB_INSNS (bb, insn)
    2707                 :        8327 :     if (INSN_P (insn))
    2708                 :             :       {
    2709                 :        5886 :         max_size -= get_attr_min_length (insn);
    2710                 :        5886 :         if (max_size < 0)
    2711                 :             :            return false;
    2712                 :             :       }
    2713                 :             : 
    2714                 :         235 :   bool changed = false;
    2715                 :         235 :   edge e;
    2716                 :         235 :   edge_iterator ei;
    2717                 :         794 :   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
    2718                 :             :     {
    2719                 :         559 :       basic_block pred = e->src;
    2720                 :             : 
    2721                 :             :       /* Do not duplicate BB into PRED if we cannot merge a copy of BB
    2722                 :             :          with PRED.  */
    2723                 :         559 :       if (!single_succ_p (pred)
    2724                 :         207 :           || e->flags & EDGE_COMPLEX
    2725                 :         207 :           || pred->index < NUM_FIXED_BLOCKS
    2726                 :         185 :           || (JUMP_P (BB_END (pred)) && !simplejump_p (BB_END (pred)))
    2727                 :         744 :           || (JUMP_P (BB_END (pred)) && CROSSING_JUMP_P (BB_END (pred))))
    2728                 :             :         {
    2729                 :         376 :           ei_next (&ei);
    2730                 :         376 :           continue;
    2731                 :             :         }
    2732                 :             : 
    2733                 :         183 :       if (dump_file)
    2734                 :           0 :         fprintf (dump_file, "Duplicating computed goto bb %d into bb %d\n",
    2735                 :           0 :                  bb->index, e->src->index);
    2736                 :             : 
    2737                 :             :       /* Remember if PRED can be duplicated; if so, the copy of BB merged
    2738                 :             :          with PRED can be duplicated as well.  */
    2739                 :         183 :       bool can_dup_more = can_duplicate_block_p (pred);
    2740                 :             : 
    2741                 :             :       /* Make a copy of BB, merge it into PRED.  */
    2742                 :         183 :       basic_block copy = duplicate_block (bb, e, NULL);
    2743                 :         183 :       emit_barrier_after_bb (copy);
    2744                 :         183 :       reorder_insns_nobb (BB_HEAD (copy), BB_END (copy), BB_END (pred));
    2745                 :         183 :       merge_blocks (pred, copy);
    2746                 :             : 
    2747                 :         183 :       changed = true;
    2748                 :             : 
    2749                 :             :       /* Try to merge the resulting merged PRED into further predecessors.  */
    2750                 :         183 :       if (can_dup_more)
    2751                 :         183 :         maybe_duplicate_computed_goto (pred, max_size);
    2752                 :             :     }
    2753                 :             : 
    2754                 :             :   return changed;
    2755                 :             : }
    2756                 :             : 
    2757                 :             : /* Duplicate the blocks containing computed gotos.  This basically unfactors
    2758                 :             :    computed gotos that were factored early on in the compilation process to
    2759                 :             :    speed up edge based data flow.  We used to not unfactor them again, which
    2760                 :             :    can seriously pessimize code with many computed jumps in the source code,
    2761                 :             :    such as interpreters.  See e.g. PR15242.  */
    2762                 :             : static void
    2763                 :      845499 : duplicate_computed_gotos (function *fun)
    2764                 :             : {
    2765                 :             :   /* We estimate the length of the uncond jump insn only once,
    2766                 :             :      since the code for getting the insn length always returns
    2767                 :             :      the minimal length now.  */
    2768                 :      845499 :   if (uncond_jump_length == 0)
    2769                 :      101976 :     uncond_jump_length = get_uncond_jump_length ();
    2770                 :             : 
    2771                 :             :   /* Never copy a block larger than this.  */
    2772                 :      845499 :   int max_size
    2773                 :      845499 :     = uncond_jump_length * param_max_goto_duplication_insns;
    2774                 :             : 
    2775                 :      845499 :   bool changed = false;
    2776                 :             : 
    2777                 :             :   /* Try to duplicate all blocks that end in a computed jump and that
    2778                 :             :      can be duplicated at all.  */
    2779                 :      845499 :   basic_block bb;
    2780                 :    10127438 :   FOR_EACH_BB_FN (bb, fun)
    2781                 :     9281939 :     if (computed_jump_p (BB_END (bb)) && can_duplicate_block_p (bb))
    2782                 :         702 :       changed |= maybe_duplicate_computed_goto (bb, max_size);
    2783                 :             : 
    2784                 :             :   /* Some blocks may have become unreachable.  */
    2785                 :      845499 :   if (changed)
    2786                 :          83 :     cleanup_cfg (0);
    2787                 :             : 
    2788                 :             :   /* Duplicating blocks will redirect edges and may cause hot blocks
    2789                 :             :      previously reached by both hot and cold blocks to become dominated
    2790                 :             :      only by cold blocks.  */
    2791                 :          83 :   if (changed)
    2792                 :          83 :     fixup_partitions ();
    2793                 :      845499 : }
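                         :             :
                         :             : /* Editor's illustration (standalone sketch, not part of GCC) of the kind
                         :             :    of source duplicate_computed_gotos above targets: a threaded
                         :             :    interpreter written with the GNU "labels as values" extension, where
                         :             :    every handler ends in the same computed goto.  Earlier in compilation
                         :             :    those gotos are factored into one shared dispatch block; this pass
                         :             :    duplicates that block back into each handler so each opcode keeps its
                         :             :    own indirect branch, which tends to predict much better (see PR15242).
                         :             :    The opcodes are made up; build with gcc or g++ (the && label operator
                         :             :    is a GNU extension).  */
                         :             :
                         :             : #include <stdio.h>
                         :             :
                         :             : static long
                         :             : run (const unsigned char *pc)
                         :             : {
                         :             :   long acc = 0;
                         :             :   /* One entry per opcode; &&label takes the address of a label.  */
                         :             :   static const void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
                         :             :
                         :             :   goto *dispatch[*pc++];                /* initial dispatch */
                         :             :
                         :             :  op_inc:
                         :             :   acc += 1;
                         :             :   goto *dispatch[*pc++];                /* the per-handler computed goto
                         :             :                                            that this pass duplicates */
                         :             :  op_dbl:
                         :             :   acc *= 2;
                         :             :   goto *dispatch[*pc++];
                         :             :
                         :             :  op_halt:
                         :             :   return acc;
                         :             : }
                         :             :
                         :             : int
                         :             : main (void)
                         :             : {
                         :             :   static const unsigned char prog[] = { 1, 1, 2, 1, 0 };  /* inc inc dbl inc halt */
                         :             :   printf ("%ld\n", run (prog));         /* prints 5 */
                         :             :   return 0;
                         :             : }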
    2794                 :             : 
    2795                 :             : namespace {
    2796                 :             : 
    2797                 :             : const pass_data pass_data_duplicate_computed_gotos =
    2798                 :             : {
    2799                 :             :   RTL_PASS, /* type */
    2800                 :             :   "compgotos", /* name */
    2801                 :             :   OPTGROUP_NONE, /* optinfo_flags */
    2802                 :             :   TV_REORDER_BLOCKS, /* tv_id */
    2803                 :             :   0, /* properties_required */
    2804                 :             :   0, /* properties_provided */
    2805                 :             :   0, /* properties_destroyed */
    2806                 :             :   0, /* todo_flags_start */
    2807                 :             :   0, /* todo_flags_finish */
    2808                 :             : };
    2809                 :             : 
    2810                 :             : class pass_duplicate_computed_gotos : public rtl_opt_pass
    2811                 :             : {
    2812                 :             : public:
    2813                 :      285617 :   pass_duplicate_computed_gotos (gcc::context *ctxt)
    2814                 :      571234 :     : rtl_opt_pass (pass_data_duplicate_computed_gotos, ctxt)
    2815                 :             :   {}
    2816                 :             : 
    2817                 :             :   /* opt_pass methods: */
    2818                 :             :   bool gate (function *) final override;
    2819                 :             :   unsigned int execute (function *) final override;
    2820                 :             : 
    2821                 :             : }; // class pass_duplicate_computed_gotos
    2822                 :             : 
    2823                 :             : bool
    2824                 :     1419745 : pass_duplicate_computed_gotos::gate (function *fun)
    2825                 :             : {
    2826                 :     1419745 :   if (targetm.cannot_modify_jumps_p ())
    2827                 :             :     return false;
    2828                 :     1419745 :   return (optimize > 0
    2829                 :      977863 :           && flag_expensive_optimizations
    2830                 :     2324484 :           && ! optimize_function_for_size_p (fun));
    2831                 :             : }
    2832                 :             : 
    2833                 :             : unsigned int
    2834                 :      845499 : pass_duplicate_computed_gotos::execute (function *fun)
    2835                 :             : {
    2836                 :      845499 :   duplicate_computed_gotos (fun);
    2837                 :             : 
    2838                 :      845499 :   return 0;
    2839                 :             : }
    2840                 :             : 
    2841                 :             : } // anon namespace
    2842                 :             : 
    2843                 :             : rtl_opt_pass *
    2844                 :      285617 : make_pass_duplicate_computed_gotos (gcc::context *ctxt)
    2845                 :             : {
    2846                 :      285617 :   return new pass_duplicate_computed_gotos (ctxt);
    2847                 :             : }
    2848                 :             : 
    2849                 :             : /* This function is the main 'entrance' for the optimization that
    2850                 :             :    partitions hot and cold basic blocks into separate sections of the
    2851                 :             :    .o file (to improve performance and cache locality).  Ideally it
    2852                 :             :    would be called after all optimizations that rearrange the CFG have
    2853                 :             :    been called.  However part of this optimization may introduce new
    2854                 :             :    register usage, so it must be called before register allocation has
    2855                 :             :    occurred.  This means that this optimization is actually called
    2856                 :             :    well before the optimization that reorders basic blocks (see
    2857                 :             :    function above).
    2858                 :             : 
    2859                 :             :    This optimization checks the feedback information to determine
    2860                 :             :    which basic blocks are hot/cold, and updates flags on the basic
    2861                 :             :    blocks to indicate which section they belong in.  This information is
    2862                 :             :    later used for writing out sections in the .o file.  Because hot
    2863                 :             :    and cold sections can be arbitrarily large (within the bounds of
    2864                 :             :    memory), far beyond the size of a single function, it is necessary
    2865                 :             :    to fix up all edges that cross section boundaries, to make sure the
    2866                 :             :    instructions used can actually span the required distance.  The
    2867                 :             :    fixes are described below.
    2868                 :             : 
    2869                 :             :    Fall-through edges must be changed into jumps; it is not safe or
    2870                 :             :    legal to fall through across a section boundary.  Whenever a
    2871                 :             :    fall-through edge crossing a section boundary is encountered, a new
    2872                 :             :    basic block is inserted (in the same section as the fall-through
    2873                 :             :    source), and the fall through edge is redirected to the new basic
    2874                 :             :    block.  The new basic block contains an unconditional jump to the
    2875                 :             :    original fall-through target.  (If the unconditional jump is
    2876                 :             :    insufficient to cross section boundaries, that is dealt with a
    2877                 :             :    little later, see below).
    2878                 :             : 
    2879                 :             :    In order to deal with architectures that have short conditional
    2880                 :             :    branches (which cannot span all of memory) we take any conditional
    2881                 :             :    jump that attempts to cross a section boundary and add a level of
    2882                 :             :    indirection: it becomes a conditional jump to a new basic block, in
    2883                 :             :    the same section.  The new basic block contains an unconditional
    2884                 :             :    jump to the original target, in the other section.
    2885                 :             : 
    2886                 :             :    For those architectures whose unconditional branch is also
    2887                 :             :    incapable of reaching all of memory, those unconditional jumps are
    2888                 :             :    converted into indirect jumps, through a register.
    2889                 :             : 
    2890                 :             :    IMPORTANT NOTE: This optimization causes some messy interactions
    2891                 :             :    with the cfg cleanup optimizations; those optimizations want to
    2892                 :             :    merge blocks wherever possible, and to collapse indirect jump
    2893                 :             :    sequences (change "A jumps to B jumps to C" directly into "A jumps
    2894                 :             :    to C").  Those optimizations can undo the jump fixes that
    2895                 :             :    partitioning is required to make (see above), in order to ensure
    2896                 :             :    that jumps attempting to cross section boundaries are really able
    2897                 :             :    to cover whatever distance the jump requires (on many architectures
    2898                 :             :    conditional or unconditional jumps are not able to reach all of
    2899                 :             :    memory).  Therefore tests have to be inserted into each such
    2900                 :             :    optimization to make sure that it does not undo stuff necessary to
    2901                 :             :    cross partition boundaries.  This would be much less of a problem
    2902                 :             :    if we could perform this optimization later in the compilation, but
    2903                 :             :    unfortunately the fact that we may need to create indirect jumps
    2904                 :             :    (through registers) requires that this optimization be performed
    2905                 :             :    before register allocation.
    2906                 :             : 
    2907                 :             :    Hot and cold basic blocks are partitioned and put in separate
    2908                 :             :    sections of the .o file, to reduce paging and improve cache
    2909                 :             :    performance (hopefully).  This can result in bits of code from the
    2910                 :             :    same function being widely separated in the .o file.  However this
    2911                 :             :    is not reflected in the current bb structure.  Therefore we must take
    2912                 :             :    care to ensure that: 1). There are no fall_thru edges that cross
    2913                 :             :    between sections; 2). For those architectures which have "short"
    2914                 :             :    conditional branches, all conditional branches that attempt to
    2915                 :             :    cross between sections are converted to unconditional branches;
    2916                 :             :    and, 3). For those architectures which have "short" unconditional
    2917                 :             :    branches, all unconditional branches that attempt to cross between
    2918                 :             :    sections are converted to indirect jumps.
    2919                 :             : 
    2920                 :             :    The code for fixing up fall_thru edges that cross between hot and
    2921                 :             :    cold basic blocks does so by creating new basic blocks containing
    2922                 :             :    unconditional branches to the appropriate label in the "other"
    2923                 :             :    section.  The new basic block is then put in the same (hot or cold)
    2924                 :             :    section as the original conditional branch, and the fall_thru edge
    2925                 :             :    is modified to fall into the new basic block instead.  By adding
    2926                 :             :    this level of indirection we end up with only unconditional branches
    2927                 :             :    crossing between hot and cold sections.
    2928                 :             : 
    2929                 :             :    Conditional branches are dealt with by adding a level of indirection.
    2930                 :             :    A new basic block is added in the same (hot/cold) section as the
    2931                 :             :    conditional branch, and the conditional branch is retargeted to the
    2932                 :             :    new basic block.  The new basic block contains an unconditional branch
    2933                 :             :    to the original target of the conditional branch (in the other section).
    2934                 :             : 
    2935                 :             :    Unconditional branches are dealt with by converting them into
    2936                 :             :    indirect jumps.  */
    2937                 :             : 
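                         :             :
                         :             : /* Illustration (editor's sketch, target-neutral pseudo-assembly) of the
                         :             :    conditional-branch fix described above.  Before fix-up, a hot-section
                         :             :    conditional branch may target a cold-section label directly:
                         :             :
                         :             :        hot:    beq   x, 0, .Lcold_handler   ; may not reach the cold section
                         :             :
                         :             :    After fix-up, the conditional branch stays short and local, and only an
                         :             :    unconditional jump crosses the section boundary:
                         :             :
                         :             :        hot:    beq   x, 0, .Lnew            ; short, same section
                         :             :                ...
                         :             :        .Lnew:  jmp   .Lcold_handler         ; crossing, unconditional  */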
    2938                 :             : namespace {
    2939                 :             : 
    2940                 :             : const pass_data pass_data_partition_blocks =
    2941                 :             : {
    2942                 :             :   RTL_PASS, /* type */
    2943                 :             :   "bbpart", /* name */
    2944                 :             :   OPTGROUP_NONE, /* optinfo_flags */
    2945                 :             :   TV_REORDER_BLOCKS, /* tv_id */
    2946                 :             :   PROP_cfglayout, /* properties_required */
    2947                 :             :   0, /* properties_provided */
    2948                 :             :   0, /* properties_destroyed */
    2949                 :             :   0, /* todo_flags_start */
    2950                 :             :   0, /* todo_flags_finish */
    2951                 :             : };
    2952                 :             : 
    2953                 :             : class pass_partition_blocks : public rtl_opt_pass
    2954                 :             : {
    2955                 :             : public:
    2956                 :      285617 :   pass_partition_blocks (gcc::context *ctxt)
    2957                 :      571234 :     : rtl_opt_pass (pass_data_partition_blocks, ctxt)
    2958                 :             :   {}
    2959                 :             : 
    2960                 :             :   /* opt_pass methods: */
    2961                 :             :   bool gate (function *) final override;
    2962                 :             :   unsigned int execute (function *) final override;
    2963                 :             : 
    2964                 :             : }; // class pass_partition_blocks
    2965                 :             : 
    2966                 :             : bool
    2967                 :     1419745 : pass_partition_blocks::gate (function *fun)
    2968                 :             : {
    2969                 :             :   /* The optimization to partition hot/cold basic blocks into separate
    2970                 :             :      sections of the .o file does not work well with linkonce, with
    2971                 :             :      user-defined section attributes, or with the naked attribute.  Don't
    2972                 :             :      call it if any of these cases arises.  */
    2973                 :     1419745 :   return (flag_reorder_blocks_and_partition
    2974                 :      664392 :           && optimize
    2975                 :             :           /* See pass_reorder_blocks::gate.  We should not partition if
    2976                 :             :              we are going to omit the reordering.  */
    2977                 :      664354 :           && optimize_function_for_speed_p (fun)
    2978                 :      620653 :           && !DECL_COMDAT_GROUP (current_function_decl)
    2979                 :      529479 :           && !lookup_attribute ("section", DECL_ATTRIBUTES (fun->decl))
    2980                 :      529423 :           && !lookup_attribute ("naked", DECL_ATTRIBUTES (fun->decl))
    2981                 :             :           /* Workaround a bug in GDB where read_partial_die doesn't cope
    2982                 :             :              with DIEs with DW_AT_ranges, see PR81115.  */
    2983                 :     1949116 :           && !(in_lto_p && MAIN_NAME_P (DECL_NAME (fun->decl))));
    2984                 :             : }
    2985                 :             : 
    2986                 :             : unsigned
    2987                 :      520595 : pass_partition_blocks::execute (function *fun)
    2988                 :             : {
    2989                 :      520595 :   vec<edge> crossing_edges;
    2990                 :             : 
    2991                 :      520595 :   if (n_basic_blocks_for_fn (fun) <= NUM_FIXED_BLOCKS + 1)
    2992                 :             :     return 0;
    2993                 :             : 
    2994                 :      251771 :   df_set_flags (DF_DEFER_INSN_RESCAN);
    2995                 :             : 
    2996                 :      251771 :   crossing_edges = find_rarely_executed_basic_blocks_and_crossing_edges ();
    2997                 :      251771 :   if (!crossing_edges.exists ())
    2998                 :             :     /* Make sure to process deferred rescans and clear changeable df flags.  */
    2999                 :             :     return TODO_df_finish;
    3000                 :             : 
    3001                 :       59813 :   crtl->has_bb_partition = true;
    3002                 :             : 
    3003                 :             :   /* Make sure the source of any crossing edge ends in a jump and the
    3004                 :             :      destination of any crossing edge has a label.  */
    3005                 :       59813 :   add_labels_and_missing_jumps (crossing_edges);
    3006                 :             : 
     3007                 :             :   /* Convert all crossing fall_thru edges into unconditional jumps
     3008                 :             :      to the original fall-through destination, so that no crossing
     3009                 :             :      edge remains a fall-thru.  */
    3010                 :       59813 :   fix_up_fall_thru_edges ();
    3011                 :             : 
    3012                 :             :   /* If the architecture does not have conditional branches that can
    3013                 :             :      span all of memory, convert crossing conditional branches into
    3014                 :             :      crossing unconditional branches.  */
    3015                 :       59813 :   if (!HAS_LONG_COND_BRANCH)
    3016                 :             :     fix_crossing_conditional_branches ();
    3017                 :             : 
    3018                 :             :   /* If the architecture does not have unconditional branches that
    3019                 :             :      can span all of memory, convert crossing unconditional branches
    3020                 :             :      into indirect jumps.  Since adding an indirect jump also adds
    3021                 :             :      a new register usage, update the register usage information as
    3022                 :             :      well.  */
    3023                 :       59813 :   if (!HAS_LONG_UNCOND_BRANCH)
    3024                 :             :     fix_crossing_unconditional_branches ();
    3025                 :             : 
    3026                 :       59813 :   update_crossing_jump_flags ();
    3027                 :             : 
    3028                 :             :   /* Clear bb->aux fields that the above routines were using.  */
    3029                 :       59813 :   clear_aux_for_blocks ();
    3030                 :             : 
    3031                 :       59813 :   crossing_edges.release ();
    3032                 :             : 
    3033                 :             :   /* ??? FIXME: DF generates the bb info for a block immediately.
    3034                 :             :      And by immediately, I mean *during* creation of the block.
    3035                 :             : 
    3036                 :             :         #0  df_bb_refs_collect
    3037                 :             :         #1  in df_bb_refs_record
    3038                 :             :         #2  in create_basic_block_structure
    3039                 :             : 
    3040                 :             :      Which means that the bb_has_eh_pred test in df_bb_refs_collect
    3041                 :             :      will *always* fail, because no edges can have been added to the
    3042                 :             :      block yet.  Which of course means we don't add the right 
    3043                 :             :      artificial refs, which means we fail df_verify (much) later.
    3044                 :             : 
    3045                 :             :      Cleanest solution would seem to make DF_DEFER_INSN_RESCAN imply
    3046                 :             :      that we also shouldn't grab data from the new blocks those new
    3047                 :             :      insns are in either.  In this way one can create the block, link
    3048                 :             :      it up properly, and have everything Just Work later, when deferred
    3049                 :             :      insns are processed.
    3050                 :             : 
    3051                 :             :      In the meantime, we have no other option but to throw away all
    3052                 :             :      of the DF data and recompute it all.  */
    3053                 :       59813 :   if (fun->eh->lp_array)
    3054                 :             :     {
    3055                 :       59813 :       df_finish_pass (true);
    3056                 :       59813 :       df_scan_alloc (NULL);
    3057                 :       59813 :       df_scan_blocks ();
    3058                 :             :       /* Not all post-landing pads use all of the EH_RETURN_DATA_REGNO
    3059                 :             :          data.  We blindly generated all of them when creating the new
    3060                 :             :          landing pad.  Delete those assignments we don't use.  */
    3061                 :       59813 :       df_set_flags (DF_LR_RUN_DCE);
    3062                 :       59813 :       df_analyze ();
    3063                 :             :     }
    3064                 :             : 
    3065                 :             :   /* Make sure to process deferred rescans and clear changeable df flags.  */
    3066                 :             :   return TODO_df_finish;
    3067                 :             : }
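The effect of execute is visible in the generated assembly: once crtl->has_bb_partition is set and the crossing edges have been fixed up, blocks in the cold partition are emitted into a separate section (.text.unlikely on typical ELF targets).  A hedged user-level sketch, assuming x86_64/ELF and that the error path is recognized as cold (profile feedback via -fprofile-use makes this far more reliable than static prediction):

    extern void fatal_error (const char *) __attribute__ ((cold, noreturn));

    int checked_div (int a, int b)
    {
      if (__builtin_expect (b == 0, 0))
        /* Cold path: a candidate for the cold partition.  */
        fatal_error ("division by zero");
      return a / b;   /* Hot path stays in the regular text section.  */
    }

Compiling with something like

    gcc -O2 -freorder-blocks-and-partition -S checked_div.c

should show the block containing the fatal_error call emitted under a .text.unlikely (or similarly named) section directive, separate from the rest of checked_div.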
    3068                 :             : 
    3069                 :             : } // anon namespace
    3070                 :             : 
    3071                 :             : rtl_opt_pass *
    3072                 :      285617 : make_pass_partition_blocks (gcc::context *ctxt)
    3073                 :             : {
    3074                 :      285617 :   return new pass_partition_blocks (ctxt);
    3075                 :             : }
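This factory is how the pass manager obtains an instance of the pass: the object is heap-allocated here and then owned by the pass manager, which is why a bare pointer is returned.  The pass itself is scheduled from gcc/passes.def, roughly as follows (surrounding entries elided; see passes.def for the authoritative ordering):

    NEXT_PASS (pass_partition_blocks);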
        

Generated by: LCOV version 2.0-1

The LCOV profile was generated on an x86_64 machine using the following configure options: configure --disable-bootstrap --enable-coverage=opt --enable-languages=c,c++,fortran,go,jit,lto,rust,m2 --enable-host-shared.  The GCC test suite was run with the built compiler.