GCC Middle and Back End API Reference
pointer_analysis Namespace Reference

Data Structures

struct  constraint
struct  constraint_expr
struct  constraint_stats
struct  variable_info

Typedefs

typedef struct constraint_expr ce_s
typedef struct constraintconstraint_t
typedef struct variable_infovarinfo_t

Enumerations

enum  constraint_expr_type { SCALAR , DEREF , ADDRESSOF }
enum  {
  nothing_id = 1 , anything_id = 2 , string_id = 3 , escaped_id = 4 ,
  nonlocal_id = 5 , escaped_return_id = 6 , storedanything_id = 7 , integer_id = 8
}

Functions

void solve_constraints (void)
varinfo_t first_vi_for_offset (varinfo_t start, unsigned HOST_WIDE_INT offset)
varinfo_t first_or_preceding_vi_for_offset (varinfo_t start, unsigned HOST_WIDE_INT offset)
void dump_constraint (FILE *file, constraint_t c)
DEBUG_FUNCTION void debug_constraint (constraint_t c)
void dump_constraints (FILE *file, int from)
DEBUG_FUNCTION void debug_constraints (void)
void dump_solution_for_var (FILE *file, unsigned int var)
DEBUG_FUNCTION void debug_solution_for_var (unsigned int var)
void dump_sa_stats (FILE *outfile)
void dump_sa_points_to_info (FILE *outfile)
DEBUG_FUNCTION void debug_sa_points_to_info (void)
void dump_varinfo (FILE *file, varinfo_t vi)
DEBUG_FUNCTION void debug_varinfo (varinfo_t vi)
void dump_varmap (FILE *file)
DEBUG_FUNCTION void debug_varmap (void)
varinfo_t get_varinfo (unsigned int n)
varinfo_t vi_next (varinfo_t vi)

Variables

bitmap_obstack pta_obstack
bitmap_obstack oldpta_obstack
vec< varinfo_tvarmap
vec< constraint_tconstraints
static hash_map< tree, varinfo_t > * vi_for_tree
unsigned int * var_rep
struct constraint_stats stats

Detailed Description

Andersen-style solver for tree based points-to analysis
Copyright (C) 2005-2025 Free Software Foundation, Inc.
Contributed by Daniel Berlin <dberlin@dberlin.org>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.   
Tree based points-to analysis
Copyright (C) 2005-2025 Free Software Foundation, Inc.
Contributed by Daniel Berlin <dberlin@dberlin.org>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.   
The idea behind this analyzer is to generate set constraints from the
 program, then solve the resulting constraints in order to generate the
 points-to sets.

 Set constraints are a way of modeling program analysis problems that
 involve sets.  They consist of an inclusion constraint language,
 describing the variables (each variable is a set) and operations that
 are involved on the variables, and a set of rules that derive facts
 from these operations.  To solve a system of set constraints, you derive
 all possible facts under the rules, which gives you the correct sets
 as a consequence.

 See  "Efficient Field-sensitive pointer analysis for C" by "David
 J. Pearce and Paul H. J. Kelly and Chris Hankin", at
 http://citeseer.ist.psu.edu/pearce04efficient.html

 Also see "Ultra-fast Aliasing Analysis using CLA: A Million Lines
 of C Code in a Second" by "Nevin Heintze and Olivier Tardieu" at
 http://citeseer.ist.psu.edu/heintze01ultrafast.html

 There are three types of real constraint expressions, DEREF,
 ADDRESSOF, and SCALAR.  Each constraint expression consists
 of a constraint type, a variable, and an offset.

 SCALAR is a constraint expression type used to represent x, whether
 it appears on the LHS or the RHS of a statement.
 DEREF is a constraint expression type used to represent *x, whether
 it appears on the LHS or the RHS of a statement.
 ADDRESSOF is a constraint expression used to represent &x, whether
 it appears on the LHS or the RHS of a statement.

 Each pointer variable in the program is assigned an integer id, and
 each field of a structure variable is assigned an integer id as well.

 Structure variables are linked to their list of fields through a "next
 field" in each variable that points to the next field in offset
 order.
 Each variable for a structure field has

 1. "size", that tells the size in bits of that field.
 2. "fullsize", that tells the size in bits of the entire structure.
 3. "offset", that tells the offset in bits from the beginning of the
 structure to this field.

 Thus,
 struct f
 {
   int a;
   int b;
 } foo;
 int *bar;

 looks like

 foo.a -> id 1, size 32, offset 0, fullsize 64, next foo.b
 foo.b -> id 2, size 32, offset 32, fullsize 64, next NULL
 bar -> id 3, size 32, offset 0, fullsize 32, next NULL


In order to solve the system of set constraints, the following is
done:

1. Each constraint variable x has a solution set associated with it,
Sol(x).

2. Constraints are separated into direct, copy, and complex.
Direct constraints are ADDRESSOF constraints that require no extra
processing, such as P = &Q
Copy constraints are those of the form P = Q.
Complex constraints are all the constraints involving dereferences
and offsets (including offsetted copies).

3. All direct constraints of the form P = &Q are processed, such
that Q is added to Sol(P)

4. All complex constraints for a given constraint variable are stored in a
linked list attached to that variable's node.

5. A directed graph is built out of the copy constraints. Each
constraint variable is a node in the graph, and an edge from
Q to P is added for each copy constraint of the form P = Q

6. The graph is then walked, and solution sets are
propagated along the copy edges, such that an edge from Q to P
causes Sol(P) <- Sol(P) union Sol(Q).

7.  As we visit each node, all complex constraints associated with
that node are processed by adding appropriate copy edges to the graph, or the
appropriate variables to the solution set.

8. The process of walking the graph is iterated until no solution
sets change.

Prior to walking the graph in steps 6 and 7, We perform static
cycle elimination on the constraint graph, as well
as off-line variable substitution.

TODO: Adding offsets to pointer-to-structures can be handled (IE not punted
on and turned into anything), but isn't.  You can just see what offset
inside the pointed-to struct it's going to access.

TODO: Constant bounded arrays can be handled as if they were structs of the
same number of elements.

TODO: Modeling heap and incoming pointers becomes much better if we
add fields to them as we discover them, which we could do.

TODO: We could handle unions, but to be honest, it's probably not
worth the pain or slowdown.   
IPA-PTA optimizations possible.

When the indirect function called is ANYTHING we can add disambiguation
based on the function signatures (or simply the parameter count which
is the varinfo size).  We also do not need to consider functions that
do not have their address taken.

The is_global_var bit which marks escape points is overly conservative
in IPA mode.  Split it to is_escape_point and is_global_var - only
externally visible globals are escape points in IPA mode.
There is now is_ipa_escape_point but this is only used in a few
selected places.

The way we introduce DECL_PT_UID to avoid fixing up all points-to
sets in the translation unit when we copy a DECL during inlining
pessimizes precision.  The advantage is that the DECL_PT_UID keeps
compile-time and memory usage overhead low - the points-to sets
do not grow or get unshared as they would during a fixup phase.
An alternative solution is to delay IPA PTA until after all
inlining transformations have been applied.

The way we propagate clobber/use information isn't optimized.
It should use a new complex constraint that properly filters
out local variables of the callee (though that would make
the sets invalid after inlining).  OTOH we might as well
admit defeat to WHOPR and simply do all the clobber/use analysis
and propagation after PTA finished but before we threw away
points-to information for memory variables.  WHOPR and PTA
do not play along well anyway - the whole constraint solving
would need to be done in WPA phase and it will be very interesting
to apply the results to local SSA names during LTRANS phase.

We probably should compute a per-function unit-ESCAPE solution
propagating it simply like the clobber / uses solutions.  The
solution can go alongside the non-IPA escaped solution and be
used to query which vars escape the unit through a function.
This is also required to make the escaped-HEAP trick work in IPA mode.

We never put function decls in points-to sets so we do not
keep the set of called functions for indirect calls.

And probably more.   
Tree based points-to analysis
Copyright (C) 2005-2025 Free Software Foundation, Inc.
Contributed by Daniel Berlin <dberlin@dberlin.org>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.   
NOTE: This file declares the internal interface of the points-to analyzer.
Outward-facing function declarations can be found in tree-ssa-alias.h.   

Typedef Documentation

◆ ce_s

◆ constraint_t

◆ varinfo_t

Enumeration Type Documentation

◆ anonymous enum

anonymous enum
Static IDs for the special variables.  Variable ID zero is unused
and used as terminator for the sub-variable chain.   
Enumerator
nothing_id 
anything_id 
string_id 
escaped_id 
nonlocal_id 
escaped_return_id 
storedanything_id 
integer_id 

◆ constraint_expr_type

Enumerator
SCALAR 
DEREF 
ADDRESSOF 

Function Documentation

◆ debug_constraint()

void pointer_analysis::debug_constraint ( constraint_t c)
Print out constraint C to stderr.   

References dump_constraint().

◆ debug_constraints()

void pointer_analysis::debug_constraints ( void )
Print out all constraints to stderr.   

References dump_constraints().

◆ debug_sa_points_to_info()

void pointer_analysis::debug_sa_points_to_info ( void )
Debug points-to information to stderr.   

References dump_sa_points_to_info().

◆ debug_solution_for_var()

void pointer_analysis::debug_solution_for_var ( unsigned int var)
Print the points-to solution for VAR to stderr.   

References dump_solution_for_var().

◆ debug_varinfo()

void pointer_analysis::debug_varinfo ( varinfo_t vi)
Dump varinfo VI to stderr.   

References dump_varinfo().

◆ debug_varmap()

void pointer_analysis::debug_varmap ( void )
Dump varmap to stderr.   

References dump_varmap().

◆ dump_constraint()

void pointer_analysis::dump_constraint ( FILE * file,
constraint_t c )

◆ dump_constraints()

void pointer_analysis::dump_constraints ( FILE * file,
int from )
Print out all constraints to FILE.   

References constraints, dump_constraint(), and i.

Referenced by compute_points_to_sets(), debug_constraints(), and ipa_pta_execute().

◆ dump_sa_points_to_info()

void pointer_analysis::dump_sa_points_to_info ( FILE * outfile)

◆ dump_sa_stats()

void pointer_analysis::dump_sa_stats ( FILE * outfile)
Dump stats information to OUTFILE.   

References stats.

Referenced by compute_points_to_sets(), and ipa_pta_execute().

◆ dump_solution_for_var()

void pointer_analysis::dump_solution_for_var ( FILE * file,
unsigned int var )

◆ dump_varinfo()

◆ dump_varmap()

void pointer_analysis::dump_varmap ( FILE * file)
Dump varmap to FILE.   

References dump_varinfo(), get_varinfo(), i, and varmap.

Referenced by debug_varmap().

◆ first_or_preceding_vi_for_offset()

varinfo_t pointer_analysis::first_or_preceding_vi_for_offset ( varinfo_t start,
unsigned HOST_WIDE_INT offset )
Find the first varinfo in the same variable as START that overlaps with
OFFSET.  If there is no such varinfo the varinfo directly preceding
OFFSET is returned.   

References get_varinfo(), and vi_next().

Referenced by do_ds_constraint(), do_sd_constraint(), get_constraint_for_ptr_offset(), and set_union_with_increment().

◆ first_vi_for_offset()

varinfo_t pointer_analysis::first_vi_for_offset ( varinfo_t start,
unsigned HOST_WIDE_INT offset )
Find the first varinfo in the same variable as START that overlaps with
OFFSET.  Return NULL if we can't find one.   

References get_varinfo(), NULL, and vi_next().

Referenced by create_function_info_for(), find_func_clobbers(), get_function_part_constraint(), and ipa_pta_execute().

◆ get_varinfo()

◆ solve_constraints()

◆ vi_next()

Variable Documentation

◆ constraints

◆ oldpta_obstack

bitmap_obstack pointer_analysis::oldpta_obstack
Used for oldsolution members of variables.   

Referenced by init_alias_vars(), and solve_graph().

◆ pta_obstack

bitmap_obstack pointer_analysis::pta_obstack

◆ stats

struct constraint_stats pointer_analysis::stats

Referenced by dump_sa_stats().

◆ var_rep

unsigned int * pointer_analysis::var_rep
The representative variable for a variable.  The points-to solution for a
var can be found in its rep.  Trivially, a var can be its own rep.

The solver provides this array once it is done solving.   

Referenced by compute_dependence_clique(), delete_points_to_sets(), dump_solution_for_var(), find_what_var_points_to(), ipa_pta_execute(), set_uids_in_ptset(), solve_constraints(), and visit_loadstore().

◆ varmap

vec< varinfo_t > pointer_analysis::varmap

◆ vi_for_tree

hash_map<tree, varinfo_t>* pointer_analysis::vi_for_tree
static