博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
PostgreSQL 统计信息pg_statistic格式及导入导出dump_stat - 兼容Oracle
阅读量:5951 次
发布时间:2019-06-19

本文共 31163 字,大约阅读时间需要 103 分钟。

标签

PostgreSQL , dump_stat , 统计信息 , 导出导入


背景

EXPLAIN是PG数据库用于输出SQL执行计划的语法,

1、生成的执行计划中包含COST一项。

如果校准了成本因子,COST可以和SQL实际执行时间对其。因子校对的方法如下,实际上每一种硬件,我们只需要校对一遍即可。

校对因子如下:

#seq_page_cost = 1.0                    # measured on an arbitrary scale    random_page_cost = 1.2                  # same scale as above    #cpu_tuple_cost = 0.01                  # same scale as above    #cpu_index_tuple_cost = 0.005           # same scale as above    #cpu_operator_cost = 0.0025             # same scale as above        # 以下不需要校对, 不考虑并行计算SQL    parallel_tuple_cost = 0.1               # same scale as above    parallel_setup_cost = 1000.0            # same scale as above    effective_cache_size = 10GB

2、评估COST还需要依赖统计信息柱状图:

涉及reltuples, relpages. 表示评估的记录数以及占用多少个数据块。注意源头的block_size可能和PG的不一致,占用多少个块需要转换一下。(show block_size可以查看数据块大小。)       postgres=# \d pg_class                         Table "pg_catalog.pg_class"           Column        |     Type     | Collation | Nullable | Default     ---------------------+--------------+-----------+----------+---------     relname             | name         |           | not null |   -- 对象名   relnamespace        | oid          |           | not null |   -- 对象所属的schema, 对应pg_namespace.oid   reltype             | oid          |           | not null |      reloftype           | oid          |           | not null |      relowner            | oid          |           | not null |      relam               | oid          |           | not null |      relfilenode         | oid          |           | not null |      reltablespace       | oid          |           | not null |      relpages            | integer      |           | not null |   -- 评估的页数(单位为block_size)   reltuples           | real         |           | not null |   -- 评估的记录数   relallvisible       | integer      |           | not null |      reltoastrelid       | oid          |           | not null |      relhasindex         | boolean      |           | not null |      relisshared         | boolean      |           | not null |      relpersistence      | "char"       |           | not null |      relkind             | "char"       |           | not null |      relnatts            | smallint     |           | not null |      relchecks           | smallint     |           | not null |      relhasoids          | boolean      |           | not null |      relhaspkey          | boolean      |           | not null |      relhasrules         | boolean      |           | not null |      relhastriggers      | boolean      |           | not null |      relhassubclass      | boolean      |           | not null |      relrowsecurity      | boolean      |           | not null |      relforcerowsecurity | boolean      |           | not null |      relispopulated      | boolean      |           | not null |      relreplident        | "char"       |           | not null |      relispartition      | boolean      |           | not null |      relfrozenxid        | xid          |           | not null |      relminmxid          | xid          |           | not null |      relacl              | aclitem[]    |           |          |      reloptions          | text[]       |           |          |      relpartbound        | pg_node_tree |           |          |     Indexes:        "pg_class_oid_index" UNIQUE, btree (oid)        "pg_class_relname_nsp_index" UNIQUE, btree (relname, relnamespace)        "pg_class_tblspc_relfilenode_index" btree (reltablespace, relfilenode)        涉及   空值比例、平均列宽、唯一值比例或个数、高频值以及频率、柱状图分布、存储相关性、多值列(高频元素及比例、元素柱状图分布)。        -- 这个是视图:    postgres=# \d pg_stats                          View "pg_catalog.pg_stats"             Column         |   Type   | Collation | Nullable | Default     ------------------------+----------+-----------+----------+---------     schemaname             | name     |           |          |   -- 对象所属的schema   tablename              | name     |           |          |   -- 对象名   attname                | name     |           |          |   -- 列名   inherited              | boolean  |           |          |   -- 是否为继承表的统计信息(false时表示当前表的统计信息,true时表示包含所有继承表的统计信息)   null_frac              | real     |           |          |   -- 该列空值比例   avg_width              | integer  |           |          |   -- 该列平均长度   n_distinct             | real     |           |          |   -- 该列唯一值个数(-1表示唯一,小于1表示占比,大于等于1表示实际的唯一值个数)   most_common_vals       | anyarray |           |          |   -- 该列高频词   most_common_freqs      | real[]   |           |          |   -- 该列高频词对应的出现频率   histogram_bounds       | anyarray |           |          |   -- 该列柱状图(表示隔出的每个BUCKET的记录数均等)   correlation            | real     |           |          |   -- 该列存储相关性(-1到1的区间),绝对值越小,存储越离散。小于0表示反向相关,大于0表示正向相关   most_common_elems      | anyarray |           |          |   -- 该列为多值类型(数组)时,多值元素的高频词   most_common_elem_freqs | real[]   |           |          |   -- 多值元素高频词的出现频率   elem_count_histogram   | real[]   |           |          |   -- 多值元素的柱状图中,每个区间的非空唯一元素个数      -- 这个是实际存储的数据(也就是要导入的部分):  -- https://www.postgresql.org/docs/10/static/catalog-pg-statistic.html    postgres=# \d pg_statistic                 Table "pg_catalog.pg_statistic"       Column    |   Type   | Collation | Nullable | Default     -------------+----------+-----------+----------+---------     starelid    | oid      |           | not null |   -- 对象OID,对应pg_class.oid   staattnum   | smallint |           | not null |   -- 该列在表中的位置序号,对应pg_attribute.attnum   stainherit  | boolean  |           | not null |   -- 是否为继承表的统计信息(false时表示当前表的统计信息,true时表示包含所有继承表的统计信息)   stanullfrac | real     |           | not null |   -- 空值比例   stawidth    | integer  |           | not null |   -- 平均长度   stadistinct | real     |           | not null |   -- 唯一值个数、比例   stakind1    | smallint |           | not null |   -- 表示第1个SLOT的统计信息分类编号   stakind2    | smallint |           | not null |   -- 表示第2个SLOT的统计信息分类编号   stakind3    | smallint |           | not null |   -- 表示第3个SLOT的统计信息分类编号   stakind4    | smallint |           | not null |   -- 表示第4个SLOT的统计信息分类编号   stakind5    | smallint |           | not null |   -- 表示第5个SLOT的统计信息分类编号   staop1      | oid      |           | not null |   -- 表示第1个SLOT的统计信息是用哪个operator生成的(例如统计柱状图边界,需要用到 "<" 这个操作符)   staop2      | oid      |           | not null |   -- 表示第2个SLOT的统计信息是用哪个operator生成的(例如统计柱状图边界,需要用到 "<" 这个操作符)   staop3      | oid      |           | not null |   -- 表示第3个SLOT的统计信息是用哪个operator生成的(例如统计柱状图边界,需要用到 "<" 这个操作符)   staop4      | oid      |           | not null |   -- 表示第4个SLOT的统计信息是用哪个operator生成的(例如统计柱状图边界,需要用到 "<" 这个操作符)   staop5      | oid      |           | not null |   -- 表示第5个SLOT的统计信息是用哪个operator生成的(例如统计柱状图边界,需要用到 "<" 这个操作符)   stanumbers1 | real[]   |           |          |   -- 表示第1个SLOT的以numeric[]为结果的统计信息,NULL说明这个SLOT分类没有numeric的统计信息。   stanumbers2 | real[]   |           |          |   -- 表示第2个SLOT的以numeric[]为结果的统计信息,NULL说明这个SLOT分类没有numeric的统计信息。   stanumbers3 | real[]   |           |          |   -- 表示第3个SLOT的以numeric[]为结果的统计信息,NULL说明这个SLOT分类没有numeric的统计信息。   stanumbers4 | real[]   |           |          |   -- 表示第4个SLOT的以numeric[]为结果的统计信息,NULL说明这个SLOT分类没有numeric的统计信息。   stanumbers5 | real[]   |           |          |   -- 表示第5个SLOT的以numeric[]为结果的统计信息,NULL说明这个SLOT分类没有numeric的统计信息。   stavalues1  | anyarray |           |          |   -- 表示第1个SLOT的以anyarray[]为结果的统计信息,NULL说明这个SLOT分类没有anyarray[]统计信息。数组类型为列的元素类型,或者列本身的类型。   stavalues2  | anyarray |           |          |   -- 表示第2个SLOT的以anyarray[]为结果的统计信息,NULL说明这个SLOT分类没有anyarray[]统计信息。数组类型为列的元素类型,或者列本身的类型。   stavalues3  | anyarray |           |          |   -- 表示第3个SLOT的以anyarray[]为结果的统计信息,NULL说明这个SLOT分类没有anyarray[]统计信息。数组类型为列的元素类型,或者列本身的类型。   stavalues4  | anyarray |           |          |   -- 表示第4个SLOT的以anyarray[]为结果的统计信息,NULL说明这个SLOT分类没有anyarray[]统计信息。数组类型为列的元素类型,或者列本身的类型。   stavalues5  | anyarray |           |          |   -- 表示第5个SLOT的以anyarray[]为结果的统计信息,NULL说明这个SLOT分类没有anyarray[]统计信息。数组类型为列的元素类型,或者列本身的类型。  Indexes:        "pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum, stainherit)

statkind的定义

src/include/catalog/pg_statistic.h

/*   * Currently, five statistical slot "kinds" are defined by core PostgreSQL,   * as documented below.  Additional "kinds" will probably appear in   * future to help cope with non-scalar datatypes.  Also, custom data types   * can define their own "kind" codes by mutual agreement between a custom   * typanalyze routine and the selectivity estimation functions of the type's   * operators.   *   * Code reading the pg_statistic relation should not assume that a particular   * data "kind" will appear in any particular slot.  Instead, search the   * stakind fields to see if the desired data is available.  (The standard   * function get_attstatsslot() may be used for this.)   */    /*   * The present allocation of "kind" codes is:   *   *      1-99:           reserved for assignment by the core PostgreSQL project   *                              (values in this range will be documented in this file)   *      100-199:        reserved for assignment by the PostGIS project   *                              (values to be documented in PostGIS documentation)   *      200-299:        reserved for assignment by the ESRI ST_Geometry project   *                              (values to be documented in ESRI ST_Geometry documentation)   *      300-9999:       reserved for future public assignments   *   * For private use you may choose a "kind" code at random in the range   * 10000-30000.  However, for code that is to be widely disseminated it is   * better to obtain a publicly defined "kind" code by request from the   * PostgreSQL Global Development Group.   */    /*   * In a "most common values" slot, staop is the OID of the "=" operator   * used to decide whether values are the same or not.  stavalues contains   * the K most common non-null values appearing in the column, and stanumbers   * contains their frequencies (fractions of total row count).  The values   * shall be ordered in decreasing frequency.  Note that since the arrays are   * variable-size, K may be chosen by the statistics collector.  Values should   * not appear in MCV unless they have been observed to occur more than once;   * a unique column will have no MCV slot.   */  #define STATISTIC_KIND_MCV      1    /*   * A "histogram" slot describes the distribution of scalar data.  staop is   * the OID of the "<" operator that describes the sort ordering.  (In theory,   * more than one histogram could appear, if a datatype has more than one   * useful sort operator.)  stavalues contains M (>=2) non-null values that   * divide the non-null column data values into M-1 bins of approximately equal   * population.  The first stavalues item is the MIN and the last is the MAX.   * stanumbers is not used and should be NULL.  IMPORTANT POINT: if an MCV   * slot is also provided, then the histogram describes the data distribution   * *after removing the values listed in MCV* (thus, it's a "compressed   * histogram" in the technical parlance).  This allows a more accurate   * representation of the distribution of a column with some very-common   * values.  In a column with only a few distinct values, it's possible that   * the MCV list describes the entire data population; in this case the   * histogram reduces to empty and should be omitted.   */  #define STATISTIC_KIND_HISTOGRAM  2    /*   * A "correlation" slot describes the correlation between the physical order   * of table tuples and the ordering of data values of this column, as seen   * by the "<" operator identified by staop.  (As with the histogram, more   * than one entry could theoretically appear.)  stavalues is not used and   * should be NULL.  stanumbers contains a single entry, the correlation   * coefficient between the sequence of data values and the sequence of   * their actual tuple positions.  The coefficient ranges from +1 to -1.   */  #define STATISTIC_KIND_CORRELATION      3    /*   * A "most common elements" slot is similar to a "most common values" slot,   * except that it stores the most common non-null *elements* of the column   * values.  This is useful when the column datatype is an array or some other   * type with identifiable elements (for instance, tsvector).  staop contains   * the equality operator appropriate to the element type.  stavalues contains   * the most common element values, and stanumbers their frequencies.  Unlike   * MCV slots, frequencies are measured as the fraction of non-null rows the   * element value appears in, not the frequency of all rows.  Also unlike   * MCV slots, the values are sorted into the element type's default order   * (to support binary search for a particular value).  Since this puts the   * minimum and maximum frequencies at unpredictable spots in stanumbers,   * there are two extra members of stanumbers, holding copies of the minimum   * and maximum frequencies.  Optionally, there can be a third extra member,   * which holds the frequency of null elements (expressed in the same terms:   * the fraction of non-null rows that contain at least one null element).  If   * this member is omitted, the column is presumed to contain no null elements.   *   * Note: in current usage for tsvector columns, the stavalues elements are of   * type text, even though their representation within tsvector is not   * exactly text.   */  #define STATISTIC_KIND_MCELEM  4    /*   * A "distinct elements count histogram" slot describes the distribution of   * the number of distinct element values present in each row of an array-type   * column.  Only non-null rows are considered, and only non-null elements.   * staop contains the equality operator appropriate to the element type.   * stavalues is not used and should be NULL.  The last member of stanumbers is   * the average count of distinct element values over all non-null rows.  The   * preceding M (>=2) members form a histogram that divides the population of   * distinct-elements counts into M-1 bins of approximately equal population.   * The first of these is the minimum observed count, and the last the maximum.   */  #define STATISTIC_KIND_DECHIST  5    /*   * A "length histogram" slot describes the distribution of range lengths in   * rows of a range-type column. stanumbers contains a single entry, the   * fraction of empty ranges. stavalues is a histogram of non-empty lengths, in   * a format similar to STATISTIC_KIND_HISTOGRAM: it contains M (>=2) range   * values that divide the column data values into M-1 bins of approximately   * equal population. The lengths are stored as float8s, as measured by the   * range type's subdiff function. Only non-null rows are considered.   */  #define STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM  6    /*   * A "bounds histogram" slot is similar to STATISTIC_KIND_HISTOGRAM, but for   * a range-type column.  stavalues contains M (>=2) range values that divide   * the column data values into M-1 bins of approximately equal population.   * Unlike a regular scalar histogram, this is actually two histograms combined   * into a single array, with the lower bounds of each value forming a   * histogram of lower bounds, and the upper bounds a histogram of upper   * bounds.  Only non-NULL, non-empty ranges are included.   */  #define STATISTIC_KIND_BOUNDS_HISTOGRAM  7

那么这些统计信息如何导入导出呢?

导出导入统计信息

dump_stat这个插件是PostgreSQL pro推出的兼容9.6版本的导出统计信息的插件。

代码如下:

导出

通过dump_stat导出(导出的结构已经是SQL形式),然后通过SQL导入。

$ psql test -A    test=# \t    test=# \o dump_stat.sql    test=# select dump_statistic();

pg_statistic的每一条记录,产生一条如下SQL:

WITH upsert as (      UPDATE pg_catalog.pg_statistic SET column_name = expression [, ...]      WHERE to_schema_qualified_relation(starelid) = t_relname        AND to_attname(t_relname, staattnum) = t_attname        AND to_atttype(t_relname, staattnum) = t_atttype        AND stainherit = t_stainherit      RETURNING *)    ins as (      SELECT expression [, ...]      WHERE NOT EXISTS (SELECT * FROM upsert)        AND to_attnum(t_relname, t_attname) IS NOT NULL        AND to_atttype(t_relname, t_attname) = t_atttype)    INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;        where expression can be one of:        array_in(array_text, type_name::regtype::oid, -1)    value::type_name

stat导入的实际例子

(表示public.test表的info列的统计信息,如果存在则更新,不存在则插入。)

WITH     upsert as (       UPDATE pg_catalog.pg_statistic       SET         stanullfrac = 0, stawidth = 4, stadistinct = -1, stakind1 = 2, stakind2 = 3, stakind3 = 0, stakind4 = 0, stakind5 = 0, staop1 = 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, staop2 = 'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator, staop3 = '0'::regoperator, staop4 = '0'::regoperator, staop5 = '0'::regoperator, stanumbers1 = NULL::real[], stanumbers2 = '{-0.5}'::real[], stanumbers3 = NULL::real[], stanumbers4 = NULL::real[], stanumbers5 = NULL::real[], stavalues1 = array_in('{abc,cde,test}', 'pg_catalog.text'::regtype, -1)::anyarray, stavalues2 = NULL::anyarray, stavalues3 = NULL::anyarray, stavalues4 = NULL::anyarray, stavalues5 = NULL::anyarray       WHERE to_schema_qualified_relation(starelid) = 'public.test' AND to_attname('public.test', staattnum) = 'info' AND to_atttype('public.test', staattnum) = 'pg_catalog.text' AND stainherit = false       RETURNING *    ),     ins as (       SELECT         'public.test'::regclass,         to_attnum('public.test', 'info'),         'false'::boolean,         0::real,         4::integer,         -1::real,         2,  -- stakind=2 表示柱状图        3,  -- stakind=3 表示相关性        0,         0,         0,         'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator,         'pg_catalog.<(pg_catalog.text, pg_catalog.text)'::regoperator,         '0'::regoperator,         '0'::regoperator,         '0'::regoperator,         NULL::real[],         '{-0.5}'::real[],         NULL::real[],         NULL::real[],         NULL::real[],         array_in('{abc,cde,test}', 'pg_catalog.text'::regtype, -1)::anyarray,         NULL::anyarray,         NULL::anyarray,         NULL::anyarray,         NULL::anyarray       WHERE NOT EXISTS (SELECT * FROM upsert) AND to_attnum('public.test', 'info') IS NOT NULL AND to_atttype('public.test', 'info') = 'pg_catalog.text'    )   INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;

导入

1、修改postgresql.conf,允许修改系统表,重启数据库生效配置

vi postgresql.conf      allow_system_table_mods=on    pg_ctl restart -m fast

2、导入统计信息

-- 1 pg_class        update pg_class set reltuples=?, relpages=? where oid=?;        -- 2 pg_statistic        WITH upsert as (      UPDATE pg_catalog.pg_statistic SET column_name = expression [, ...]      WHERE to_schema_qualified_relation(starelid) = t_relname        AND to_attname(t_relname, staattnum) = t_attname        AND to_atttype(t_relname, staattnum) = t_atttype        AND stainherit = t_stainherit      RETURNING *)    ins as (      SELECT expression [, ...]      WHERE NOT EXISTS (SELECT * FROM upsert)        AND to_attnum(t_relname, t_attname) IS NOT NULL        AND to_atttype(t_relname, t_attname) = t_atttype)    INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;        where expression can be one of:        array_in(array_text, type_name::regtype::oid, -1)    value::type_name

3、导入完成后,将allow_system_table_mods设置为off,重启数据库。

dump_statistic代码

CREATE FUNCTION dump_statistic(relid oid) RETURNS SETOF TEXT AS $$          DECLARE                  result  text;                                    cmd             text;           -- main query                  in_args text;           -- args for insert                  up_args text;           -- args for upsert                                    fstaop  text := '%s::regoperator';                  arr_in  text := 'array_in(%s, %s::regtype, -1)::anyarray';                                    stacols text[] = ARRAY['stakind', 'staop',                                                             'stanumbers', 'stavalues' ];                                    r               record;                  i               int;                  j               text;                  ncols   int := 26;      -- number of columns in pg_statistic                                    stanum  text[];         -- stanumbers{1, 5}                  staval  text[];         -- stavalues{1, 5}                  staop   text[];         -- staop{1, 5}                                    relname text;           -- quoted relation name                  attname text;           -- quoted attribute name                  atttype text;           -- quoted attribute type                            BEGIN                  for r in                                  select * from pg_catalog.pg_statistic                                  where starelid = relid                                          and get_namespace(starelid) != to_namespace('information_schema')                                          and get_namespace(starelid) != to_namespace('pg_catalog') loop                                                    relname := to_schema_qualified_relation(r.starelid);                          attname := quote_literal(to_attname(relname, r.staattnum));                          atttype := quote_literal(to_atttype(relname, r.staattnum));                          relname := quote_literal(relname); -- redefine relname                                                    in_args := '';                          up_args = 'stanullfrac = %s, stawidth = %s, stadistinct = %s, ';                                                    cmd := 'WITH upsert as ( ' ||                                                  'UPDATE pg_catalog.pg_statistic SET %s ' ||                                                  'WHERE to_schema_qualified_relation(starelid) = ' || relname || ' '                                                          'AND to_attname(' || relname || ', staattnum) = ' || attname || ' '                                                          'AND to_atttype(' || relname || ', staattnum) = ' || atttype || ' '                                                          'AND stainherit = ' || r.stainherit || ' ' ||                                                  'RETURNING *), ' ||                                     'ins as ( ' ||                                                  'SELECT %s ' ||                                                  'WHERE NOT EXISTS (SELECT * FROM upsert) ' ||                                                          'AND to_attnum(' || relname || ', ' || attname || ') IS NOT NULL '                                                          'AND to_atttype(' || relname || ', ' || attname || ') = ' || atttype || ') '                                     'INSERT INTO pg_catalog.pg_statistic SELECT * FROM ins;';                                                                    for i in 1..ncols loop                                  in_args := in_args || '%s';                                    if i != ncols then                                          in_args := in_args || ', ';                                  end if;                          end loop;                                                            for j in 1..4 loop                                  for i in 1..5 loop                                          up_args := up_args || format('%s%s = %%s', stacols[j], i);                                            if i * j != 20 then                                                  up_args := up_args || ', ';                                          end if;                                  end loop;                          end loop;                                                    cmd := format(cmd, up_args, in_args);   --prepare template for main query                            staop := array[format(fstaop, quote_literal(to_schema_qualified_operator(r.staop1))),                                                     format(fstaop, quote_literal(to_schema_qualified_operator(r.staop2))),                                                     format(fstaop, quote_literal(to_schema_qualified_operator(r.staop3))),                                                     format(fstaop, quote_literal(to_schema_qualified_operator(r.staop4))),                                                     format(fstaop, quote_literal(to_schema_qualified_operator(r.staop5)))];                            stanum := array[r.stanumbers1::text,                                                          r.stanumbers2::text,                                                          r.stanumbers3::text,                                                          r.stanumbers4::text,                                                          r.stanumbers5::text];                                                                                    for i in 1..5 loop                                  if stanum[i] is null then                                          stanum[i] := 'NULL::real[]';                                  else                                          stanum[i] := '''' || stanum[i] || '''::real[]';                                  end if;                          end loop;                            if r.stavalues1 is not null then                                  staval[1] := format(arr_in, quote_literal(r.stavalues1),                                                                          quote_literal(                                                                                  to_schema_qualified_type(                                                                                          anyarray_elemtype(r.stavalues1))));                          else                                  staval[1] := 'NULL::anyarray';                          end if;                            if r.stavalues2 is not null then                                  staval[2] := format(arr_in, quote_literal(r.stavalues2),                                                                          quote_literal(                                                                                  to_schema_qualified_type(                                                                                          anyarray_elemtype(r.stavalues2))));                          else                                  staval[2] := 'NULL::anyarray';                          end if;                            if r.stavalues3 is not null then                                  staval[3] := format(arr_in, quote_literal(r.stavalues3),                                                                          quote_literal(                                                                                  to_schema_qualified_type(                                                                                          anyarray_elemtype(r.stavalues3))));                          else                                  staval[3] := 'NULL::anyarray';                          end if;                            if r.stavalues4 is not null then                                  staval[4] := format(arr_in, quote_literal(r.stavalues4),                                                                          quote_literal(                                                                                  to_schema_qualified_type(                                                                                          anyarray_elemtype(r.stavalues4))));                          else                                  staval[4] := 'NULL::anyarray';                          end if;                            if r.stavalues5 is not null then                                  staval[5] := format(arr_in, quote_literal(r.stavalues5),                                                                          quote_literal(                                                                                  to_schema_qualified_type(                                                                                          anyarray_elemtype(r.stavalues5))));                          else                                  staval[5] := 'NULL::anyarray';                          end if;                                                    --DEBUG                          --staop := array['{arr}', '{arr}', '{arr}', '{arr}', '{arr}'];                          --stanum := array['{num}', '{num}', '{num}', '{num}', '{num}'];                          --staval := array['{val}', '{val}', '{val}', '{val}', '{val}'];                            result := format(cmd,                                                           r.stanullfrac,                                                           r.stawidth,                                                           r.stadistinct,                                                           -- stakind                                                           r.stakind1, r.stakind2, r.stakind3, r.stakind4, r.stakind5,                                                           -- staop                                                           staop[1], staop[2], staop[3], staop[4], staop[5],                                                           -- stanumbers                                                           stanum[1], stanum[2], stanum[3], stanum[4], stanum[5],                                                           -- stavalues                                                           staval[1], staval[2], staval[3], staval[4], staval[5],                                                                                                                      -- first 6 columns                                                           format('%s::regclass', relname),                                                           format('to_attnum(%s, %s)', relname, attname),                                                           '''' || r.stainherit || '''::boolean',                                                           r.stanullfrac || '::real',                                                           r.stawidth || '::integer',                                                           r.stadistinct || '::real',                                                           -- stakind                                                           r.stakind1, r.stakind2, r.stakind3, r.stakind4, r.stakind5,                                                           -- staop                                                           staop[1], staop[2], staop[3], staop[4], staop[5],                                                           -- stanumbers                                                           stanum[1], stanum[2], stanum[3], stanum[4], stanum[5],                                                           -- stavalues                                                           staval[1], staval[2], staval[3], staval[4], staval[5]);                            return next result;                  end loop;                    return;          END;  $$ LANGUAGE plpgsql;

我们甚至可以将Oracle数据库的统计信息,平移到PG数据库,对齐需要的元素即可:

记录数、占用多少个数据块。每列的空值比例、平均列宽、唯一值比例或个数、高频值以及频率、柱状图分布、存储相关性、多值列(高频元素及比例、元素柱状图分布)。

好处:在迁移ORACLE数据时,可以关闭autovacuumm(提高导入速度),通过这种方法来导入统计信息。(只要元素对应即可,当然有些元素可能是ORACLE中不采集的,比如多值列的统计信息)。

参考

转载地址:http://gcpxx.baihongyu.com/

你可能感兴趣的文章
iOS 开发百问(2)
查看>>
MySQL for Mac 安装和基本操作(包含后期的环境变量设置)
查看>>
Linux及windows下常见压缩程序的压缩能力对比
查看>>
JAVAEE-junit测试hibernate里的方法(hibernate交给spring管理)的问题
查看>>
MOTO MB860 国行2.3.5优化增强ROM_Top_T5_end(经典收藏版)
查看>>
C#学习经典(二)---MVC框架(Model view Controller)
查看>>
log4j配置文件说明
查看>>
Maven: 为Compiler插件设置source和target版本
查看>>
linux下永久添加静态路由
查看>>
android 全局变量和局部变量命名规则
查看>>
Ubuntu Sub-process /usr/bin/dpkg
查看>>
详解DNS的常用记录(下):DNS系列之三
查看>>
linux的日志服务器关于屏蔽一些关键字的方法
查看>>
事情的两面性
查看>>
只要会营销,shi都能卖出去?
查看>>
sed单行处理命令奇偶行输出
查看>>
走向DBA[MSSQL篇] 从SQL语句的角度 提高数据库的访问性能
查看>>
VC++深入详解学习笔记1
查看>>
安装配置discuz
查看>>
CentOS7 64位小型操作系统的安装
查看>>