nz_genstats [-info] [-full|-express|-basic] <database> [table ...]
Purpose: Generate statistics on those tables that don't have up-to-date statistics.
The optimizer uses statistics to guide its decisions on how best to
execute a query. The more reliable and up-to-date the statistics are,
the more accurate the optimizer's decisions are likely to be.
You can easily "GENERATE STATISTICS;" for an entire database;
Or you could issue a "GENERATE STATISTICS ON <table>;"
or a "GENERATE EXPRESS STATISTICS ON <table>;".
In this case, you must issue the sql statement on a table-by-table basis.
But what if a given table already has up-to-date statistics? Then
regenerating statistics for it would be a waste of your time and the
system's resources.
This script first checks the state of the statistics for each table,
and then only regenerates statistics on those tables that have outdated
statistics. (Statistics are either 100% up-to-date+accurate+reliable,
or they aren't at 100%. There is no other statistical measure that
measures statistics).
Note: If you wish to get/display the current statistic values for any
given table you can use the script "nz_get"
Inputs: The database name is required.
The table name is optional. If not specified then all tables in the
database will be checked, and the statistics refreshed as appropriate.
Or, you can specify one (or many) table names to be checked.
If you have a file that contains a list of tablenames to be checked,
it can be used via the following syntax:
nz_genstats DBNAME `cat /tmp/the_list_of_tables`
-info An optional switch
If included, then statistics won't actually be generated on any table.
Instead, this script will simply report information about what tables
have up-to-date statistics and what tables need to have statistics
generated against them.
[-full|-express|-basic] An optional switch
You can control whether FULL, EXPRESS, or BASIC stats are generated for
the tables. (For an explanation of the difference between the three
choices, see "nz_get -help" ). If not specified, then this script will
adhere to whatever type of stats each individual table currently has
associated with it.
Full statistics take the longest to generate -- but provide more accurate
dispersion information (the # of unique values) for each column.
Express statistics use a faster (but less accurate) mathematical formula
to determine the dispersion.
Basic statistics can be generated the fastest -- because they skip the
dispersion calculation entirely. (This is new as of NPS Version 4.6).
Full statistics are recommended for smaller dimension tables ... which
usually have fewer rows ... where the accuracy of the dispersion value
tends to have more of an impact on query planning.
For larger fact tables, express statistics or basic statistics are
typically used because of the time savings in generating the statistics.
[-min <nnn>] Optional switches (default is -min 0)
[-max <nnn>] (default is -max 99999999999999999)
These switches allow you to control which tables are to have a GENSTATS
run against them ... based on their table rowcount being between the -min
and -max values specifed. The default is for all tables to be included.
Example: Specifying '-max 1000000' will result in only tables with
<= 1M rows having a GENSTATS performed against them (and then only if
necessary ... if the statistics aren't already up-to-date).
Outputs: Sample output. The first run included the "-info" switch. No changes
to the database were made. The script simply lists what it WOULD have
done if you let it.
The second run actually invoked GENSTATS on three of the tables. The
elapsed runtime for the individual operations is included.
Purpose: Generate statistics on those tables that don't have up-to-date statistics.
The optimizer uses statistics to guide its decisions on how best to
execute a query. The more reliable and up-to-date the statistics are,
the more accurate the optimizer's decisions are likely to be.
You can easily "GENERATE STATISTICS;" for an entire database;
Or you could issue a "GENERATE STATISTICS ON <table>;"
or a "GENERATE EXPRESS STATISTICS ON <table>;".
In this case, you must issue the sql statement on a table-by-table basis.
But what if a given table already has up-to-date statistics? Then
regenerating statistics for it would be a waste of your time and the
system's resources.
This script first checks the state of the statistics for each table,
and then only regenerates statistics on those tables that have outdated
statistics. (Statistics are either 100% up-to-date+accurate+reliable,
or they aren't at 100%. There is no other statistical measure that
measures statistics).
Note: If you wish to get/display the current statistic values for any
given table you can use the script "nz_get"
Inputs: The database name is required.
The table name is optional. If not specified then all tables in the
database will be checked, and the statistics refreshed as appropriate.
Or, you can specify one (or many) table names to be checked.
If you have a file that contains a list of tablenames to be checked,
it can be used via the following syntax:
nz_genstats DBNAME `cat /tmp/the_list_of_tables`
-info An optional switch
If included, then statistics won't actually be generated on any table.
Instead, this script will simply report information about what tables
have up-to-date statistics and what tables need to have statistics
generated against them.
[-full|-express|-basic] An optional switch
You can control whether FULL, EXPRESS, or BASIC stats are generated for
the tables. (For an explanation of the difference between the three
choices, see "nz_get -help" ). If not specified, then this script will
adhere to whatever type of stats each individual table currently has
associated with it.
Full statistics take the longest to generate -- but provide more accurate
dispersion information (the # of unique values) for each column.
Express statistics use a faster (but less accurate) mathematical formula
to determine the dispersion.
Basic statistics can be generated the fastest -- because they skip the
dispersion calculation entirely. (This is new as of NPS Version 4.6).
Full statistics are recommended for smaller dimension tables ... which
usually have fewer rows ... where the accuracy of the dispersion value
tends to have more of an impact on query planning.
For larger fact tables, express statistics or basic statistics are
typically used because of the time savings in generating the statistics.
[-min <nnn>] Optional switches (default is -min 0)
[-max <nnn>] (default is -max 99999999999999999)
These switches allow you to control which tables are to have a GENSTATS
run against them ... based on their table rowcount being between the -min
and -max values specifed. The default is for all tables to be included.
Example: Specifying '-max 1000000' will result in only tables with
<= 1M rows having a GENSTATS performed against them (and then only if
necessary ... if the statistics aren't already up-to-date).
Outputs: Sample output. The first run included the "-info" switch. No changes
to the database were made. The script simply lists what it WOULD have
done if you let it.
The second run actually invoked GENSTATS on three of the tables. The
elapsed runtime for the individual operations is included.
No comments:
Post a Comment