Changes between Version 105 and Version 106 of PetascopeUserGuide


Timestamp:
Dec 30, 2015 3:22:42 PM (21 months ago)
Author:
pbaumann
Comment:

--

  • PetascopeUserGuide

    v105 v106  
    3535 * For the handling of '''time''' series see [PetascopeTimeHandling this page].
    3636 * For insights on dimensionless '''index''' coordinate reference systems see [IndexCrss this page].
     37
     38== Caches ==
     39
     40Petascope currently keeps a few internal caches, especially for [SecoreUserGuide SECORE] CRS resources and responses: the gain is both in performance and in robustness against resolver or network problems. Caching information about CRSs is safe: CRSs can be considered static resources that may not change for years, if ever.
     41
     42Note that even with a locally deployed resolver, coverages can still be configured with a CRS whose definition is provided by an external resolver.
     43
     44Thanks to the caches, the network traffic needed to process a request can drop to zero, so the performance gain is significant: responses will be slow on a newly deployed Petascope, but response times get much shorter as the caches fill up. It is suggested to run a WCS ''!GetCapabilities'' request after a fresh deployment, so that the CRS definitions of all the offered coverages are cached: after that single request, almost all the CRS-related information has already been cached.
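
As a minimal sketch of this warm-up step, the following Python snippet builds a WCS ''!GetCapabilities'' KVP request and can optionally send it. The endpoint URL (`http://localhost:8080/rasdaman/ows`), the WCS version `2.0.1` and the helper names are assumptions made for illustration; adapt them to the actual deployment.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: adapt to where Petascope is actually deployed.
DEFAULT_ENDPOINT = "http://localhost:8080/rasdaman/ows"


def capabilities_url(endpoint=DEFAULT_ENDPOINT, version="2.0.1"):
    """Build the KVP URL of a WCS GetCapabilities request."""
    params = {"service": "WCS", "version": version, "request": "GetCapabilities"}
    return endpoint + "?" + urlencode(params)


def warm_up(endpoint=DEFAULT_ENDPOINT, timeout=600):
    """Send one GetCapabilities request and return the HTTP status code."""
    with urlopen(capabilities_url(endpoint), timeout=timeout) as response:
        response.read()  # consume the body so the request fully completes
        return response.status


if __name__ == "__main__":
    # Only prints the URL; call warm_up() explicitly to contact the server.
    print(capabilities_url())
```

The generous timeout reflects the slow first responses described above; calling `warm_up()` once after deployment is enough for this purpose.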
    3745
    3846== Static tables ==
     
    207215
    208216As an example, you can see the oracle of our WCS test for 0-dimensional output rectified grids: browser:systemtest/testcases_services/test_wcs/oracle/18-get_coverage_0D.oracle .
    209 
    210 === Updating the database schema ===
    211 
    212 The update process for the database schema of `petascopedb` is organized by means of a set of `update<N>.*` files, which are executed in a loop by [browser:applications/petascope/src/main/db/update_petascopedb.sh.in update_petascopedb.sh].
    213 
    214 Not all of them represent actual ''updates'': there can be ''upgrades'' as well. The major 9.0 [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade script], for instance, creates a completely new set of tables: to avoid the deletion of important data, existing tables are moved to a backup [http://www.postgresql.org/docs/9.3/static/ddl-schemas.html schema] named `'ps_pre_update<N>'`, where `update<N>` is the basename (no suffix) of the latest ''pre-upgrade'' db status. The upgrade then proceeds with i) the creation of new tables and db objects (triggers, procedures, indexes) inside an interim schema, ii) the migration of pre-existing coverages' metadata to these new tables and iii) moving the old tables to a backup schema. The new upgraded schema is finally published to the default `public` schema of Postgres. WMS tables, whose [browser:applications/petascope/src/main/db/petascope/update5.sql schema] has not changed since, are simply moved to the `public` schema.
    215 
    216 Here is the synopsis of  `update_petascopedb.sh`:
    217 {{{
    218 usage: update_petascopedb.sh [--revert] [--cleanup] [--help]
    219 where:
    220 --revert
    221     Restore tables to a pre-upgrade backed-up schema (ps_pre_updateX).
    222 --cleanup
    223     Drops all schemas in petascopedb, but the public one.
    224 --help
    225     Show this message.
    226 }}}
    227 
    228 Besides `--help`, there are two options:
    229 
    230    i. '''`--revert`''' : moves WMS tables back to the backup schema; drops the new tables and restores the old tables: the final db status will be exactly the same as before running the last upgrade.
    231    i. '''`--cleanup`''' : drops the backup schema in `petascopedb`.
    232 
    233 Calling the script on an already synchronized database will have no effect.
    234 
    235 The database of geo-metadata is generally a small database, which rarely grows to a significant size.
    236 While one can restore a pre-upgrade snapshot with the `--revert` option (if the snapshot has not been cleaned up), it is still advisable to keep a backup dump of the database: an additional layer of backup at almost no cost.
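
One way to take such a backup dump is PostgreSQL's `pg_dump`. The sketch below only assembles and prints the command line; it assumes the database is named `petascopedb` as above, while the output file name and the helper function are hypothetical.

```python
import shlex


def dump_command(database="petascopedb", outfile="petascopedb_backup.sql"):
    """Assemble a pg_dump call that writes a plain SQL dump to outfile."""
    return ["pg_dump", "--file", outfile, database]


if __name__ == "__main__":
    # Print the command instead of running it: executing it requires
    # pg_dump on the PATH and valid credentials for the database.
    print(shlex.join(dump_command()))
```

The returned list can be passed to `subprocess.run(..., check=True)` before running the upgrade.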
    237 
    238 Before starting the upgrade, you might want to understand in more detail what the migration script does. The following list explains the flow of operations performed by the [browser:applications/petascope/src/main/db/petascope/update8/migrate.sql migration script] (pre-upgrade and post-upgrade tables will be prefixed `'ps8'` and `'ps9'` respectively):
    239 
    240    * main : `db_migration()`
    241       * `migrate_uoms()` : migrate the catalog of '''Units of Measure''' (UoM) from `ps8_uom` to `ps9_uom`/`ps9_quantity`; a UoM in the previous schema can indeed be seen as a minimal informational basis for a SWE field.
    242       * `migrate_crss()` : migrate the catalog of '''Coordinate Reference Systems''' (CRSs) from `ps8_crs` to `ps9_crs`; note that all CRSs are migrated, but from version 9.0 each CRS must be an actionable HTTP URI resolving to a valid GML definition of the CRS. Additionally, so that the migrator can tell when a CRS defines latitude first, it is suggested to turn all CRSs in `ps8_crs` into either the URI format ''`.../def/crs/EPSG/0/<code>`'' or the ''`EPSG:<code>`'' format (indeed only EPSG codes can be recognized as defining latitude first: see this [browser:applications/petascope/src/main/db/petascope/update8/north_first_crss.sql table]); `CRS:1` should instead be left as is: it will be turned into an appropriate [IndexCrss index] CRS.
    243       * forall '''coverages''' in `ps8_coverage` do `migrate_coverage()`:
    244          * fetch the GMLCOV coverage type from `ps8_coverage` and fill in the `ps9_coverage` table: the GMLCOV type needs to be a ''grid'' coverage type (~"*Grid*") and must be a legal grid coverage value (see ''"GML coverage types"'' in [browser:applications/petascope/src/main/db/petascope/update8/populate.sql this] file), otherwise the coverage is not migrated;
    245          * `'application/x-octet-stream'` MIME type is assigned by default (`rasdaman` data source);
    246          * migrate optionally stored GMLCOV extra metadata annotations from `ps8_metadata` to `ps9_extra_metadata`;
    247          * build the native CRS for `ps9_domain_set` by concatenating the CRSs associated to each coverage axis in `ps8_crsset` and filling URIs in `ps9_crs` (then referenced by `ps9_domain_set`): axes with __type__ `'t'` (for ''temporal'', see `ps8_axistype`) are assigned the `OGC:AnsiDate` CRS URI (see time handling [wiki:PetascopeTimeHandling_8_5 prior to] and [PetascopeTimeHandling from] 9.0) while axes with `CRS:1` assignment are given an appropriate [IndexCrss index] CRS, depending on how many consecutive `CRS:1` axes are found (see `translate_crs()` function in the [browser:applications/petascope/src/main/db/petascope/update8/utilities.sh utilities]);
    248          * determine the '''origin''' of the coverage for `ps9_gridded_domain_set`: the axis footprint/resolution is first recovered from the coverage extent in `ps8_domain` and the number of cells in `ps8_celldomain` -- `(dom_max-dom_min)/(cdom_max-cdom_min+1)` -- then the origin is placed in the upper-left corner (bottom of the cube in case of 3D, and so on) in the centre of the sample space (e.g. pixel-centre); the order of the axes defined in the CRS is strictly followed, hence e.g. in case of the [www.opengis.net/def/crs/EPSG/0/4326 WGS84] geographic CRS, latitudes will appear first in the tuple of coordinates (see [PetascopeSubsets this] page for a more thorough description of coverage geometries in v9.0);
    249             * for `CRS:1` indexed axes, the `+1` term in the denominator is removed: the domain coincides with the cell domain.
    250          * '''offset vectors''' are computed for `ps9_grid_axis` and `ps9_rectilinear_axis`: again the axis order of the CRS is taken into account and `CRS:1` indexed axes will not treat the domain extent as dense but as indexed (no `+1` in the denominator of the resolution formula); as the origin is now in the upper-left corner in the 2D horizontal space, the vector associated with northings will point South (negative norm);
    251          * the '''range set''' of the coverage is collected for `ps9_range_set` and `ps9_rasdaman_collection`: using the [http://www.postgresql.org/docs/current/static/dblink.html dblink] module, the OID of the collection marray is fetched from `RASBASE` (if more than 1 marray is found in the collection, then a warning will be thrown in the final log, and only the first will be selected: a coverage is now associated to one-and-only-one marray);
    252          * the '''range type''' is built for `ps9_range_type_component` and [PetascopeDevGuide SWE-related tables]: the !UoMs/Quantities migrated in the first stage of migration (see above) are here linked to their corresponding coverages; the data type from `ps8_datatype` is moved to `ps9_range_data_type`.
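
The resolution and origin rules in the list above can be illustrated with a small Python sketch. It is not the migration script's code: the function names are hypothetical, and placing the origin at `dom_max` minus half a cell for northings (and `dom_min` plus half a cell otherwise) is an interpretation of the "upper-left corner, pixel-centre" rule for dense axes only.

```python
def axis_resolution(dom_min, dom_max, cdom_min, cdom_max, indexed=False):
    """Resolution of one axis: (dom_max-dom_min)/(cdom_max-cdom_min+1).

    For CRS:1 indexed axes the +1 term is dropped.
    """
    cells = cdom_max - cdom_min
    if not indexed:
        cells += 1
    if cells == 0:
        raise ValueError("cannot derive a resolution from a single index")
    return (dom_max - dom_min) / cells


def pixel_centre_origin(dom_min, dom_max, resolution, northing=False):
    """Origin coordinate of a dense axis, at the centre of the first cell.

    Northings start from the upper edge, other axes from the lower edge.
    """
    if northing:
        return dom_max - resolution / 2
    return dom_min + resolution / 2


def offset(resolution, northing=False):
    """Signed offset along the axis: northings point South (negative)."""
    return -resolution if northing else resolution


if __name__ == "__main__":
    # Hypothetical 1-degree global grid: 360 x 180 cells.
    res_lon = axis_resolution(-180, 180, 0, 359)
    res_lat = axis_resolution(-90, 90, 0, 179)
    print(pixel_centre_origin(-90, 90, res_lat, northing=True),
          pixel_centre_origin(-180, 180, res_lon))
    print(offset(res_lat, northing=True), offset(res_lon))
```

In the hypothetical grid above, latitude is handled first to mirror the axis order of a latitude-first CRS such as WGS84.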
    253 
    254 A final report of coverage migrations is then printed: name and general information on coverage domain and range will be displayed for every migrated coverage. If a problem arose while migrating a coverage, an additional ''log'' column will describe what did not work. In case the mistake can be fixed (maybe by changing the pre-upgrade database?), one can just repeat the operation by reverting (`--revert`) and re-migrating.
    255217
    256218
     
    684646 * ...
    685647
    686 == Caches ==
    687 
    688 Petascope currently keeps a few internal caches, especially for [SecoreUserGuide SECORE] CRS resources and responses: the gain is both in performance and in robustness against resolver or network problems. Caching information about CRSs is safe: CRSs can be considered static resources that may not change for years, if ever.
    689 
    690 Note that even with a locally deployed resolver, coverages can still be configured with a CRS whose definition is provided by an external resolver.
    691 
    692 Thanks to the caches, the network traffic needed to process a request can drop to zero, so the performance gain is significant: responses will be slow on a newly deployed Petascope, but response times get much shorter as the caches fill up. It is suggested to run a WCS ''!GetCapabilities'' request after a fresh deployment, so that the CRS definitions of all the offered coverages are cached: after that single request, almost all the CRS-related information has already been cached.
    693 
    694648== Tickets reporting deviations from standards ==
    695649Tickets reporting deviations should be tagged with the keyword "deviation" so they can get listed here and assessed for documentation.