Changes between Version 70 and Version 71 of PetascopeDevGuide


Ignore:
Timestamp:
Dec 30, 2015 3:47:06 PM (21 months ago)
Author:
pbaumann
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • PetascopeDevGuide

    v70 v71  
     1[[PageOutline]]
    12[[TracNav(TOC)]]
    2 [[PageOutline]]
    33
    44= Petascope Developer's Documentation =
     
    185185
    186186
     187== Updating the schema ==
     188
     189Different versions of the database are tracked in the table __'''`ps_dbupdates`'''__, which stores the version number `<N>`__`+1`__ as a simple integer value.
     190
     191Version numbers are linked to the one `'update<N>.*'` update file inside the [browser:applications/petascope/src/main/db/petascope/ folder] of db updates (copied to `<install>/share/rasdaman/petascope/` during installation). Such files can be a set of SQL statements (`.sql`) or as well a shell script (`.sh`). In the latter case, the handling the update is possibly done with other auxiliary files, which should then be placed inside a folder `'update<N>/'`, like done for the [browser:applications/petascope/src/main/db/petascope/update8.sh major upgrade] script from version 8.5 to version 9.0.
     192
     193The script [browser:applications/petascope/src/main/db/update_petascopedb.sh.in update_petascopedb.sh] is the root handler: it checks the current status of the database, if ever created, and updates it to the latest available update in `<install>/share/rasdaman/petascope/`. Each SQL update is executed in a single transaction, whereas it is up to the developer to ensure atomic updates in case the shell script is chosen. After successful execution of each update file, `ps_dbupdates` is synchronized to the new `<N>+1` value.
     194
     195The major 9.0 [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade script] creates a completely new set of tables, it does not represent an incremental update. To avoid the deletion of important data, existing tables are moved to a backup [http://www.postgresql.org/docs/9.3/static/ddl-schemas.html schema], whose name is `'ps_pre_update<N>'`, being `update<N>` the basename (no suffix) of the latest ''pre-upgrade'' db status. The upgrade then proceeds with i) the creation of new tables and db objects (triggers, procedures, indexes) inside an interim schema, ii) the migration of pre-existing coverages' metadata to these new tables and iii) moving the old tables to a backup schema. The new upgraded schema is finally published to the default public schema of Postgres. WMS tables, whose schema has not changed ever since, are simply moved to the public schema. The interim schema is moved on top of the [http://www.postgresql.org/docs/9.3/static/functions-info.html search patch] of Postgres during upgrade so that the upgrade files are schema independent.
     196
     197Here is the synopsis of update_petascopedb.sh:
     198
     199{{{
     200usage: update_petascopedb.sh [--revert] [--cleanup] [--help]
     201where:
     202--revert
     203    Restore tables to a pre-upgrade backed-up schema (ps_pre_updateX).
     204--cleanup
     205    Drops all schemas in petascopedb, but the public one.
     206--help
     207    Show this message.
     208}}}
     209
     210We see that we have two options:
     211
     212   i. '''`--revert`''' : moves WMS tables back to the backup schema; drops the new tables and restores the old tables: the final db status will be exactly the same as before running the last upgrade.
     213   i. '''`--cleanup`''' : drops the backup schema in `petascopedb`.
     214
     215Calling the script on an already synchronized database will have no effect.
     216
     217A final note: to avoid clashes with old existing tables during the upgrade while not breaking running services, new tables have a temporary `ps9` prefix while co-existing with the other pre-upgrade tables: at the end of a successful [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade], such tables (+ primary indexes) are renamed back to the usual `ps` prefix. Some non-primary indexes and constraints might still harmlessly have `ps9` prefix though. Check the [https://groups.google.com/d/msg/rasdaman-dev/M0eibKBS16A/VvTVASuR0r4J discussion] in the mailing list for further insights.
     218
     219
     220=== Updating the database schema: is this below more than duplicate? ===
     221
     222The update process for the database schema of `petascopedb` is organized by means of a set of `update<N>.*` files, which are excecuted in a loop by [browser:applications/petascope/src/main/db/update_petascopedb.sh.in update_petascopedb.sh].
     223
     224Not all of them represent actual ''updates'': there can be ''upgrades'' as well. The major 9.0 [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade script] for instance creates a completely new set of tables: to avoid the deletion of important data, existing tables are moved to a backup [http://www.postgresql.org/docs/9.3/static/ddl-schemas.html schema], whose name is `'ps_pre_update<N>'`, being `update<N>` the basename (no suffix) of the latest ''pre-upgrade'' db status. The upgrade then proceeds with i) the creation of new tables and db objects (triggers, procedures, indexes) inside an interim schema, ii) the migration of pre-existing coverages' metadata to these new tables and iii) moving the old tables to a backup schema. The new upgraded schema is finally published to the default `public` schema of Postgres. WMS tables, whose [browser:applications/petascope/src/main/db/petascope/update5.sql schema] has not changed ever since, are simply moved to the `public` schema.
     225
     226Here is the synopsis of  `update_petascopedb.sh`:
     227{{{
     228usage: update_petascopedb.sh [--revert] [--cleanup] [--help]
     229where:
     230--revert
     231    Restore tables to a pre-upgrade backed-up schema (ps_pre_updateX).
     232--cleanup
     233    Drops all schemas in petascopedb, but the public one.
     234--help
     235    Show this message.
     236}}}
     237
     238We see that we have two options:
     239
     240   i. '''`--revert`''' : moves WMS tables back to the backup schema; drops the new tables and restores the old tables: the final db status will be exactly the same as before running the last upgrade.
     241   i. '''`--cleanup`''' : drops the backup schema in `petascopedb`.
     242
     243Calling the script on an already synchronized database will have no effect.
     244
     245The database of geo-metadata is generally a small database, which hardly reaches relevant sizes.
     246While one can restore a pre-upgrade snapshot with the `--revert` option (if the snapshot has not been cleaned up), it is then suggested to keep a backup dump of the database: an additional layer of backup at almost no cost.
     247
     248Before starting the upgrade, you might want to understand more in detail what is done by the migration script. The following list explains the flow of operations what are done by the [browser:applications/petascope/src/main/db/petascope/update8/migrate.sql migration script] (pre-upgrade and post-upgrade tables will be prefixed `'ps8'` and `'ps9'` respectively):
     249
     250   * main : `db_migration()`
     251      * `migrate_uoms()` : migrate the catalog of '''Unit of Measures''' (UoM) from `ps8_uom` to `ps9_uom`/`ps9_quantity`; a UoM in the previous schema can indeed be seen as a minimal informational basis for a SWE field.
     252      * `migrate_crss()` : migrate the catalog of '''Coordinate Reference Systems''' (CRSs) from `ps8_crs` to `ps9_crs`; mind here that all CRSs are migrated but from version 9.0 the CRSs shall compulsorily be an actionable HTTP URI resolving to a valid GML definition of CRS. Additionally, in order to let the migrator understand whenever a CRS defines latitude first it is suggested to either turn all CRS in `ps8_crs` to URIs ''`.../def/crs/ESPG/0/<code>`''or to ''`EPSG:<code>`'' format (indeed only ESPG codes can be recognized to define latitude first: see this [browser:applications/petascope/src/main/db/petascope/update8/north_first_crss.sql table]); `CRS:1` should be left as is instead: it will be turned to an appropriate [IndexCrss index] CRS.
     253      * forall '''coverages''' in `ps8_coverage` do `migrate_coverage()`:
     254         * fetch GMLCOV coverage type from `ps8_coverage` fill in `ps9_coverage` table  .. GMLCOV types need to appear to be ''grid'' coverages (~"*Grid*") and must be a legal grid coverage value (see ''" GML coverage types"'' in [browser:applications/petascope/src/main/db/petascope/update8/populate.sql this] file), otherwise the coverage is not migrated;
     255         * `'application/x-octet-stream'` MIME type is assigned by default (`rasdaman` data source);
     256         * migrate optionally stored GMLCOV extra metadata annotations from `ps8_metadata` to `ps9_extra_metadata`;
     257         * build the native CRS for `ps9_domain_set` by concatenating the CRSs associated to each coverage axis in `ps8_crsset` and filling URIs in `ps9_crs` (then referenced by `ps9_domain_set`): axes with __type__ `'t'` (for ''temporal'', see `ps8_axistype`) are assigned the `OGC:AnsiDate` CRS URI (see time handling [wiki:PetascopeTimeHandling_8_5 prior to] and [PetascopeTimeHandling from] 9.0) while axes with `CRS:1` assignment are given an appropriate [IndexCrss index] CRS, depending on how many consecutive `CRS:1` axes are found (see `translate_crs()` function in the [browser:applications/petascope/src/main/db/petascope/update8/utilities.sh utilities]);
     258         * determine the '''origin''' of the coverage for `ps9_gridded_domain_set`: axis footprint/resolution is firstly recovered from coverage extent in `ps8_domain` and number of cells in `ps8_celldomain` -- `(dom_max-dom_min)/(cdom_max-cdom_min+1)`) -- then origin is placed in the upper-left corner (bottom of cube in case of 3D, and so on) in the centre of the sample space (e.g. pixel-centre); the order of the axes defined in the CRS is strictly followed, hence e.g. in case of [www.opengis.net/def/crs/EPSG/0/4326 WGS84] geographic CRS, latitudes will appear first in the tuple of coordinates (see [PetascopeSubsets this] page for a more thorough description of coverage geometries in v9.0);
     259            * for `CRS:1` indexed axis, the `+1` term in the denominator is removed: the domain is like the cell domain.
     260         * '''offset vectors''' are computed for `ps9_grid_axis` and `ps9_rectilinear_axis`: again the axis order of the CRS is kept into account and `CRS:1` indexed axes will not treat the domain extent as dense but indexed (no `+1` in the denominator of the resolution formula); as origin is now in the upper-left corner in the 2D horizontal space, the vector associated with northings will point South (negative norm);
     261         * the '''range set''' of the coverage is collected for `ps9_range_set` and `ps9_rasdaman_collection`: using the [http://www.postgresql.org/docs/current/static/dblink.html dblink] module, the OID of the collection marray is fetched from `RASBASE` (if more than 1 marray is found in the collection, then a warning will be thrown in the final log, and only the first will be selected: a coverage is now associated to one-and-only-one marray);
     262         * the '''range type''' is built for `ps9_range_type_component` and [PetascopeDevGuide SWE-related tables]: !UoMs/Quantities migrated in the first stage of migration (see above) are here linked to their correspondent coverages; data type from `ps8_datatype` is moved to `ps9_range_data_type`.
     263
     264A final report of coverage migrations is then printed: name and general information on coverage domain and range will be displayed for every migrated coverage. If some problem arose while migrating a coverage, an additional ''log'' column will describe what did not work. In case the mistake can be fixed (maybe changing the pre-upgrade database?), then one can just repeat the operation by reverting back (`--revert`) and re-migrating.
     265
     266
     267
    187268== Limitations ==
    188269The current schema supports gridded and multipoint coverages. However, it is designed to be extensible to further types in the [http://www.schemacentral.com/sc/niem21/e-gml32_AbstractCoverage.html coverage hierarchy].