Changes between Version 69 and Version 70 of PetascopeDevGuide


Ignore:
Timestamp:
Dec 30, 2015 3:42:07 PM (21 months ago)
Author:
pbaumann
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • PetascopeDevGuide

    v69 v70  
     1[[TracNav(TOC)]]
    12[[PageOutline]]
    23
     
    184185
    185186
    186 == Updating the schema ==
    187 
    188 Different versions of the database are tracked in the table __'''`ps_dbupdates`'''__, which stores the version number `<N>`__`+1`__ as a simple integer value.
    189 
    190 Version numbers are linked to the one `'update<N>.*'` update file inside the [browser:applications/petascope/src/main/db/petascope/ folder] of db updates (copied to `<install>/share/rasdaman/petascope/` during installation). Such files can be a set of SQL statements (`.sql`) or as well a shell script (`.sh`). In the latter case, the handling the update is possibly done with other auxiliary files, which should then be placed inside a folder `'update<N>/'`, like done for the [browser:applications/petascope/src/main/db/petascope/update8.sh major upgrade] script from version 8.5 to version 9.0.
    191 
    192 The script [browser:applications/petascope/src/main/db/update_petascopedb.sh.in update_petascopedb.sh] is the root handler: it checks the current status of the database, if ever created, and updates it to the latest available update in `<install>/share/rasdaman/petascope/`. Each SQL update is executed in a single transaction, whereas it is up to the developer to ensure atomic updates in case the shell script is chosen. After successful execution of each update file, `ps_dbupdates` is synchronized to the new `<N>+1` value.
    193 
    194 The major 9.0 [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade script] creates a completely new set of tables, it does not represent an incremental update. To avoid the deletion of important data, existing tables are moved to a backup [http://www.postgresql.org/docs/9.3/static/ddl-schemas.html schema], whose name is `'ps_pre_update<N>'`, being `update<N>` the basename (no suffix) of the latest ''pre-upgrade'' db status. The upgrade then proceeds with i) the creation of new tables and db objects (triggers, procedures, indexes) inside an interim schema, ii) the migration of pre-existing coverages' metadata to these new tables and iii) moving the old tables to a backup schema. The new upgraded schema is finally published to the default public schema of Postgres. WMS tables, whose schema has not changed ever since, are simply moved to the public schema. The interim schema is moved on top of the [http://www.postgresql.org/docs/9.3/static/functions-info.html search patch] of Postgres during upgrade so that the upgrade files are schema independent.
    195 
    196 Here is the synopsis of update_petascopedb.sh:
    197 
    198 {{{
    199 usage: update_petascopedb.sh [--revert] [--cleanup] [--help]
    200 where:
    201 --revert
    202     Restore tables to a pre-upgrade backed-up schema (ps_pre_updateX).
    203 --cleanup
    204     Drops all schemas in petascopedb, but the public one.
    205 --help
    206     Show this message.
    207 }}}
    208 
    209 We see that we have two options:
    210 
    211    i. '''`--revert`''' : moves WMS tables back to the backup schema; drops the new tables and restores the old tables: the final db status will be exactly the same as before running the last upgrade.
    212    i. '''`--cleanup`''' : drops the backup schema in `petascopedb`.
    213 
    214 Calling the script on an already synchronized database will have no effect.
    215 
    216 A final note: to avoid clashes with old existing tables during the upgrade while not breaking running services, new tables have a temporary `ps9` prefix while co-existing with the other pre-upgrade tables: at the end of a successful [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade], such tables (+ primary indexes) are renamed back to the usual `ps` prefix. Some non-primary indexes and constraints might still harmlessly have `ps9` prefix though. Check the [https://groups.google.com/d/msg/rasdaman-dev/M0eibKBS16A/VvTVASuR0r4J discussion] in the mailing list for further insights.
    217 
    218 
    219 === Updating the database schema: is this below more than duplicate? ===
    220 
    221 The update process for the database schema of `petascopedb` is organized by means of a set of `update<N>.*` files, which are excecuted in a loop by [browser:applications/petascope/src/main/db/update_petascopedb.sh.in update_petascopedb.sh].
    222 
    223 Not all of them represent actual ''updates'': there can be ''upgrades'' as well. The major 9.0 [browser:applications/petascope/src/main/db/petascope/update8.sh upgrade script] for instance creates a completely new set of tables: to avoid the deletion of important data, existing tables are moved to a backup [http://www.postgresql.org/docs/9.3/static/ddl-schemas.html schema], whose name is `'ps_pre_update<N>'`, being `update<N>` the basename (no suffix) of the latest ''pre-upgrade'' db status. The upgrade then proceeds with i) the creation of new tables and db objects (triggers, procedures, indexes) inside an interim schema, ii) the migration of pre-existing coverages' metadata to these new tables and iii) moving the old tables to a backup schema. The new upgraded schema is finally published to the default `public` schema of Postgres. WMS tables, whose [browser:applications/petascope/src/main/db/petascope/update5.sql schema] has not changed ever since, are simply moved to the `public` schema.
    224 
    225 Here is the synopsis of  `update_petascopedb.sh`:
    226 {{{
    227 usage: update_petascopedb.sh [--revert] [--cleanup] [--help]
    228 where:
    229 --revert
    230     Restore tables to a pre-upgrade backed-up schema (ps_pre_updateX).
    231 --cleanup
    232     Drops all schemas in petascopedb, but the public one.
    233 --help
    234     Show this message.
    235 }}}
    236 
    237 We see that we have two options:
    238 
    239    i. '''`--revert`''' : moves WMS tables back to the backup schema; drops the new tables and restores the old tables: the final db status will be exactly the same as before running the last upgrade.
    240    i. '''`--cleanup`''' : drops the backup schema in `petascopedb`.
    241 
    242 Calling the script on an already synchronized database will have no effect.
    243 
    244 The database of geo-metadata is generally a small database, which hardly reaches relevant sizes.
    245 While one can restore a pre-upgrade snapshot with the `--revert` option (if the snapshot has not been cleaned up), it is then suggested to keep a backup dump of the database: an additional layer of backup at almost no cost.
    246 
    247 Before starting the upgrade, you might want to understand more in detail what is done by the migration script. The following list explains the flow of operations what are done by the [browser:applications/petascope/src/main/db/petascope/update8/migrate.sql migration script] (pre-upgrade and post-upgrade tables will be prefixed `'ps8'` and `'ps9'` respectively):
    248 
    249    * main : `db_migration()`
    250       * `migrate_uoms()` : migrate the catalog of '''Unit of Measures''' (UoM) from `ps8_uom` to `ps9_uom`/`ps9_quantity`; a UoM in the previous schema can indeed be seen as a minimal informational basis for a SWE field.
    251       * `migrate_crss()` : migrate the catalog of '''Coordinate Reference Systems''' (CRSs) from `ps8_crs` to `ps9_crs`; mind here that all CRSs are migrated but from version 9.0 the CRSs shall compulsorily be an actionable HTTP URI resolving to a valid GML definition of CRS. Additionally, in order to let the migrator understand whenever a CRS defines latitude first it is suggested to either turn all CRS in `ps8_crs` to URIs ''`.../def/crs/ESPG/0/<code>`''or to ''`EPSG:<code>`'' format (indeed only ESPG codes can be recognized to define latitude first: see this [browser:applications/petascope/src/main/db/petascope/update8/north_first_crss.sql table]); `CRS:1` should be left as is instead: it will be turned to an appropriate [IndexCrss index] CRS.
    252       * forall '''coverages''' in `ps8_coverage` do `migrate_coverage()`:
    253          * fetch GMLCOV coverage type from `ps8_coverage` fill in `ps9_coverage` table  .. GMLCOV types need to appear to be ''grid'' coverages (~"*Grid*") and must be a legal grid coverage value (see ''" GML coverage types"'' in [browser:applications/petascope/src/main/db/petascope/update8/populate.sql this] file), otherwise the coverage is not migrated;
    254          * `'application/x-octet-stream'` MIME type is assigned by default (`rasdaman` data source);
    255          * migrate optionally stored GMLCOV extra metadata annotations from `ps8_metadata` to `ps9_extra_metadata`;
    256          * build the native CRS for `ps9_domain_set` by concatenating the CRSs associated to each coverage axis in `ps8_crsset` and filling URIs in `ps9_crs` (then referenced by `ps9_domain_set`): axes with __type__ `'t'` (for ''temporal'', see `ps8_axistype`) are assigned the `OGC:AnsiDate` CRS URI (see time handling [wiki:PetascopeTimeHandling_8_5 prior to] and [PetascopeTimeHandling from] 9.0) while axes with `CRS:1` assignment are given an appropriate [IndexCrss index] CRS, depending on how many consecutive `CRS:1` axes are found (see `translate_crs()` function in the [browser:applications/petascope/src/main/db/petascope/update8/utilities.sh utilities]);
    257          * determine the '''origin''' of the coverage for `ps9_gridded_domain_set`: axis footprint/resolution is firstly recovered from coverage extent in `ps8_domain` and number of cells in `ps8_celldomain` -- `(dom_max-dom_min)/(cdom_max-cdom_min+1)`) -- then origin is placed in the upper-left corner (bottom of cube in case of 3D, and so on) in the centre of the sample space (e.g. pixel-centre); the order of the axes defined in the CRS is strictly followed, hence e.g. in case of [www.opengis.net/def/crs/EPSG/0/4326 WGS84] geographic CRS, latitudes will appear first in the tuple of coordinates (see [PetascopeSubsets this] page for a more thorough description of coverage geometries in v9.0);
    258             * for `CRS:1` indexed axis, the `+1` term in the denominator is removed: the domain is like the cell domain.
    259          * '''offset vectors''' are computed for `ps9_grid_axis` and `ps9_rectilinear_axis`: again the axis order of the CRS is kept into account and `CRS:1` indexed axes will not treat the domain extent as dense but indexed (no `+1` in the denominator of the resolution formula); as origin is now in the upper-left corner in the 2D horizontal space, the vector associated with northings will point South (negative norm);
    260          * the '''range set''' of the coverage is collected for `ps9_range_set` and `ps9_rasdaman_collection`: using the [http://www.postgresql.org/docs/current/static/dblink.html dblink] module, the OID of the collection marray is fetched from `RASBASE` (if more than 1 marray is found in the collection, then a warning will be thrown in the final log, and only the first will be selected: a coverage is now associated to one-and-only-one marray);
    261          * the '''range type''' is built for `ps9_range_type_component` and [PetascopeDevGuide SWE-related tables]: !UoMs/Quantities migrated in the first stage of migration (see above) are here linked to their correspondent coverages; data type from `ps8_datatype` is moved to `ps9_range_data_type`.
    262 
    263 A final report of coverage migrations is then printed: name and general information on coverage domain and range will be displayed for every migrated coverage. If some problem arose while migrating a coverage, an additional ''log'' column will describe what did not work. In case the mistake can be fixed (maybe changing the pre-upgrade database?), then one can just repeat the operation by reverting back (`--revert`) and re-migrating.
    264 
    265 
    266 
    267187== Limitations ==
    268188The current schema supports gridded and multipoint coverages. However, it is designed to be extensible to further types in the [http://www.schemacentral.com/sc/niem21/e-gml32_AbstractCoverage.html coverage hierarchy].