Version 35 (modified by dmisev, 2 months ago)

WCSTImport Guide

Description

WCSTImport is a utility application in the rasdaman software suite that allows importing of georeferenced datasets into a WCS service supporting the Transaction Extension (see wiki:WCSTGuide). Its primary functionality is allowing the ingestion of archives of georeferenced files. This utility introduces two concepts:

  • Recipe - A recipe is a class implementing BaseRecipe that, based on a set of parameters (ingredients), imports a set of files into WCS as a well-defined structure (image, regular timeseries, irregular timeseries, etc.)
  • Ingredients - An ingredients file is a JSON file containing the set of parameters that define how the recipe should behave (e.g. the WCS endpoint and the CRS resolver are both ingredients)

Dependencies

The glob2, dateutil and lxml Python packages are required by wcst_import. On Debian, the following commands should set them up (together with pip and the GDAL and magic bindings):

sudo apt-get install python-dateutil python-lxml python-pip python-gdal python-magic
sudo pip install glob2

Running

wcst_import.sh path/to/my_ingredient.json

A list of all possible ingredients can be found here: http://rasdaman.org/browser/applications/wcst_import/ingredients/possible_ingredients.json

Recipes

As of now, four recipes are in the codebase:

  • General Recipe
  • Mosaic Map
  • Regular Timeseries
  • Irregular Timeseries

For each of them there is an ingredients file under the ingredients folder, which contains an example of the available parameters. Below, you can find a description of each ingredients file:

REGULAR Tiling

You can set arbitrary tile sizes for the tiling option in the ingredients file only if the tiling name is ALIGNED. If you want to take advantage of the tile index (tiling name REGULAR; see http://rasdaman.org/wiki/Tiling for details on tiling), the tile size along each axis must be a divisor of the total number of points on that axis.

Example:

A coverage has the axes Lat with 4320 points and Long with 8640 points.

The tile sizes should then be divisors of these totals, for example 432 points and 864 points per tile:

"tiling": "REGULAR [0:0, 0:431, 0:863]"
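The divisor rule can be checked before writing the tiling option. The following is a minimal sketch, not part of wcst_import; the helper names tile_fits and divisors are hypothetical, and it assumes that tile intervals are inclusive (so 0:431 covers 432 points).

```python
def tile_fits(total_points, tile_interval):
    """Return True if the tile extent low:high divides total_points evenly."""
    low, high = tile_interval
    tile_size = high - low + 1  # assumption: intervals are inclusive
    return total_points % tile_size == 0


def divisors(total_points):
    """List every tile size that divides the axis without a remainder."""
    return [d for d in range(1, total_points + 1) if total_points % d == 0]


# Axis sizes taken from the example: Lat 4320 points, Long 8640 points
print(tile_fits(4320, (0, 431)))
print(tile_fits(8640, (0, 863)))
print(tile_fits(4320, (0, 499)))
```

The first two calls check the tile sizes from the example; the third checks a 500-point tile, which does not divide 4320 evenly.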

General Recipe

The most flexible choice: you can create complex coverages from any type of file. Details on how to use it are here: http://rasdaman.org/wiki/WCSTImportGuide/GeneralRecipe

Mosaic Map

NOTE: If you want to ingest data in netCDF/GRIB (e.g. 3D coverages from multiple 2D netCDF files), please use the General Recipe (http://rasdaman.org/wiki/WCSTImportGuide/GeneralRecipe) instead, as processing these files is more complex.

Well suited for importing a tiled map, not necessarily continuous: it will place all given input files under a single coverage and deal with their position in space. The parameters are explained below.

(Please note that the comment syntax "//comment explaining things" is not valid JSON, so remove the comments if you copy the parameters.)

{
  "config": {
//The endpoint of the WCS service with the WCST extension enabled
    "service_url": "http://localhost:8080/rasdaman/ows",
//A directory where to store the intermediate results
    "tmp_directory": "/tmp/",
//A link to the crs resolver to be used, best to use one that is frequently updated
    "crs_resolver": "http://opengis.net/def/",
//A default 2D crs to be used when the given files do not have one
    "default_crs": "http://opengis.net/def/OGC/0/Index2D",
//If set to true, it will print the WCST requests and will not execute them. To actually execute them set it to false
    "mock": true,
//If set to true, the process will not require any user confirmation, use with care, useful for production environments when deployment is automated
    "automated": false
  },
  "input": {
//The name of the coverage, if the coverage already exists, we will update it with the new files
    "coverage_id": "MyCoverage",
    "paths": [
//Any normal full (or relative to the ingredients file) path or regex that would work with the ls command. You can add as many as you wish, separated by commas
      "/var/data/*"
    ]
  },
  "recipe": {
//The name of the recipe
    "name": "map_mosaic",
    "options": {
//The tiling that you want to be done in rasdaman
      "tiling": "ALIGNED [0:500, 0:500]"
    }
  }
}
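Since the "//" comment lines make the example invalid JSON, they have to be removed before the file can be parsed. The sketch below shows one way to do that in Python; load_commented_ingredients is a hypothetical helper and not part of wcst_import. It only drops lines that start with "//", so the "//" inside URLs is left alone.

```python
import json


def load_commented_ingredients(text):
    """Parse ingredients text after dropping lines that start with //."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


sample = """{
  "config": {
//The endpoint of the WCS service with the WCST extension enabled
    "service_url": "http://localhost:8080/rasdaman/ows",
//If set to true, it will print the WCST requests and will not execute them
    "mock": true
  }
}"""

ingredients = load_commented_ingredients(sample)
print(ingredients["config"]["service_url"])
```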

Regular Timeseries

NOTE: If you want to ingest data in netCDF/GRIB (e.g. 3D coverages from multiple 2D netCDF files), please use the General Recipe (http://rasdaman.org/wiki/WCSTImportGuide/GeneralRecipe) instead, as processing these files is more complex.

Well suited for importing multiple 2-D slices created at regular intervals of time (e.g. sensor data, satellite imagery, etc.) as a 3-D cube with the third axis being a temporal one. The parameters are explained below.

(Please note that the comment syntax "//comment explaining things" is not valid JSON, so remove the comments if you copy the parameters.)

{
  "config": {
//The endpoint of the WCS service with the WCST extension enabled
    "service_url": "http://localhost:8080/rasdaman/ows",
//A directory where to store the intermediate results
    "tmp_directory": "/tmp/",
//A link to the crs resolver to be used, best to use one that is frequently updated
    "crs_resolver": "http://kahlua.eecs.jacobs-university.de:8080/def",
//A default 2D crs to be used when the given files do not have one
    "default_crs": "http://kahlua.eecs.jacobs-university.de:8080/def/OGC/0/Index2D",
//If set to true, it will print the WCST requests and will not execute them. To actually execute them set it to false
    "mock": true,
//If set to true, the process will not require any user confirmation, use with care, useful for production environments when deployment is automated
    "automated": false
  },
  "input": {
//The name of the coverage, if the coverage already exists, we will update it with the new files
    "coverage_id": "MyCoverage",
    "paths": [
//Any normal full (or relative to the ingredients file) path or regex that would work with the ls command. You can add as many as you wish, separated by commas
      "/var/data/*"
    ]
  },
  "recipe": {
//The name of the recipe
    "name": "time_series_regular",
    "options": {
//The starting date for the first slice
      "time_start": "2012-12-02T20:12:02",
//The format of the time provided above, auto will try to guess it, otherwise use any combination of YYYY:MM:DD HH:mm:ss
      "time_format": "auto",
//The crs to be used for the time axis
      "time_crs": "http://kahlua.eecs.jacobs-university.de:8080/def/crs/OGC/0/AnsiDate",
//The distance between each slice in time, granularity seconds to days
      "time_step": "2 days 10 minutes 3 seconds",
//The tiling that should be used for it
      "tiling": "ALIGNED [0:1000, 0:1000, 0:2]"
    }
  }
}
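To illustrate how time_start and time_step determine the position of each slice, here is a minimal sketch that computes the timestamps of the first slices. It is not the wcst_import implementation: parse_time_step and slice_times are hypothetical helpers, and the sketch assumes the step is written as "<number> <unit>" pairs with the units days, hours, minutes and seconds.

```python
import re
from datetime import datetime, timedelta

UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def parse_time_step(text):
    """Turn e.g. '2 days 10 minutes 3 seconds' into a timedelta."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(day|hour|minute|second)s?", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return timedelta(seconds=total)


def slice_times(time_start, time_step, count):
    """Return the timestamps of the first `count` slices."""
    start = datetime.strptime(time_start, "%Y-%m-%dT%H:%M:%S")
    step = parse_time_step(time_step)
    return [start + i * step for i in range(count)]


for t in slice_times("2012-12-02T20:12:02", "2 days 10 minutes 3 seconds", 3):
    print(t.isoformat())
```

With the values from the example, the second slice lands at 2012-12-04T20:22:05 and the third at 2012-12-06T20:32:08.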

Irregular timeseries

NOTE: If you want to ingest data in netCDF/GRIB (e.g. 3D coverages from multiple 2D netCDF files), please use the General Recipe (http://rasdaman.org/wiki/WCSTImportGuide/GeneralRecipe) instead, as processing these files is more complex.

Well suited for importing multiple 2-D slices created at irregular intervals of time that are known at import time into a 3-D cube with the third axis being a temporal one. The parameters are explained below.

(Please note that the comment syntax "//comment explaining things" is not valid JSON, so remove the comments if you copy the parameters.)

NOTE: The irregular timeseries recipe supports two kinds of time parameter in "options"; choose the one that best fits your case.

{
  "config": {
//The endpoint of the WCS service with the WCST extension enabled
    "service_url": "http://localhost:8080/rasdaman/ows",
//A directory where to store the intermediate results
    "tmp_directory": "/tmp/",
//A link to the crs resolver to be used, best to use one that is frequently updated
    "crs_resolver": "http://opengis.net/def/",
//A default 2D crs to be used when the given files do not have one
    "default_crs": "http://opengis.net/def/OGC/0/Index2D",
//If set to true, it will print the WCST requests and will not execute them. To actually execute them set it to false
    "mock": true,
//If set to true, the process will not require any user confirmation, use with care, useful for production environments when deployment is automated
    "automated": false
  },
  "input": {
//The name of the coverage, if the coverage already exists, we will update it with the new files
    "coverage_id": "MyCoverage",
    "paths": [
//Any normal full (or relative to the ingredients file) path or regex that would work with the ls command. You can add as many as you wish, separated by commas
      "/var/data/*"
    ]
  },
  "recipe": {
//The name of the recipe
    "name": "time_series_irregular",
    "options": {
//Information about the time parameter; two options are possible, choose only one of them
      "time_parameter": {
//Get the date for the slice from a tag that can be read by GDAL
        "metadata_tag": {
//The name of such a tag
          "tag_name": "TIFFTAG_DATETIME"
        },
//The format of the datetime value in the tag
        "datetime_format": "YYYY:MM:DD HH:mm:ss"
      },
      "time_parameter" :{
//Another option to extract the time. Use only one of the two!
        "filename": {
//The regex has to contain groups of tokens, separated by parentheses. The group parameter specifies which regex group to use for retrieving the time value
          "regex": "(.*)_(.*)_(.+?)_(.*)",
          "group": "2"
        }
      },
//The crs of the time axis
      "time_crs": "http://opengis.net/def/crs/OGC/0/AnsiDate",
//The tiling to be used
      "tiling": "ALIGNED [0:10, 0:1000, 0:500]"
    }
  }
}
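The "filename" option is easier to follow with a concrete file name. The sketch below applies the regex and group from the example using Python's re module; time_from_filename and the file name are made up for illustration, and the way wcst_import itself applies the regex may differ.

```python
import re


def time_from_filename(filename, regex, group):
    """Return the text captured by the given regex group of the file name."""
    match = re.match(regex, filename)
    if match is None:
        raise ValueError("%s does not match %s" % (filename, regex))
    # The ingredients file stores the group number as a string
    return match.group(int(group))


# Hypothetical file name with four underscore-separated tokens
print(time_from_filename("modis_2012-12-02_band1_v2.tif",
                         "(.*)_(.*)_(.+?)_(.*)", "2"))
```

Here the second group captures "2012-12-02", which is the value that would then be interpreted as the time of the slice.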

All possible ingredients

{
  "__comment__": [
    "Each possible parameter for every recipe is commented in this file. As JSON does not support comments above",
    "each field, a __comment__ field is placed that explains the semantics of the field below it.",
    "In some cases, a parameter might have different possible values (e.g. recipe). In this case, the field for the",
    "parameter will be doubled.",
    "This file is considered a developer documentation that gives an overview over all possible ingredients.",
    "Refer to the user documentation at http://rasdaman.org or to the individual files for more documentation."
  ],
  "config": {
    "__comment__": "The base url to the WCST service, i.e. not including ?service=WCS&acceptversion=2.0.0",
    "service_url": "http://localhost:8080/rasdaman/ows",
    "__comment__": "Temporary directory in which to create gml and data files, should be readable and writable by both rasdaman, petascope and current user",
    "tmp_directory": "/tmp/",
    "__comment__": "The crs resolver to use for generating the gml",
    "crs_resolver": "http://opengis.net/def/",
    "__comment__": "The default crs to be used for gdal files that do not have it",
    "default_crs": "http://opengis.net/def/def/crs/OGC/0/Index2D",
    "__comment__": "[OPTIONAL] If mock parameter is true then the wcst requests are printed to stdout and not executed",
    "mock": false,
    "__comment__": "[OPTIONAL] Set to true if no human input should be requested and everything should be completely automated",
    "automated": false,
    "__comment__": "[OPTIONAL] This parameter adds default null values for bands that *DO NOT* have a null value provided by the file itself. The value for this parameter should be an array containing the desired null value in rasdaman format for each band. E.g. for a coverage with 3 bands:",
    "default_null_values": [
      "9995:9999",
      "-9, -10, -87",
      "4"
    ],
    "__comment__": "[OPTIONAL] In case the files are exposed via a web-server and not locally, you can add the root url here, otherwise the default is listed below",
    "url_root": "file://",
    "__comment__": "[OPTIONAL] In some cases the resolution is small enough to affect the precision of the transformation from domain coordinates to grid coordinates. To allow for corrections that will make the import possible, set this parameter to true.",
    "subset_correction": false,
    "__comment__": "[OPTIONAL] If set to true, it will skip files that were not imported and move to the next ones.",
    "skip": false,
    "__comment__": "[OPTIONAL] If a WCST request fails it will be retried a number of times before an error is thrown",
    "retry": true,
    "__comment__": "[OPTIONAL] Number of retries to be attempted.",
    "retries": 5,
    "__comment__": "[OPTIONAL] The number of seconds to wait before retrying after an error. You can also specify a floating number to represent subdivisions of seconds.",
    "retry_sleep": 1,
    "__comment__": "[OPTIONAL] Limit the slices that are imported to the ones that fit in the bounding box below. Each subset in the bounding box should be of form {low:0,high:100} in the format of the axis.",
    "slice_restriction": [
      {
        "low": 0,
        "high": 36000
      },
      {
        "low": 0,
        "high": 18000
      },
      {
        "low": "2012-02-09",
        "high": "2012-12-09T14:20",
        "type": "date"
      }
    ],
    "__comment__" : "[OPTIONAL] The directory in which to store the resumer file. By default, it will be stored in the same folder as the ingredients file.",
    "resumer_dir_path" : "/var/geodata/resumer/",
    "__comment__" : "[OPTIONAL] The number of slices to show in the description.",
    "description_max_no_slices" : 42,
    "__comment__" : "[OPTIONAL] Allow files to be tracked in order to not reimport files that were already ingested",
    "track_files" : true
  },
  "input": {
    "__comment__": "The id of the coverage. If it already exists, we will consider this operation an update",
    "coverage_id": "MyCoverage",
    "__comment__": "The input paths to take into consideration. A path can be a single file or a unix file regex.",
    "paths": [
      "/var/data/test_1.tif",
      "/var/data/dir/*"
    ]
  },
  "recipe": {
    "__comment__": "The recipe name",
    "name": "map_mosaic",
    "__comment__": "A list of options required by the recipe",
    "options": {
      "__comment__": "[OPTIONAL]The tiling of the coverage in rasql format",
      "tiling": "ALIGNED [0:500, 0:500]",
      "__comment__": "[OPTIONAL] If you want to import in wms as well set this variable to true",
      "wms_import": true,
      "__comment__": "[OPTIONAL] Specify the names of the bands, in cases the automatic inference (default: field_1, ...) is not good enough",
      "band_names": [
        "red",
        "green",
        "blue"
      ]
    }
  },
  "recipe": {
    "__comment__": "This recipe should be used to extract a large coverage from an existing WCS service",
    "name": "wcs_extract",
    "options": {
      "__comment__": "The coverage to be imported",
      "coverage_id": "SomeOtherCoverage",
      "__comment__": "The endpoint of the WCS where the coverage resides",
      "wcs_endpoint": "http://example.org/rasdaman/ows",
      "__comment__": "A partitioning scheme to be used. For each grid axis specify the maximum number of pixels that should be retrieved. The system uses this as a hint and can generate different partitioning schemes depending on the coverage structure",
      "partitioning_scheme": [
        4000,
        4000,
        1
      ],
      "__comment__": "[OPTIONAL]The tiling of the coverage in rasql format",
      "tiling": "ALIGNED [0:4000, 0:4000, 4]",
      "__comment__": "[OPTIONAL] If you want to import in wms as well set this variable to true",
      "wms_import": true
    }
  },
  "recipe": {
    "__comment__": "The recipe name",
    "name": "time_series_regular",
    "__comment__": "A list of options required by the recipe",
    "options": {
      "__comment__": "The origin of the timeseries",
      "time_start": "2012-12-02T20:12:02",
      "__comment__": "The datetime format of the parameter above. Auto will try to guess it, any other datetime format is accepted",
      "time_format": "auto",
      "__comment__": "The time crs to be used with the 2d crs to create a compound crs for the whole coverage",
      "time_crs": "http://192.168.0.103:8080/def/crs/OGC/0/AnsiDate",
      "__comment__": "The time step between two slices, expressed in days, hours, minutes and seconds",
      "time_step": "2 days 10 minutes 3 seconds",
      "__comment__": "[OPTIONAL]The tiling of the coverage in rasql format",
      "tiling": "ALIGNED [0:1000, 0:1000, 0:2]",
      "__comment__": "[OPTIONAL] Specify the names of the bands, in cases the automatic inference (default: field_1, ...) is not good enough",
      "band_names": [
        "red",
        "green",
        "blue"
      ]
    }
  },
  "recipe": {
    "__comment__": "The recipe name",
    "name": "time_series_irregular",
    "__comment__": "A list of options required by the recipe",
    "options": {
      "__comment__": "The time parameter describes to the recipe how to extract the datetime. Two options possible: metadata_tag OR filename",
      "time_parameter": {
        "metadata_tag": {
          "__comment__": "The name of the tag in the gdal file, the default is the one below",
          "tag_name": "TIFFTAG_DATETIME"
        },
        "filename": {
          "__comment__": "The regex has to contain groups of tokens, separated by parentheses. The group parameter specifies which regex group to use for retrieving the time value",
          "regex": "(.*)_(.*)_(.+?)_(.*)",
          "group": "2"
        },
        "__comment__": "The format of the value of the time parameter: 'auto' will try to guess it",
        "datetime_format": "YYYY:MM:DD HH:mm:ss"
      },
      "__comment__": "The time crs to be used with the 2d crs to create a compound crs for the whole coverage",
      "time_crs": "http://kahlua.eecs.jacobs-university.de:8080/def/crs/OGC/0/AnsiDate",
      "__comment__": "[OPTIONAL]The tiling of the coverage in rasql format",
      "tiling": "ALIGNED [0:10, 0:1000, 0:500]",
      "__comment__": "[OPTIONAL] Specify the names of the bands, in cases the automatic inference (default: field_1, ...) is not good enough",
      "band_names": [
        "red",
        "green",
        "blue"
      ]
    }
  }
}
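Note that the overview above repeats keys such as "recipe" and "__comment__", so it serves as documentation and should not be used directly as an ingredients file. The sketch below shows what Python's json module does with repeated keys (the last one wins) and how duplicates can be detected with object_pairs_hook; reject_duplicates is a hypothetical helper, not part of wcst_import.

```python
import json

doc = """{
  "recipe": {"name": "map_mosaic"},
  "recipe": {"name": "time_series_regular"}
}"""

# With the default settings the last occurrence of a key silently wins
parsed = json.loads(doc)
print(parsed["recipe"]["name"])


def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError("duplicate keys: %s" % ", ".join(dupes))
    return dict(pairs)


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print(error)
```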

Creating your own recipe

The recipes above cover a frequent but limited subset of what can be modelled using a coverage. WCSTImport allows you to define your own recipes in order to fill these gaps. In this tutorial we will create a recipe that constructs a 3D coverage from 2D georeferenced files. The 2D files that we want to target all have the same CRS and cover the same geographic area. The time information that we want to retrieve is stored in each file in a GDAL-readable tag. The tag name and time format differ from dataset to dataset, so we want to take this information as an option to the recipe. We also want to be flexible about the time crs, so we will add it as an option as well.

Based on this use case, the following ingredients file seems to fulfill our needs:

{
  "config": {
    "service_url": "http://localhost:8080/rasdaman/ows",
    "tmp_directory": "/tmp/",
    "crs_resolver": "http://localhost:8080/def/",
    "default_crs": "http://localhost:8080/def/def/crs/OGC/0/Index2D",
    "mock": false,
    "automated": false
  },
  "input": {
    "coverage_id": "MyCoverage",
    "paths": [
      "/var/data/*"
    ]
  },
  "recipe": {
    "name": "my_custom_recipe",
    "options": {
      "time_format": "auto",
      "time_crs": "http://localhost:8080/def/crs/OGC/0/AnsiDate",
      "time_tag": "MY_SPECIAL_TIME_TAG"
    }
  }
}

Now let's create our own custom recipe. To create a new recipe start by creating a new folder in the recipes folder. Let's call our recipe my_custom_recipe:

cd $RMANHOME/share/rasdaman/wcst_import/recipes/
mkdir my_custom_recipe
cd my_custom_recipe
touch __init__.py

The last command tells Python that this folder contains Python sources; if you forget it, your recipe will not be detected automatically. The ingredients file above already gives a feeling for what the recipe will deal with: it requests three options from the user (time_format, time_crs and time_tag). Let's now create our recipe by creating a file called recipe.py:

touch recipe.py
editor recipe.py

Use your favorite editor or IDE to work on the recipe (there are type annotations for most WCSTImport classes, so an IDE like PyCharm gives completion support out of the box). First, let's add the skeleton of the recipe. Please note that in this tutorial we omit the import section of the files; your IDE will help you auto-import the classes:

class Recipe(BaseRecipe):
    def __init__(self, session):
        """
        The recipe class for my_custom_recipe. To get an overview of the ingredients needed for this
        recipe check ingredients/my_custom_recipe
        :param Session session: the session for the import run
        """
        super(Recipe, self).__init__(session)
        self.options = session.get_recipe()['options']        

    def validate(self):
        super(Recipe, self).validate()
        pass

    def describe(self):
        """
        Implementation of the base recipe describe method
        """
        pass

    def ingest(self):
        """
        Ingests the input files
        """
        pass

    def status(self):
        """
        Implementation of the status method
        :rtype (int, int)
        """
        pass

    @staticmethod
    def get_name():
        return "my_custom_recipe"

The first thing you need to do is make sure the get_name() method returns the name of your recipe. This name is used to determine whether an ingredients file should be processed by your recipe. Next, focus on the constructor. It gets a single parameter called session, which contains all the information collected from the user plus a couple more useful things. You can check all the available methods of the class in the session.py file; for now we just save the options provided by the user, available in session.get_recipe()['options'], in a class attribute.
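As an illustration of how a name returned by get_name() can be matched against the "name" field of an ingredients file, here is a minimal sketch. It is not the WCSTImport lookup code: the two stand-in classes and the find_recipe helper are hypothetical and only mirror the idea described above.

```python
class MosaicRecipe(object):
    """Stand-in recipe class used only for this illustration."""

    @staticmethod
    def get_name():
        return "map_mosaic"


class MyCustomRecipe(object):
    """Stand-in for the recipe built in this tutorial."""

    @staticmethod
    def get_name():
        return "my_custom_recipe"


def find_recipe(recipe_classes, ingredients):
    """Pick the class whose get_name() equals the recipe name in the ingredients."""
    wanted = ingredients["recipe"]["name"]
    for cls in recipe_classes:
        if cls.get_name() == wanted:
            return cls
    raise LookupError("No recipe named %s" % wanted)


ingredients = {"recipe": {"name": "my_custom_recipe", "options": {}}}
print(find_recipe([MosaicRecipe, MyCustomRecipe], ingredients).__name__)
```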

Next, let's look at the validate method. In this method you validate the recipe options provided by the user. It is generally a good idea to call the super method, which validates general things such as the availability of the WCST service, although this is not mandatory. We also want to validate our custom recipe options here. This is what the recipe looks like now:

class Recipe(BaseRecipe):
    def __init__(self, session):
        """
        The recipe class for my_custom_recipe. To get an overview of the ingredients needed for this
        recipe check ingredients/my_custom_recipe
        :param Session session: the session for the import run
        """
        super(Recipe, self).__init__(session)
        self.options = session.get_recipe()['options']        

    def validate(self):
        super(Recipe, self).validate()
        if "time_crs" not in self.options or self.options['time_crs'] == "":
            raise RecipeValidationException("No valid time crs provided")

        if 'time_tag' not in self.options:
            raise RecipeValidationException("No valid time tag parameter provided")

        if 'time_format' not in self.options:
            raise RecipeValidationException("You have to provide a valid time format")

    def describe(self):
        """
        Implementation of the base recipe describe method
        """
        pass

    def ingest(self):
        """
        Ingests the input files
        """
        pass

    def status(self):
        """
        Implementation of the status method
        :rtype (int, int)
        """
        pass

    @staticmethod
    def get_name():
        return "my_custom_recipe"
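The three checks in validate() follow the same pattern, so they could also be expressed as a loop over the required option names. The sketch below is a self-contained variant under stated assumptions: RecipeValidationException is redefined here as a stand-in for the WCSTImport exception, validate_options is a hypothetical helper, and unlike the code above it rejects empty values for all three options.

```python
class RecipeValidationException(Exception):
    """Stand-in for the WCSTImport exception of the same name."""


REQUIRED_OPTIONS = ("time_crs", "time_tag", "time_format")


def validate_options(options):
    """Raise if any required recipe option is missing or empty."""
    for key in REQUIRED_OPTIONS:
        if options.get(key) in (None, ""):
            raise RecipeValidationException("Missing or empty option: " + key)


good = {
    "time_format": "auto",
    "time_crs": "http://localhost:8080/def/crs/OGC/0/AnsiDate",
    "time_tag": "MY_SPECIAL_TIME_TAG",
}
validate_options(good)  # passes silently

try:
    validate_options({"time_format": "auto", "time_crs": ""})
except RecipeValidationException as error:
    print(error)
```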

Now that our recipe can validate the recipe options, let's move to the describe method. This method lets you tell your users any relevant information about the ingestion before it actually starts. The irregular timeseries recipe, for example, prints the timestamps of the first couple of slices so that the user can check whether they are correct. Your recipe should provide similar information, based on what it has to do.
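A description along these lines could be assembled as follows. This is only a sketch: describe_slices is a hypothetical helper, the timestamps are passed in as plain strings, and the wording of the output is made up for illustration.

```python
def describe_slices(timestamps, max_shown=5):
    """Build a short summary listing the first few slice timestamps."""
    lines = ["The recipe will import %d slices." % len(timestamps)]
    for ts in timestamps[:max_shown]:
        lines.append("  slice at " + ts)
    if len(timestamps) > max_shown:
        lines.append("  ... and %d more" % (len(timestamps) - max_shown))
    return "\n".join(lines)


print(describe_slices(
    ["2012-12-02T20:12:02", "2012-12-04T20:22:05", "2012-12-06T20:32:08"],
    max_shown=2))
```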

Next, we should define the ingest behaviour. The framework makes no assumptions about the correct method of ingesting, but it offers a lot of utility functionality that helps you do it in a more standardized way. We will continue this tutorial by describing how to take advantage of this functionality; note, however, that using it is not required for the recipe to work. The first thing you need to do is define an importer object. This importer object takes a coverage object and ingests it using WCST requests. The object has two public methods: ingest, which ingests the coverage into the WCST service (ingest is an insert operation when the coverage does not exist yet, and an update when it does; the importer handles both cases, so you do not have to check whether the coverage already exists), and get_progress, which returns a tuple containing the number of imported slices and the total number of slices. After adding the importer, the code should look like this:

class Recipe(BaseRecipe):
    def __init__(self, session):
        """
        The recipe class for my_custom_recipe. To get an overview of the ingredients needed for this
        recipe check ingredients/my_custom_recipe
        :param Session session: the session for the import run
        """
        super(Recipe, self).__init__(session)
        self.options = session.get_recipe()['options']        
        self.importer = None

    def validate(self):
        super(Recipe, self).validate()
        if "time_crs" not in self.options or self.options['time_crs'] == "":
            raise RecipeValidationException("No valid time crs provided")

        if 'time_tag' not in self.options:
            raise RecipeValidationException("No valid time tag parameter provided")

        if 'time_format' not in self.options:
            raise RecipeValidationException("You have to provide a valid time format")

    def describe(self):
        """
        Implementation of the base recipe describe method
        """
        pass

    def ingest(self):
        """
        Ingests the input files
        """
        self._get_importer().ingest()

    def status(self):
        """
        Implementation of the status method
        :rtype (int, int)
        """
        pass

    def _get_importer(self):
        if self.importer is None:
            self.importer = Importer(self._get_coverage())
        return self.importer

    def _get_coverage(self):
        pass

    @staticmethod
    def get_name():
        return "my_custom_recipe"
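To make the importer contract more concrete, the sketch below models the two public methods with a self-contained stand-in. It is not the WCSTImport Importer: StandInImporter, its constructor arguments and the per-slice insert/update bookkeeping are assumptions made for illustration, and no WCST requests are sent.

```python
class StandInImporter(object):
    """Hypothetical stand-in; not the WCSTImport Importer class."""

    def __init__(self, coverage_id, slices, existing_coverages):
        self.coverage_id = coverage_id
        self.slices = slices
        self.existing_coverages = existing_coverages
        self.operations = []

    def ingest(self):
        """Record an insert for a new coverage and updates afterwards."""
        for _ in self.slices:
            if self.coverage_id in self.existing_coverages:
                self.operations.append("update")
            else:
                # The first slice creates the coverage
                self.operations.append("insert")
                self.existing_coverages.add(self.coverage_id)

    def get_progress(self):
        """Return (number of imported slices, total number of slices)."""
        return (len(self.operations), len(self.slices))


importer = StandInImporter("MyCoverage", ["a.tif", "b.tif", "c.tif"], set())
print(importer.get_progress())
importer.ingest()
print(importer.operations)
print(importer.get_progress())
```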

In order to build the importer, we need to create a coverage object. Let's see how we can do that. The coverage constructor requires:

  • coverage_id: the id of the coverage
  • slices: a list of slices that compose the coverage. Each slice defines the position in the coverage and the data that should be defined at the specified position
  • range_fields: the range fields for the coverage
  • crs: the crs of the coverage
  • pixel_data_type: the type of the pixel in gdal format, e.g. Byte, Float32 etc

You can construct the coverage object in many ways; here we present one specific method. Let's start with the crs of the coverage. For our recipe we want a 3D crs, composed of the crs of the 2D images and the time crs given in the options. The following two lines of code give us exactly this:

    # Get the crs of one of the images using a GDAL helper class. We are assuming all images have the same CRS
    gdal_dataset = GDALGmlUtil(self.session.get_files()[0].get_filepath())
    # Get the crs of the coverage by compounding the two crses
    crs = CRSUtil.get_compound_crs([gdal_dataset.get_crs(), self.options['time_crs']])  

Let's also get the range fields for this coverage. We can extract them, again from the 2D image, using a helper class that uses GDAL to get the relevant information:

  fields = GdalRangeFieldsGenerator(gdal_dataset).get_range_fields()

Let's also get the pixel base type, again using the gdal helper:

  pixel_type = gdal_dataset.get_band_gdal_type()

Let's see what we have so far:

class Recipe(BaseRecipe):
    def __init__(self, session):
        """
        The recipe class for my_custom_recipe. To get an overview of the ingredients needed for this
        recipe check ingredients/my_custom_recipe
        :param Session session: the session for the import run
        """
        super(Recipe, self).__init__(session)
        self.options = session.get_recipe()['options']        
        self.importer = None

    def validate(self):
        super(Recipe, self).validate()
        if "time_crs" not in self.options or self.options['time_crs'] == "":
            raise RecipeValidationException("No valid time crs provided")

        if 'time_tag' not in self.options:
            raise RecipeValidationException("No valid time tag parameter provided")

        if 'time_format' not in self.options:
            raise RecipeValidationException("You have to provide a valid time format")

    def describe(self):
        """
        Implementation of the base recipe describe method
        """
        pass

    def ingest(self):
        """
        Ingests the input files
        """
        self._get_importer().ingest()

    def status(self):
        """
        Implementation of the status method
        :rtype (int, int)
        """
        pass

    def _get_importer(self):
      if self.importer is None:
        self.importer = Importer(self._get_coverage())
      return self.importer

    def _get_coverage(self):
      # Get the crs of one of the images using a GDAL helper class. We are assuming all images have the same CRS
      gdal_dataset = GDALGmlUtil(self.session.get_files()[0].get_filepath())
      # Get the crs of the coverage by compounding the two crses
      crs = CRSUtil.get_compound_crs([gdal_dataset.get_crs(), self.options['time_crs']])  
      fields = GdalRangeFieldsGenerator(gdal_dataset).get_range_fields()      
      pixel_type = gdal_dataset.get_band_gdal_type()
      coverage_id = self.session.get_coverage_id()
      slices = self._get_slices(crs)
      return Coverage(coverage_id, slices, fields, crs, pixel_type)

    def _get_slices(self, crs):
      pass

    @staticmethod
    def get_name():
        return "my_custom_recipe"

As you can see, the only thing left to do is to implement the _get_slices() method. To do so, we need to iterate over all the input files and create a slice for each. Here is an example of how we could do that:

def _get_slices(self, crs):
  # Let's first extract all the axes from our crs
  crs_axes = CRSUtil(crs).get_axes()
  # Prepare a list container for our slices
  slices = []
  # Iterate over the files and create a slice for each one
  for infile in self.session.get_files():
      # We need to create the exact position in time and space in which to place this slice
      # For the space coordinates we can use the GDAL helper to extract it for us
      # The helper will return a list of subsets based on the crs axes that we extracted
      # and will fill the coordinates for the ones that it can (the easting and northing axes)
      subsets = GdalAxisFiller(crs_axes, GDALGmlUtil(infile.get_filepath())).fill()
      # Now we must fill the time axis as well and indicate the position in time
      for subset in subsets:
        # Find the time axis
        if subset.coverage_axis.axis.crs_axis.is_future():
          # Set the time position for it. Our recipe extracts it from a GDAL tag provided by the user
          subset.interval.low = GDALGmlUtil(infile.get_filepath()).get_datetime(self.options["time_tag"])
      slices.append(Slice(subsets, FileDataProvider(infile)))
  return slices
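The value read from the time tag is a string that has to be interpreted as a date. As a rough illustration of that step, the sketch below parses a value in the "YYYY:MM:DD HH:mm:ss" layout used by TIFFTAG_DATETIME with Python's datetime module. parse_tag_datetime is a hypothetical helper, and the mapping of that layout to the strptime pattern is an assumption; the get_datetime helper used above may work differently.

```python
from datetime import datetime


def parse_tag_datetime(value):
    """Parse a 'YYYY:MM:DD HH:mm:ss' tag value into a datetime."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")


print(parse_tag_datetime("2012:12:02 20:12:02").isoformat())
```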

And we are done: we now have a valid coverage object. The last thing needed is to define the status method. This method needs to provide a status update to the framework so that it can be displayed to the user. We need to return the number of finished work items and the total number of work items. In our case we can measure this in terms of slices, and the importer can already provide this for us. So all we need to do is the following:

def status(self):
    return self._get_importer().get_progress()  
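
To illustrate what such a status update could look like, here is a minimal sketch with a stand-in importer. FakeImporter and ingest_next are hypothetical and not part of wcst_import, and returning the progress as a (finished, total) tuple is an assumption based on the description above, not a confirmed detail of the real Importer.

```python
class FakeImporter:
    """Stand-in for the real importer; counts processed slices only."""

    def __init__(self, total_slices):
        self.total = total_slices
        self.processed = 0

    def ingest_next(self):
        # Pretend to ingest one slice, never going past the total
        if self.processed < self.total:
            self.processed += 1

    def get_progress(self):
        # Assumed shape: (finished work items, total work items)
        return self.processed, self.total


importer = FakeImporter(4)
importer.ingest_next()
importer.ingest_next()
finished, total = importer.get_progress()
percent = 100.0 * finished / total
```

With both numbers available, the caller can derive whatever display it prefers, such as the percentage computed in the last line.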

Putting all the pieces together, the complete recipe looks like this:

class Recipe(BaseRecipe):
    def __init__(self, session):
        """
        The recipe class for my_custom_recipe. To get an overview of the ingredients needed for this
        recipe check ingredients/my_custom_recipe
        :param Session session: the session for the import run
        """
        super(Recipe, self).__init__(session)
        self.options = session.get_recipe()['options']        
        self.importer = None

    def validate(self):
        super(Recipe, self).validate()
        if "time_crs" not in self.options or self.options['time_crs'] == "":
            raise RecipeValidationException("No valid time crs provided")

        if 'time_tag' not in self.options:
            raise RecipeValidationException("No valid time tag parameter provided")

        if 'time_format' not in self.options:
            raise RecipeValidationException("You have to provide a valid time format")

    def describe(self):
        """
        Implementation of the base recipe describe method
        """
        return "This is some description."

    def ingest(self):
        """
        Ingests the input files
        """
        self._get_importer().ingest()

    def status(self):
        return self._get_importer().get_progress()  

    def _get_importer(self):
      if self.importer is None:
        self.importer = Importer(self._get_coverage())
      return self.importer

    def _get_coverage(self):
      # Get the crs of one of the images using a GDAL helper class. We are assuming all images have the same CRS
      gdal_dataset = GDALGmlUtil(self.session.get_files()[0].get_filepath())
      # Get the crs of the coverage by compounding the spatial crs with the time crs
      crs = CRSUtil.get_compound_crs([gdal_dataset.get_crs(), self.options['time_crs']])  
      fields = GdalRangeFieldsGenerator(gdal_dataset).get_range_fields()      
      pixel_type = gdal_dataset.get_band_gdal_type()
      coverage_id = self.session.get_coverage_id()
      slices = self._get_slices(crs)
      return Coverage(coverage_id, slices, fields, crs, pixel_type)

    def _get_slices(self, crs):
      # Let's first extract all the axes from our crs
      crs_axes = CRSUtil(crs).get_axes()
      # Prepare a list container for our slices
      slices = []
      # Iterate over the files and create a slice for each one
      for infile in self.session.get_files():
          # We need to create the exact position in time and space in which to place this slice
          # For the space coordinates we can use the GDAL helper to extract it for us
          # The helper will return a list of subsets based on the crs axes that we extracted
          # and will fill the coordinates for the ones that it can (the easting and northing axes)
          subsets = GdalAxisFiller(crs_axes, GDALGmlUtil(infile.get_filepath())).fill()
          # Now we must fill the time axis as well and indicate the position in time
          for subset in subsets:
            # Find the time axis
            if subset.coverage_axis.axis.crs_axis.is_future():
              # Set the time position for it. Our recipe extracts it from a GDAL tag provided by the user
              subset.interval.low = GDALGmlUtil(infile.get_filepath()).get_datetime(self.options["time_tag"])
          # Attach the data of the current file to the slice
          slices.append(Slice(subsets, FileDataProvider(infile)))
      return slices

    @staticmethod
    def get_name():
        return "my_custom_recipe"

We now have a functional recipe. You can try the ingredients file against it and see how it works.
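
Before running an import it can be handy to check the recipe options on their own. The sketch below mirrors the three checks from validate() as a plain function. check_options is a hypothetical helper that is not part of wcst_import, and the option values in the example are placeholders only.

```python
def check_options(options):
    """Return a list of problems found in the recipe options (empty if none)."""
    problems = []
    # A missing or empty time_crs is rejected (None is treated as missing too)
    if not options.get("time_crs"):
        problems.append("No valid time crs provided")
    if "time_tag" not in options:
        problems.append("No valid time tag parameter provided")
    if "time_format" not in options:
        problems.append("You have to provide a valid time format")
    return problems


# Placeholder values, for illustration only
complete = {
    "time_crs": "<url-of-the-time-crs>",
    "time_tag": "<name-of-the-gdal-tag>",
    "time_format": "<datetime-format>",
}
problems_for_complete = check_options(complete)
problems_for_empty = check_options({})
```

Collecting all problems in a list instead of raising on the first one is a design choice of this sketch; the recipe itself raises a RecipeValidationException as soon as one check fails.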

FAQ

Why is the subset_correction needed?

The subset_correction feature is needed when there is a discrepancy between the precision of GDAL (or other tools used to create the GML coverages) and the precision of petascope. Take the coordinate 0.33333333 as an example: GDAL may represent it as 0.3333333374353 because of the loss of precision in floating-point values, whereas petascope uses BigDecimal, which keeps the precision up to 50 decimals. In arithmetic operations, the difference between the two leads to inconsistencies that cannot be detected automatically. WCSTImport tries to help by giving users the option to ingest the coverage with a small number (<= offset_vector / 2) added to the coordinates in order to align the subsets correctly. This is not guaranteed to work, however, for example if your extent is closer to the next geopixel than 1/2 of the offset vector. In these cases it is left to the user to deal with the inconsistency (e.g. by manually creating the coverage).
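
The effect can be reproduced with a few lines of Python. This is a minimal sketch and not the algorithm wcst_import actually uses: snap_to_grid is a hypothetical helper, Python's Decimal stands in for BigDecimal, and the origin, offset and coordinate values are made up for illustration.

```python
from decimal import Decimal

# A binary float cannot store most decimal fractions exactly, while a decimal
# type keeps the digits exactly as written
as_float = Decimal(0.1)      # exact value of the binary double nearest to 0.1
as_decimal = Decimal("0.1")  # exactly 0.1
representation_error = as_float - as_decimal


def snap_to_grid(coord, origin, offset):
    """Shift coord to the nearest grid position origin + k * offset."""
    # Number of whole offsets between origin and coord, rounded to nearest
    k = ((coord - origin) / offset).to_integral_value()
    snapped = origin + k * offset
    return snapped, snapped - coord


origin = Decimal("0")
offset = Decimal("0.25")
reported = Decimal("0.7500000374")  # a coordinate carrying a precision artifact
snapped, correction = snap_to_grid(reported, origin, offset)
```

The correction applied by such a snapping step never exceeds half of the offset in magnitude, which matches the bound quoted above. It also shows why the approach can fail: a coordinate that lies close to the midpoint between two grid positions may be moved to the wrong one.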
