#505 closed defect (fixed)
Performance leak for timestamp to numeric coordinate conversion
Reported by: | Piero Campalani | Owned by: | Piero Campalani |
---|---|---|---|
Priority: | blocker | Milestone: | Future |
Component: | petascope | Version: | 8.5 |
Keywords: | time conversion performance | Cc: | Dimitar Misev, Peter Baumann |
Complexity: | Medium |
Description
When a WCS GetCoverage requests specifies temporal subsettings with ISO8601 notation (which shall be within double-quotes), the values are converted to the proper numerical equivalent a priori, based on the temporal CRS associated with the coverage.
Currently the algorithm is based on a simple increase of time units from the origin of the t-CRS, but this can be very expensive when the temporal resolution is much smaller with respect to the origin-subset range, e.g. with Unix time CRS.
An optimal algorithm should be implemented, which flexibly adjusts to different time scales and steps.
Change History (11)
follow-up: 2 comment:1 by , 11 years ago
comment:2 by , 11 years ago
Replying to pcampalani:
Issues:
scalar*DateTime
operation is not implemented in the Date4J library, and needs to be manually done, taking care of all overflows. http://www.date4j.net/javadoc/index.html
Maybe there are some alternative library that can handle that? Unless we are deeply involved with Date4J already.
comment:3 by , 11 years ago
Isn't this in general a problem for irregular coverages btw, for any type of axis and not just temporal?
comment:4 by , 11 years ago
For that you can exploit the database: see petascope.core.DbMetadataSource.getIndexesFromIrregularRectilinearAxis()
.
comment:5 by , 11 years ago
The issue of this ticket is prior to irregular handling: is in the conversion from input timestamps to correspondent numerical form (number = coordinate in the associated Temporal CRS).
comment:6 by , 11 years ago
Cc: | added |
---|---|
Priority: | major → blocker |
I suggest to make this ticket blocker and delay the release of a few days: using timestamps becomes easily unusable before this issue is not fixed, and probably is one of most highly awaited features.
comment:7 by , 11 years ago
Instead of manual algorithm, it might be safer and wiser to inspect other Java libraries, mainly Joda, or even the (amazing) date
unix utility, although this could be dangerous since the behavior can depend on the system's installed versions and configurations.
Runtime r = Runtime.getSystemRuntime(); r.exec("date -d 'now - 123 weeks'");
comment:8 by , 11 years ago
Status: | new → accepted |
---|
comment:9 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Problem fixed in changeset:3d51a00.
Instead of implementing the optimized algorithm described in comment:1, the date4j library has been replaced by Joda-Time.
With Joda-Time, time instants are internally stored as milliseconds from Unix epoch. It then provides methods to translate timestamps ISO (and also custom) formats back and forth to this internal Temporal CRS.
By applying the correct coefficient, the distance in milliseconds between two time instants can be easily (and quickly) converted to the distance in the Temporal CRS that is used.
The supported UCUM codes currently are:
// Standard codes for supported temporal Unit of Measures (which provide ISO8601 interface to the user) // http://unitsofmeasure.org/ucum.html public static final String UCUM_MILLIS = "ms"; public static final String UCUM_SECOND = "s"; public static final String UCUM_MINUTE = "min"; public static final String UCUM_HOUR = "h"; public static final String UCUM_DAY = "d"; public static final String UCUM_WEEK = "wk"; public static final String UCUM_MEAN_JULIAN_MONTH = "mo_j"; public static final String UCUM_MEAN_GREGORIAN_MONTH = "mo_g"; public static final String UCUM_SYNODAL_MONTH = "mo_s"; public static final String UCUM_MONTH = "mo"; public static final String UCUM_MEAN_JULIAN_YEAR = "a_j"; public static final String UCUM_MEAN_GREGORIAN_YEAR = "a_g"; public static final String UCUM_TROPICAL_YEAR = "a_t"; public static final String UCUM_YEAR = "a";
This means that the //CoordinateSystemAxis/@uom
attribute of the Temporal CRS in use needs to show such UCUM code in order to be able to insert quoted timestamps in a WC*S request. If this is not the case (eg ma
uom, megaannum?), this does not mean that the CRS is not supported, but only the conversion to ISO timestamp is.
The ISO formatted I used had default UTC time zone and optional time part (date is mandatory): see this page for the exact allowed syntax and this file for working examples.
JUnit tests have been added too.
comment:10 by , 11 years ago
good work! If not yet done, please document on the wiki - in particular the design decision to have ms as resolution lower limit.
comment:11 by , 11 years ago
Yes, I forgot to mention this is all been documented in PetascopeTimeHandling.
A possible solution is to define a temporal increment which starts from a big value, and gets decreased (log-scale, say) if the single increment goes beyond the subset upper bound.
Deacreasing this increment until reaching the single temporal unit of resolution and proceed until the single resolution increment exceeds the subset.
PROBLEM: how many resolutions
T_RES
fit in interval[LO,HI]
?This algorithm is much faster than the current implementation which adds T_RES until the HI upper limit is found.
Issues:
scalar*DateTime
operation is not implemented in the Date4J library, and needs to be manually done, taking care of all overflows. http://www.date4j.net/javadoc/index.htmlA JUnit tests is mandatory for such method.