Opened 5 years ago

Last modified 2 years ago

#1922 assigned defect

netcdf preserve file metadata

Reported by: Dimitar Misev Owned by: Dimitar Misev
Priority: major Milestone: 11.0
Component: wcst_import Version: development
Keywords: Cc: Vlad Merticariu
Complexity: Medium

Description (last modified by Dimitar Misev)

When importing netCDF files, attribute metadata should be preserved with its original type and not converted to string. For example, in eobstest:

	short tg(time, latitude, longitude) ;
		tg:long_name = "mean temperature" ;
		tg:units = "Celsius" ;
		tg:standard_name = "air_temperature" ;
		tg:_FillValue = -9999s ;
		tg:scale_factor = 0.01f ;

_FillValue and scale_factor should be kept as short and float values, and should not be converted to string.
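
For readers less familiar with ncdump's CDL notation: the suffix on a literal carries its type (s for short, b for byte, f for float), and quoted values are character data. The sketch below reads those literals back into typed values. It is only an illustration covering a small subset of CDL (no unsigned or 64-bit suffixes, no escape handling); parse_cdl_literal is a hypothetical helper, not part of rasdaman or wcst_import.

```python
def parse_cdl_literal(text):
    """Return (type_name, value) for a single CDL attribute literal.

    Hypothetical helper for illustration; handles only quoted strings and
    the s / b / f suffixes, plus unsuffixed int and double values.
    """
    text = text.strip()
    # Quoted values are character data.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return ("char", text[1:-1])
    suffix = text[-1:].lower()
    if suffix == "s":
        return ("short", int(text[:-1]))
    if suffix == "b":
        return ("byte", int(text[:-1]))
    if suffix == "f":
        return ("float", float(text[:-1]))
    # No suffix: whole numbers are int, anything else numeric is double.
    try:
        return ("int", int(text))
    except ValueError:
        return ("double", float(text))


for literal in ['"Celsius"', "-9999s", "0.01f"]:
    print(literal, "->", parse_cdl_literal(literal))
```

Read this way, tg:_FillValue is a short and tg:scale_factor is a float, which is exactly the type information that is lost once both are stored as strings.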

Furthermore, some metadata is not preserved at all: compare the attributes of unix, E, and N in the two dumps below.

Input file metadata (in wcps_irregular_time_nc):

netcdf irr_time {
dimensions:
	unix = 15 ;
	N = 1 ;
	E = 1 ;
variables:
	short band_1(unix, N, E) ;
		band_1:valid_min = -32768s ;
		band_1:valid_max = 32767s ;
		band_1:units = "10^0" ;
	short band_2(unix, N, E) ;
		band_2:valid_min = -32768s ;
		band_2:valid_max = 32767s ;
		band_2:units = "10^0" ;
	short band_3(unix, N, E) ;
		band_3:valid_min = -32768s ;
		band_3:valid_max = 32767s ;
		band_3:units = "10^0" ;
	short band_4(unix, N, E) ;
		band_4:valid_min = -32768s ;
		band_4:valid_max = 32767s ;
		band_4:units = "10^0" ;
	short band_5(unix, N, E) ;
		band_5:valid_min = -32768s ;
		band_5:valid_max = 32767s ;
		band_5:units = "10^0" ;
	short band_6(unix, N, E) ;
		band_6:valid_min = -32768s ;
		band_6:valid_max = 32767s ;
		band_6:units = "10^0" ;
	short band_7(unix, N, E) ;
		band_7:valid_min = -32768s ;
		band_7:valid_max = 32767s ;
		band_7:units = "10^0" ;
	double E(E) ;
		E:axis = "X" ;
		E:standard_name = "longitude" ;
		E:units = "m" ;
	double N(N) ;
		N:axis = "Y" ;
		N:standard_name = "latitude" ;
		N:units = "m" ;
	double unix(unix) ;
		unix:axis = "T" ;
		unix:standard_name = "unix" ;
		unix:units = "d" ;

// global attributes:
		:Conventions = "CF-1.6, ACDD-1.3" ;
		:date_created = "2016-04-12T11:11:42.114427" ;
		:history = "NetCDF-CF file created by datacube version \'1.0.2\' at 20160412." ;
		:product_version = "0.0.0" ;
		:source = "This data is a reprojection and retile of Landsat surface reflectance scene data." ;
		:summary = "These files are experimental, short lived, and the format will change." ;
		:title = "Experimental Data files From the Australian Geoscience Data Cube - DO NOT USE" ;
}

Output file metadata:

netcdf \153-irr_cube_3D_time_irregular {
dimensions:
	unix = 15 ;
	N = 1 ;
	E = 1 ;
variables:
	short band_1(unix, N, E) ;
		band_1:valid_min = -32768s ;
		band_1:valid_max = 32767s ;
		band_1:missing_value = -999s ;
		band_1:_FillValue = -999s ;
		band_1:description = "Nadir BRDF Adjusted Reflectance 0.43-0.45 microns (Coastal Aerosol)" ;
		band_1:product_version = "0.0.0" ;
		band_1:test_empty_attribute = "" ;
		band_1:units = "10^0" ;
	short band_2(unix, N, E) ;
		band_2:valid_min = -32768s ;
		band_2:valid_max = 32767s ;
		band_2:missing_value = -999s ;
		band_2:_FillValue = -999s ;
		band_2:units = "10^0" ;
	short band_3(unix, N, E) ;
		band_3:valid_min = -32768s ;
		band_3:valid_max = 32767s ;
		band_3:missing_value = -999s ;
		band_3:_FillValue = -999s ;
		band_3:units = "10^0" ;
	short band_4(unix, N, E) ;
		band_4:valid_min = -32768s ;
		band_4:valid_max = 32767s ;
		band_4:missing_value = -999s ;
		band_4:_FillValue = -999s ;
		band_4:units = "10^0" ;
	short band_5(unix, N, E) ;
		band_5:valid_min = -32768s ;
		band_5:valid_max = 32767s ;
		band_5:missing_value = -999s ;
		band_5:_FillValue = -999s ;
		band_5:units = "10^0" ;
	short band_6(unix, N, E) ;
		band_6:valid_min = -32768s ;
		band_6:valid_max = 32767s ;
		band_6:missing_value = -999s ;
		band_6:_FillValue = -999s ;
		band_6:units = "10^0" ;
	short band_7(unix, N, E) ;
		band_7:valid_min = -32768s ;
		band_7:valid_max = 32767s ;
		band_7:missing_value = -999s ;
		band_7:_FillValue = -999s ;
		band_7:Conventions = "CF-1.6, ACDD-1.3" ;
		band_7:date_created = "2016-04-12T11:11:42.114427" ;
		band_7:units = "10^0" ;
	double E(E) ;
	double N(N) ;
	double unix(unix) ;
		unix:directPositions = "-1000012.5" ;
		unix:max = "1387331687.45" ;
		unix:min = "1370137750.3699999" ;

// global attributes:
		:Conventions = "CF-1.6, ACDD-1.3" ;
		:date_created = "2016-04-12T11:11:42.114427" ;
		:history = "NetCDF-CF file created by datacube version \'1.0.2\' at 20160412." ;
		:product_version = "0.0.0" ;
		:source = "This data is a reprojection and retile of Landsat surface reflectance scene data." ;
		:summary = "These files are experimental, short lived, and the format will change." ;
		:test_empty_attribute = "" ;
		:title = "Experimental Data files From the Australian Geoscience Data Cube - DO NOT USE" ;
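
The coordinate-variable attributes lost between the two dumps can be listed mechanically. The sketch below transcribes the attributes of E, N, and unix from the input and output listings above into plain dictionaries and reports what is missing from the output. lost_attributes is a hypothetical helper written for this comparison, not an existing tool.

```python
def lost_attributes(before, after):
    """Map each variable to the sorted attribute names missing from `after`."""
    lost = {}
    for var, attrs in before.items():
        missing = sorted(set(attrs) - set(after.get(var, {})))
        if missing:
            lost[var] = missing
    return lost


# Coordinate-variable attributes transcribed from the two dumps in this ticket.
input_attrs = {
    "E": {"axis": "X", "standard_name": "longitude", "units": "m"},
    "N": {"axis": "Y", "standard_name": "latitude", "units": "m"},
    "unix": {"axis": "T", "standard_name": "unix", "units": "d"},
}
output_attrs = {
    "E": {},
    "N": {},
    "unix": {
        "directPositions": "-1000012.5",
        "max": "1387331687.45",
        "min": "1370137750.3699999",
    },
}

print(lost_attributes(input_attrs, output_attrs))
```

For this file, all three coordinate variables lose axis, standard_name, and units.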

Change History (7)

comment:1 by Dimitar Misev, 5 years ago

Description: modified (diff)

comment:2 by Dimitar Misev, 5 years ago

Description: modified (diff)

comment:3 by Bang Pham Huu, 5 years ago

Cc: Vlad Merticariu added

comment:4 by Bang Pham Huu, 5 years ago

Milestone: 9.7 → 10.0

comment:5 by Bang Pham Huu, 4 years ago

Milestone: 10.0 → Future

comment:6 by Dimitar Misev, 4 years ago

Milestone: Future → 11.0

comment:7 by Bang Pham Huu, 2 years ago

Owner: changed from Bang Pham Huu to Dimitar Misev
Status: new → assigned

Currently, the output of this WCPS query

for c in (test_eobstest) return encode(c[Lat(25:25), Long(60:60), t("1950-01-01":"1950-01-01")], "netcdf")

which corresponds to the rasql query

SELECT encode(c[0:0,29:29,39:39], "netcdf" , "{\"dimensions\":[\"t\",\"Lat\",\"Long\"],\"variables\":{\"t\":{\"type\":\"double\",\"data\":[127470.5],\"name\":\"t\",\"metadata\":{}},\"Lat\":{\"type\":\"double\",\"data\":[24.75],\"name\":\"Lat\",\"metadata\":{\"long_name\":\"Latitude values\",\"units\":\"degrees_N\",\"standard_name\":\"latitude\"}},\"Long\":{\"type\":\"double\",\"data\":[60.25],\"name\":\"Long\",\"metadata\":{\"long_name\":\"Longitude values\",\"units\":\"degrees_E\",\"standard_name\":\"longitude\"}},\"tg\":{\"type\":\"short\",\"name\":\"tg\",\"metadata\":{\"description\":\"Count of the number of observations from the MERIS sensor contributing to this bin cell\",\"units\":\"10^0\",\"long_name\":\"mean temperature\",\"units\":\"Celsius\",\"standard_name\":\"air_temperature\",\"_FillValue\":\"-9999\",\"scale_factor\":\"0.01\"}}},\"geoReference\":{\"crs\":\"EPSG:4326\",\"bbox\":{\"xmin\":60,\"ymin\":25,\"xmax\":60,\"ymax\":25,\"representation\":\"\\"60,25,60,25\\"\"}},\"metadata\":{\"Title\":\"This is a test file\",\"Project\":\"This is another test file\",\"Creator\":\"This is a test creator file\"},\"nodata\":[-9999]}") FROM test_eobstest AS c

has _FillValue preserved as a short:

	short tg(t, Lat, Long) ;
		tg:missing_value = -9999s ;
		tg:_FillValue = -9999s ;

but scale_factor is a string instead of a float as in the original file.

- output of rasdaman:

tg:scale_factor = "0.01" ;


- original file:

tg:scale_factor = 0.01f ;

So this can be fixed in rasdaman's netCDF encoder (a fix in petascope does not help): if a metadata value is a string but can be parsed as a number, it should be set as a number (float, short, …).
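
The proposed rule could look roughly like the following. This is a sketch of the idea in Python only, not the rasdaman encoder code (which is not shown in this ticket); coerce_attribute is a hypothetical name. It uses strict patterns so that values such as "0.0.0" or "10^0" stay strings.

```python
import re

# Strict patterns: optional sign, digits, optional fraction and exponent.
_INT_RE = re.compile(r"[+-]?\d+")
_FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def coerce_attribute(value):
    """Turn a numeric-looking string into int or float; leave others untouched.

    Hypothetical helper illustrating the rule proposed in this comment.
    """
    if not isinstance(value, str):
        return value
    text = value.strip()
    if _INT_RE.fullmatch(text):
        return int(text)
    if _FLOAT_RE.fullmatch(text):
        return float(text)
    return value


metadata = {
    "units": "Celsius",
    "_FillValue": "-9999",
    "scale_factor": "0.01",
    "product_version": "0.0.0",
}
print({name: coerce_attribute(val) for name, val in metadata.items()})
```

Two caveats apply to any such rule. The string alone does not say whether "0.01" was originally a float or a double, or whether "-9999" was a short or an int, so the encoder would still need a convention for picking the width (for _FillValue, using the variable's own type is the natural choice). A blanket conversion also turns every string that merely looks numeric into a number, which may not always be what the data provider intended.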
