Opened 11 years ago
Closed 9 years ago
#772 closed enhancement (fixed)
document inv_csv function
Reported by: | Marcus Sen | Owned by: | Peter Baumann |
---|---|---|---|
Priority: | major | Milestone: | 9.2 |
Component: | manuals_and_examples | Version: | 9.0 |
Keywords: | Cc: | James Passmore, Peter Baumann | |
Complexity: | Medium |
Description
Implement an inv_csv function for reading nD CSV files.
The input does not have to be strict CSV: the formatting hardly matters, which makes the function very flexible.
Numbers from the input file are read in order of appearance and stored in rasdaman without any reordering;
whitespace and the following characters are ignored: '{', '}', ',', '"', '\'', '(', ')', '[', ']'
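The reading rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the code in csv.cc; the names IGNORED, to_number and read_numbers are hypothetical.

```python
# Characters that the description lists as ignored, in addition to whitespace.
IGNORED = set('{},"\'()[]')


def to_number(token):
    """Convert one token to int if possible, otherwise to float."""
    try:
        return int(token)
    except ValueError:
        return float(token)


def read_numbers(text):
    """Return the numbers found in text, in order of appearance."""
    # Turn every ignored character into a space, then split on whitespace.
    cleaned = "".join(" " if ch in IGNORED else ch for ch in text)
    return [to_number(token) for token in cleaned.split()]


if __name__ == "__main__":
    # Both layouts of B from the example yield the same sequence of numbers.
    print(read_numbers("{1,2,3},{2,1,3}"))
    print(read_numbers("1 2 3 2 1 3"))
```

Because braces and commas are simply dropped, grouping into cells has to come from the domain and basetype parameters rather than from the file itself.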
Mandatory extra parameters:
domain
- minterval, e.g. [1:5,0:10,2:3]
- the domain has to match the number of cells read from the input file
basetype
- array base type, e.g. long, char, etc.
- struct types have to be specified fully, e.g. struct { char red, char blue, char green }
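The requirement that the domain match the number of cells read can be checked as in the following Python sketch. It assumes inclusive lower and upper bounds, as the 2x3 example with [0:1,0:2] implies; the function names and the values_per_cell parameter are hypothetical and not part of rasdaman.

```python
from math import prod


def parse_minterval(domain):
    """Parse '[lo:hi,lo:hi,...]' into a list of (lo, hi) integer pairs."""
    body = domain.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError(f"not a minterval: {domain!r}")
    bounds = []
    for axis in body[1:-1].split(","):
        lo, hi = axis.split(":")
        bounds.append((int(lo), int(hi)))
    return bounds


def cell_count(bounds):
    """Number of cells in the domain; both bounds are inclusive."""
    return prod(hi - lo + 1 for lo, hi in bounds)


def check_domain(domain, values_read, values_per_cell=1):
    """Raise ValueError if the values read do not fill the domain exactly."""
    expected = cell_count(parse_minterval(domain)) * values_per_cell
    if values_read != expected:
        raise ValueError(f"expected {expected} values, got {values_read}")


if __name__ == "__main__":
    print(cell_count(parse_minterval("[1:5,0:10,2:3]")))
    check_domain("[0:1,0:2]", 6)                      # 2x3 array of longs
    check_domain("[0:0,0:1]", 6, values_per_cell=3)   # 1x2 array of RGB structs
```

For a struct base type, each cell consumes one number per struct field, which is why the second call passes values_per_cell=3.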
Example
A is a 2x3 array of longs:
1,2,3,2,1,3
Inserting A can be done with
insert into A values inv_csv($1, "domain=[0:1,0:2];basetype=long")
B is a 1x2 array of RGB values:
{1,2,3},{2,1,3}
Inserting B can be done with
insert into B values inv_csv($1, "domain=[0:0,0:1];basetype=struct {char red, char blue, char green}")
B could just as well be formatted like this with the same effect:
1 2 3 2 1 3
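The extra-parameter strings used in these queries have the shape "key=value;key=value". The Python sketch below shows one way such a string could be taken apart. It is an assumption about the format based only on the examples in this ticket, and parse_options and struct_fields are hypothetical helpers, not rasdaman functions.

```python
def parse_options(options):
    """Split 'key=value;key=value' into a dict of stripped strings."""
    result = {}
    for part in options.split(";"):
        if not part.strip():
            continue
        # Split on the first '=' only, so values may contain further '='.
        key, _, value = part.partition("=")
        result[key.strip()] = value.strip()
    return result


def struct_fields(basetype):
    """Return [(type, name), ...] for a struct type, or [] for an atomic type."""
    text = basetype.strip()
    if not text.startswith("struct"):
        return []
    inner = text[text.index("{") + 1 : text.rindex("}")]
    return [tuple(field.split()) for field in inner.split(",")]


if __name__ == "__main__":
    opts = parse_options(
        "domain=[0:0,0:1];basetype=struct {char red, char blue, char green}"
    )
    print(opts["domain"])
    print(struct_fields(opts["basetype"]))
```

The number of struct fields tells a reader how many consecutive numbers make up one cell, which is what lets the brace-free layout of B work.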
Change History (21)
comment:1 by , 11 years ago
Cc: | added |
---|---|
Component: | undecided → conversion |
comment:2 by , 11 years ago
Component: | conversion → manuals_and_examples |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Reassigning to Peter for fixing the manual.
comment:4 by , 10 years ago
Owner: | changed from | to
---|
Proposal
Implement an inv_csv conversion function that will read a plaintext csv-like representation of an array.
Problem
How to encode the domain/type of the array in the plaintext file?
The bounding box can be encoded with parentheses or other markers, as is done in the csv function. There is no option for representing the type, however.
Solution
Encode domain/type with the extra params of inv_csv. This makes the parentheses in the input file unnecessary, leaving just comma-separated values (proper csv encoding).
Rules for the csv encoding:
- single values are separated by comma
- composite values are wrapped in braces, within which single values are separated by commas
- white space is ignored
Extra params:
domain
- minterval, e.g. [1:5,0:10,2:3]
basetype
- array base type, e.g. RGBPixel, long, char, etc.
Example
A is a 2x3 array of longs:
1,2,3,2,1,3
Inserting A can be done with
insert into A values inv_csv($1, "domain=[0:1,0:2];basetype=long")
B is a 1x2 array of RGB values:
{1,2,3},{2,1,3}
Inserting B can be done with
insert into B values inv_csv($1, "domain=[0:0,0:1];basetype=RGBPixel")
Implementation
In source:conversion/csv.cc the convertFrom() function should be implemented. tiff.cc would provide a good example for the implementation.
Appropriate tests should be provided in source:systemtest/testcases_mandatory/test_conversion/test.sh
comment:5 by , 10 years ago
Component: | manuals_and_examples → conversion |
---|---|
Milestone: | → 9.1 |
comment:6 by , 9 years ago
Owner: | changed from | to
---|
comment:7 by , 9 years ago
Milestone: | 9.1 → 9.2 |
---|
comment:8 by , 9 years ago
Description: | modified (diff) |
---|
comment:9 by , 9 years ago
Description: | modified (diff) |
---|
comment:10 by , 9 years ago
Description: | modified (diff) |
---|---|
Type: | defect → enhancement |
comment:11 by , 9 years ago
looks good, but it is not exactly CSV
Goal should be that exported CSV can be imported again within a rasdaman ecosystem. Ideally with other tools as well, but that's a nightmare anyway, see https://en.wikipedia.org/wiki/Comma-separated_values.
Therefore, 2 friendly amendments:
- make mandatory extra parameters optional and assume suitable defaults (eg, "widest" data type as cell type)
- to this end, don't ignore nested {…}, but use them for recognizing domains (and throw an exception if the number of elements in some extent does not match with its neighbours)
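To make the second amendment concrete, here is a Python sketch of how extents could be recognized from nested braces, with an exception when neighbouring extents disagree. It illustrates the proposal in this comment only; it is not how inv_csv works, and parse_nested and infer_shape are hypothetical names.

```python
def parse_nested(text):
    """Parse brace-nested, comma-separated integers into nested lists."""
    pos = 0

    def parse_list(closing):
        nonlocal pos
        items = []
        while pos < len(text):
            ch = text[pos]
            if ch == "{":
                pos += 1
                items.append(parse_list("}"))
            elif ch == "}":
                if closing != "}":
                    raise ValueError("unbalanced '}'")
                pos += 1
                return items
            elif ch == "," or ch.isspace():
                pos += 1
            else:
                # Consume one number token up to the next delimiter.
                start = pos
                while (pos < len(text) and text[pos] not in "{},"
                       and not text[pos].isspace()):
                    pos += 1
                items.append(int(text[start:pos]))
        if closing == "}":
            raise ValueError("missing '}'")
        return items

    return parse_list(None)


def infer_shape(node):
    """Return the extents of a nested list, or raise if they are ragged."""
    if not isinstance(node, list):
        return ()
    shapes = {infer_shape(child) for child in node}
    if len(shapes) > 1:
        raise ValueError("extents do not match")
    inner = shapes.pop() if shapes else ()
    return (len(node),) + inner


if __name__ == "__main__":
    print(infer_shape(parse_nested("{1,2,3},{2,1,3}")))
```

Note that this sketch yields extents (2, 3) for the text of B, so braces alone cannot tell a 2x3 array of scalars from two RGB cells; the base type would still be needed to resolve that.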
comment:12 by , 9 years ago
I'm pretty sure inv_csv handles any CSV format properly.
What you are proposing is too error-prone and difficult to get right, though.
I'd rather have flexible format support, at the expense of having to specify the domain and base type (and a very simple implementation as well).
comment:14 by , 9 years ago
ok, it's in - is there anything else I need to know for documenting it in the QL guide?
comment:16 by , 9 years ago
Owner: | changed from | to
---|
Reminder to document this, the QL guide still says "Note that inv_csv() is not implemented currently."
comment:17 by , 9 years ago
Component: | conversion → manuals_and_examples |
---|---|
Summary: | inv_csv function not supported → document inv_csv function |
comment:18 by , 9 years ago
The format() functions are deprecated, and I would hate to describe an obsolete inv_csv(). Therefore: I assume the same functionality is available as decode( $1, "csv" )? What is the exact format specifier? Thanks.
comment:19 by , 9 years ago
confirmation needed, is this correct?
"The decode() function automatically detects the format used, so there is no format parameter."
inv_csv is a mistake in the QL guide.