Opened 13 years ago

Closed 12 years ago

Last modified 12 years ago

#128 closed defect (fixed)

segmentation fault in rasserver causes oid operator failures

Reported by: beccati@…
Owned by: Heinrich Stamerjohanns
Priority: major
Milestone: 8.4
Component: rasserver
Version: 8.3
Keywords:
Cc: Peter Baumann
Complexity: Medium

Description (last modified by Dimitar Misev)

Following two discussions on the mailing list [1] [2] that appear to be closely related, I am opening this ticket to track the issue and gather information.

[1] http://groups.google.com/group/rasdaman-dev/browse_thread/thread/ac8324f9c09d0a4c/cce6409251be3ebe
[2] http://groups.google.com/group/rasdaman-dev/browse_thread/thread/2fbbe0842583cbc0/8b5d9d4c91065629

I am getting strange behaviour from the oid operator in rasql:
it suddenly returns a non-integer oid value for all collections (the
value is the same for every collection I tested).

$ rasql -q "select oid(t) from MOD_WATERVAPOR_32633_1000 as t" --out string
rasql: rasdaman query tool v1.0, rasdaman v8 -- generated on 31.01.2012 17:14:19.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...ok
Query result collection has 1 element(s):
  Result element 1: 2.37934e-317
rasql done.

After restarting the server with:

stop_rasdaman.sh
start_rasdaman.sh

everything goes back to normal:

$ rasql -q "select oid(t) from MOD_WATERVAPOR_32633_1000 as t" --out string
rasql: rasdaman query tool v1.0, rasdaman v8 -- generated on 31.01.2012 17:14:19.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...ok
Query result collection has 1 element(s):
  Result element 1: 213505
rasql done.

The issue appears to affect only the oid() operator in the select
list and its displayed output, since a query that uses oid in the
where clause works as expected:

rasql -q "select csv(m[0:10,0:10]) from mr as m where oid(m)=31745" --out string

But in the meantime anything relying on getting oids with select
breaks.
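
Until the server is fixed, a client that reads oids from rasql output could at least detect the bad values instead of passing them on. The following is a minimal sketch, assuming the `--out string` output format shown in this ticket; `parse_oids` and `RESULT_LINE` are hypothetical helper names for illustration and are not part of rasdaman.

```python
import re

# Matches lines such as "  Result element 1: 213505" in the rasql output
# shown in this ticket (assumed format, not an official specification).
RESULT_LINE = re.compile(
    r"^[ \t]*Result element (\d+):[ \t]*(\S+)[ \t]*$", re.MULTILINE
)


def parse_oids(rasql_output):
    """Return the result elements as ints.

    Raises ValueError if any element is not a plain non-negative integer,
    which is how the broken server state shows up (e.g. 2.37934e-317).
    """
    oids = []
    for index, value in RESULT_LINE.findall(rasql_output):
        if not re.fullmatch(r"\d+", value):
            raise ValueError(
                f"result element {index} is not an integer oid: {value!r}"
            )
        oids.append(int(value))
    return oids
```

A caller could treat the `ValueError` as a signal that the server needs a restart before oids can be trusted again.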

A log excerpt and problem description follow. The ongoing operation was a rasgeo insert into a non-existing collection, which should have been followed by an update.

--- KERNEL LOG ---
Feb 28 12:42:26 ras-dev-dar kernel: rasserver[15480]: segfault at 0 ip 00000000004c089b sp 00007fff9ef3bc80 error 4 in rasserver[400000+1a8000]
--- rasmgr.001504.log ---
[2012-02-28 12:42:26] client request from 192.168.0.115: 'get server'...ok
rasdaman server process with pid 15480 has terminated.
Error: rasdaman server N1, pid 15480 terminated illegally, reason: uncaught signal 11
[2012-02-28 12:42:26] starting server N1, executable /usr/local/rasdaman8.3/bin/rasserver; pid 19987...[2012-02-28 12:42:27] client request from 192.168.0.115: 'get server'...ok
--- N1.015480.log (last entries) ---
[2012-02-28 12:42:26] request from 192.168.0.115
Request: 'insert MDD', type 'GreyCube', domain [0:0,0:0,0:0], cell length 1, 1 bytes...ok
[2012-02-28 12:42:26] request completed in 356 usecs.

PerformanceTimer: RnpRasDaManComm :: request = 366 usecs

[2012-02-28 12:42:26] request from 192.168.0.115
Request: 'insert tile'...insertTile created new TransTile (Array), changing endianness...
--- N1.019987.log (repeated insert, should have been an update) ---
[...]
[2012-02-28 12:42:56] request from 192.168.0.115
Request: 'insert into none_TEST_32630_1000 values marray x in [0:0,0:0,149941:149941] values 0c'...parsing...checking semantics...evaluating...ok
[2012-02-28 12:42:56] request completed in 5703 usecs.
--- oid(t) output ---
rasql: rasdaman query tool v1.0, rasdaman v8 -- generated on 31.01.2012 17:14:19.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...ok
Query result collection has 2 element(s):
  Result element 1: 2.37934e-317
  Result element 2: 2.37934e-317
rasql done.

Restarting server N1 via rascontrol resolves the oid issue.
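
For reference, the fields of the kernel log line in the excerpt can be decoded mechanically. The sketch below assumes the usual x86 Linux "segfault at ..." message layout and the x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode); `decode_segfault` is a hypothetical helper name, not an existing tool.

```python
import re

# Layout of the kernel message as it appears in the log excerpt above.
SEGFAULT = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields and decode the error code."""
    m = SEGFAULT.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"])
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "fault_address": int(m["addr"], 16),   # address that was accessed
        "object": m["obj"],                    # mapped file containing the ip
        "offset_in_object": hex(ip - base),    # ip relative to the mapping base
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
        "page_present": bool(err & 1),
    }
```

Applied to the line from the kernel log, this reports a user-mode read of address 0 on a non-present page, at offset 0xc089b inside rasserver, which is consistent with a null pointer dereference in the server process.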

Change History (5)

comment:1 by Heinrich Stamerjohanns, 13 years ago

Owner: set to Heinrich Stamerjohanns
Status: new → assigned

comment:2 by Dimitar Misev, 13 years ago

Cc: Peter Baumann added
Description: modified (diff)

Alan, thanks for providing the logs. I had already opened #127 for this issue, but I'll repurpose it for a different topic and leave this ticket for the problem you are seeing.

I can't tell what exactly rasgeo was doing. Could you post the rasgeo command, ideally along with a small data sample?

comment:3 by Heinrich Stamerjohanns, 12 years ago

The patch from 2012-10-01 00:17:11 should have fixed this problem.
Could you please rerun your queries to confirm that the problem is fixed?
Thanks.

comment:4 by Dimitar Misev, 12 years ago

Resolution: fixed
Status: assigned → closed

As well as

commit ebe14b8a1f402539cdc20a58a3166c024451ab12
Author: Dimitar Misev <misev@rasdaman.com>
Date:   Fri Nov 9 18:10:30 2012 +0100

    Recognize SELECT INTO expressions in rasql (ticket 238)

which fixed #238, so let's close this ticket then.

comment:5 by Dimitar Misev, 12 years ago

Complexity: Medium
Milestone: 8.4