Opened 10 years ago

Closed 7 years ago

#837 closed defect (fixed)

Parallel ingestion issue

Reported by: Dimitar Misev
Owned by: Dimitar Misev
Priority: major
Milestone: Future
Component: lockmgr
Version: development
Keywords:
Cc: Peter Baumann, Dirk Daems, Kinga Lipskoch
Complexity: Medium

Description

Importing in parallel with two rasimport commands fails with a "Transfer failed" error. For example:

rasimport --coll TEST -t GreyImage:GreySet -f /home/rasdaman/data/single/GLC2000/glc2000.tiff --crs-uri %SECORE_URL%/crs/EPSG/0/4326

rasimport --coll TEST2 -t GreyImage:GreySet -f /home/rasdaman/data/single/GLC2000/glc2000.tiff --crs-uri %SECORE_URL%/crs/EPSG/0/4326 
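A minimal sketch for reproducing the report: it starts both imports at roughly the same time and collects each exit code and output. The helper names `rasimport_argv` and `run_parallel` are hypothetical; the rasimport flags are exactly those shown above, and the sketch assumes rasimport is on the PATH and that the caller supplies the resolved SECORE URL.

```python
import subprocess


def rasimport_argv(collection, crs_uri):
    """Build the rasimport argument list from this ticket for one collection."""
    return [
        "rasimport",
        "--coll", collection,
        "-t", "GreyImage:GreySet",
        "-f", "/home/rasdaman/data/single/GLC2000/glc2000.tiff",
        "--crs-uri", crs_uri,
    ]


def run_parallel(commands):
    """Start all commands before waiting on any of them.

    Returns one (returncode, combined stdout/stderr) tuple per command,
    in the same order as `commands`.
    """
    procs = [
        subprocess.Popen(argv, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, text=True)
        for argv in commands
    ]
    results = []
    for proc in procs:
        output, _ = proc.communicate()  # waits for this process to exit
        results.append((proc.returncode, output))
    return results
```

Calling `run_parallel([rasimport_argv("TEST", crs), rasimport_argv("TEST2", crs)])`, with `crs` set to the full CRS URI, should show whether the second import fails while the first succeeds.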

Change History (14)

comment:1 by Dimitar Misev, 10 years ago

Cc: Kinga Lipskoch added
Status: new → accepted
RasdamanHelper2::updateImage():image update failed: RasManager Error: Write transaction in progress, please retry again later.

The first rasimport command finishes fine, but the second fails with a "write transaction in progress" error, even though I have 40 servers started and --enable-tilelocking set in rasmgr.conf.

So tile locking doesn't seem to work properly, as far as I can tell.

Kinga, do you have any idea about this?
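Until the locking itself is sorted out, a client-side workaround is to do what the error message asks and retry. This is only a sketch of that idea, not a fix: `retry_while_write_locked` is a hypothetical helper, the marker string is copied from the error above, and `action` stands for any callable that runs one import and returns `(returncode, output)`.

```python
import time

# Text taken from the RasManager error message quoted above.
RETRY_MARKER = "Write transaction in progress"


def retry_while_write_locked(action, attempts=5, delay=1.0, sleep=time.sleep):
    """Call action() until its output no longer contains RETRY_MARKER.

    Waits delay, 2*delay, 4*delay, ... seconds between attempts and
    returns the last (returncode, output) pair, whether or not it succeeded.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        returncode, output = action()
        if RETRY_MARKER not in output:
            return returncode, output
        if attempt < attempts - 1:
            sleep(delay * 2 ** attempt)  # exponential backoff
    return returncode, output
```

Retrying serializes the two imports rather than running them in parallel, so it hides the symptom without exercising tile locking.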


comment:2 by Kinga Lipskoch, 10 years ago

According to the error message, the queries are trying to write to the same tile.
I would first check whether that is actually the case.

comment:3 by Dimitar Misev, 10 years ago

That should not be the case because

  1. it is an insert
  2. into different collections TEST and TEST2

comment:4 by Kinga Lipskoch, 10 years ago

But insert does not use the lock manager, as far as I remember; only select and update do.

comment:5 by Dimitar Misev, 10 years ago

Hmm, so what happens on insert? Is the whole database locked? Why?

The same thing happens with rasql as well, by the way.

I think my rasmgr.conf should be ok, right? Each server is defined like this:

define srv N40 -host hifi -type n -port 90040 -dbh rasdaman_host
change srv N40 -countdown 200 -autorestart on -xp --timeout 300 --enable-tilelocking

comment:6 by Kinga Lipskoch, 10 years ago

From the perspective of the lock manager I implemented, nothing should be locked on an insert.
Maybe some other code does that.
I think the rasmgr.conf is fine, at least from what I remember.

comment:7 by Peter Baumann, 10 years ago

At least from the classical perspective, the whole database is locked, so the engine may fall back to that. In any case, insert is a situation that any lockmgr implementation should take into account; see standard SQL.

comment:8 by Dimitar Misev, 10 years ago

OK, I removed the whole-database write lock in rasmgr, and now the second request produces this in the rasserver log:

Warning/error in DBMDDObj::insertInDb():
SQLSTATE: 23505 SQLCODE: -403

Throwing Exception (SQLCODE=-403): Warning/error in DBMDDObj::insertInDb():
SQLSTATE: 23505 SQLCODE: -403


Warning/error in DBStorageLayout::insertInDb() INSERT INTO RAS_STORAGE:
SQLSTATE: 25P02 SQLCODE: -400

Throwing Exception (SQLCODE=-400): Warning/error in DBStorageLayout::insertInDb() INSERT INTO RAS_STORAGE:
SQLSTATE: 25P02 SQLCODE: -400


Warning/error in DBMinterval::insertInDb() INSERT INTO RAS_DOMAINS:
SQLSTATE: 25P02 SQLCODE: -400

Throwing Exception (SQLCODE=-400): Warning/error in DBMinterval::insertInDb() INSERT INTO RAS_DOMAINS:
SQLSTATE: 25P02 SQLCODE: -400

From the ECPG docs:

  • -403 (ECPG_DUPLICATE_KEY): Duplicate key error, violation of unique constraint. (SQLSTATE 23505)

  • -400 (ECPG_PGSQL): Some error caused by the PostgreSQL server. The message contains the error message from the PostgreSQL server.
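The log is easier to read with the SQLSTATE values decoded. In PostgreSQL, 23505 is unique_violation and 25P02 is in_failed_sql_transaction, which is reported for statements issued after an earlier statement in the same transaction has already failed. On that reading, the two -400 errors are likely follow-on failures of the duplicate key in DBMDDObj::insertInDb(). The sketch below condenses a log like the one above to its distinct errors and picks out the ones that are not follow-on failures; `summarize` and `root_causes` are hypothetical helper names, and the regular expressions assume the exact line layout shown in this comment.

```python
import re

# Meanings taken from the PostgreSQL error code table and the ECPG docs.
SQLSTATE_NAMES = {"23505": "unique_violation",
                  "25P02": "in_failed_sql_transaction"}
SQLCODE_NAMES = {-403: "ECPG_DUPLICATE_KEY", -400: "ECPG_PGSQL"}

_WHERE = re.compile(r"Warning/error in (\w+::\w+)\(\)")
_CODES = re.compile(r"SQLSTATE: (\w+) SQLCODE: (-?\d+)")


def summarize(log_text):
    """Return distinct (location, sqlstate, sqlcode) tuples in log order."""
    seen = []
    location = None
    for line in log_text.splitlines():
        where = _WHERE.search(line)
        if where:
            location = where.group(1)
            continue
        codes = _CODES.search(line)
        if codes and location is not None:
            entry = (location, codes.group(1), int(codes.group(2)))
            if entry not in seen:  # the log repeats each error twice
                seen.append(entry)
            location = None
    return seen


def root_causes(entries):
    """Drop errors that only say the transaction was already aborted."""
    return [entry for entry in entries if entry[1] != "25P02"]
```

Applied to the log in this comment, the only entry left by `root_causes` is the unique-constraint violation, which suggests that the two concurrent inserts end up with the same key.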

comment:9 by Peter Baumann, 10 years ago

This is likely more involved than just removing the whole-database lock (BTW, why is this still present in the code?). Kinga, although officially you have other obligations now, can I ask you to finalize the lockmgr job? Thanks!

comment:10 by Dimitar Misev, 10 years ago

@Kinga: the write transaction check is in source:rasmgr/rasmgr_master_nb.cc, line 691

comment:11 by Dimitar Misev, 10 years ago

I have to wait for the new protocol code to be submitted to the repo before I can test the lockmgr (due to some limitations in RNP), so it will take a few more days.

comment:12 by Dimitar Misev, 10 years ago

Milestone: 9.0.x → 9.1

comment:13 by Dimitar Misev, 10 years ago

Milestone: 9.1 → Future

comment:14 by Dimitar Misev, 7 years ago

Resolution: fixed
Status: accepted → closed

rasimport is outdated, and petascope/wcst_import.sh now work fine with parallel ingestion.
