Opened 3 years ago

Closed 17 months ago

#735 closed question (fixed)

GetCapabilities pings rasdaman for every coverage

Reported by: dmisev Owned by: mdumitru
Priority: major Milestone: 9.0.x
Component: petascope Version: development
Keywords: getcapabilities performance Cc: joosthoek, mase, jpass, an.rossi@…
Complexity: Medium

Description

A GetCapabilities request seems to send a request like

select sdom(c)[0] from coll as c

for every axis of every coverage in petascope. This should be optimized (with caching, perhaps), as it really slows things down (think of thousands of coverages). The simplest optimization would be to send only one request per coverage:

select sdom(c) from coll as c
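
For illustration, a minimal sketch (not petascope's actual code) of how a single sdom(c) result, serialized in rasdaman's usual form such as [0:719,0:359], could be split into per-axis bounds, so that one query per coverage replaces one query per axis:

public final class SdomParser {

    /** Returns bounds[axis] = {lo, hi} for an sdom string like "[0:719,0:359]". */
    public static long[][] parse(String sdom) {
        String body = sdom.trim();
        body = body.substring(1, body.length() - 1);  // strip the surrounding [ ]
        String[] axes = body.split(",");
        long[][] bounds = new long[axes.length][2];
        for (int i = 0; i < axes.length; i++) {
            String[] loHi = axes[i].split(":");
            bounds[i][0] = Long.parseLong(loHi[0]);
            bounds[i][1] = Long.parseLong(loHi[1]);
        }
        return bounds;
    }
}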

Attachments (2)

GetCapabilities_opt_400x_3mins.png (53.2 KB) - added by pcampalani 3 years ago.
GetCapabilities without BBOX/covType performance profile - 400reqs/8covs
GetCapabilities_400x_10mins.png (46.8 KB) - added by pcampalani 3 years ago.
GetCapabilities with BBOX/covType performance profile - 400reqs/8covs


Change History (28)

comment:1 Changed 3 years ago by pcampalani

As discussed with Dimitar, we could also add a further parameter in the petascope.properties (as done for ows:Metadata, see #314) to enable/disable BBOXes in coverage summaries.
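
For illustration, a minimal sketch (plain java.util.Properties, not petascope's actual configuration code) of reading such a toggle; bbox_in_covsummary is the property name that eventually appears in comment:20:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public final class CovSummaryConfig {

    /** Reads the toggle from petascope.properties; defaults to false for fast responses. */
    public static boolean bboxInCovSummary(String propertiesPath) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(propertiesPath)) {
            props.load(in);
        }
        return Boolean.parseBoolean(props.getProperty("bbox_in_covsummary", "false"));
    }
}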

comment:2 Changed 3 years ago by pcampalani

  • Component changed from undecided to petascope
  • Keywords getcapabilities performance added
  • Owner changed from dmisev to pcampalani

comment:3 Changed 3 years ago by pbaumann

hm, BBOXes or not seems like a different issue from a client perspective (and relevant there indeed). For this ticket, I prefer Dimitar's idea of making sure to issue only 1 sdom() per coverage.

Stupid question: we are repeating some metadata already in the PS_ tables; wouldn't this one make sense there, too? Maybe there are other use cases where sdom() slows things down? I'm not easily sold on caching, but a complexity of O(n) for a GetCapabilities doesn't sound like fun.

comment:4 Changed 3 years ago by jpass

Replying to dmisev:

for every axis of every coverage in petascope. This should be optimized (with caching, perhaps), as it really slows things down (think of thousands of coverages).

Even for hundreds of coverages the response time is quite slow. For example, comparing GetCapabilities responses between Rasdaman 8 and 9 on the servers themselves (to cut out any networking issues) with requests like:

Rasdaman 8
===========
time curl -w %{size_download} \
 --request GET "http://localhost/petascope?service=WCS&request=GetCapabilities"

Rasdaman 9
===========
time curl -w %{size_download} \
 --request GET "http://localhost/rasdaman/ows?service=WCS&request=GetCapabilities"

We have:

Server details                          Average response time   Response document (bytes)   Number of Coverages
Current internal service (Rasdaman 8)   less than 0.5 seconds   20946                       65
Current public service (Rasdaman 8)     about 0.5 seconds       32812                       131
New service (Rasdaman 9)                about 49 seconds        134016                      186

comment:5 Changed 3 years ago by dmisev

Yes, and for thousands of coverages it takes 10+ minutes (Jelmer has experience with this).

I think the first step is to disable the bboxes by default and add an option to enable them in petascope.properties. Then we can optimize it further.

Last edited 3 years ago by dmisev (previous) (diff)

comment:6 Changed 3 years ago by pbaumann

hm, I see a ratio of 100x in the response times; that is not sufficiently explained by a factor of at most 4x from accessing each axis separately. Maybe there is some other, additional effect in the code?

comment:7 Changed 3 years ago by mase

  • Cc mase jpass added

comment:8 follow-up: Changed 3 years ago by pcampalani

Are we talking about the first GetCapabilities request?
That additionally needs to parse and cache CRS defs from SECORE.
The bottleneck, anyway, is usually the sdom requests, as the number of CRSs used by the service is usually not that big.

comment:9 in reply to: ↑ 8 Changed 3 years ago by jpass

Replying to pcampalani:

Are we talking about the first GetCapabilities request?
That additionally needs to parse and cache CRS defs from SECORE.
The bottleneck, anyway, is usually the sdom requests, as the number of CRSs used by the service is usually not that big.

All GetCapabilities requests are taking this long, so some way of caching them, using a statically generated document, or some similar hack would be useful.
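
For illustration, a minimal sketch of the kind of caching suggested here (not petascope's implementation): keep the rendered capabilities XML and rebuild it only when a time-to-live expires or on an explicit invalidation, e.g. via a ReloadCapabilities-style request as discussed below:

import java.util.function.Supplier;

public final class CapabilitiesCache {

    private final Supplier<String> builder;  // renders the full capabilities XML
    private final long ttlMillis;
    private String cached;
    private long builtAt;

    public CapabilitiesCache(Supplier<String> builder, long ttlMillis) {
        this.builder = builder;
        this.ttlMillis = ttlMillis;
    }

    public synchronized String get() {
        long now = System.currentTimeMillis();
        if (cached == null || now - builtAt > ttlMillis) {
            cached = builder.get();  // the one expensive rebuild
            builtAt = now;
        }
        return cached;
    }

    /** Hook for an explicit refresh, e.g. a ReloadCapabilities-style request. */
    public synchronized void invalidate() {
        cached = null;
    }
}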

comment:10 Changed 3 years ago by pcampalani

I made a performance comparison between the current GetCapabilities management and an optimized case where a single query to petascopedb fetches the coverage name and nothing more (DbMetadataSource.coverages()):

        current      optimized         covs
1st     ~17.000 s    ~4.000 s (25%)    8
nth     ~0.700 s     ~0.035 s (5%)     8

So there are considerable gains (I'm attaching detailed monitoring results for 400 subsequent requests right away).
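
For reference, a minimal JDBC sketch of this kind of single-query fetch; the ps_coverage table and its name column are assumptions made for illustration, as the actual petascopedb schema may differ:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public final class CoverageNames {

    /** Fetches all coverage names in a single round trip to petascopedb. */
    public static List<String> fetchAll(Connection conn) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Statement st = conn.createStatement();
             // ps_coverage(name) is assumed here for illustration only.
             ResultSet rs = st.executeQuery("SELECT name FROM ps_coverage")) {
            while (rs.next()) {
                names.add(rs.getString(1));
            }
        }
        return names;
    }
}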

I have some proposals for solving this problem:

  1. add a parameter to disable BBOXes in GetCapabilities;
  2. fetch the coverage type in DbMetadataSource.coverages(), in addition to the coverage name.

This would provide quick responses, but without BBOXes or OWS metadata, so the solution is not really optimal, I guess.

If we want to cache either the whole XML document or the Java summary objects (or directly the whole CoverageMetadata instances), we have to discuss how/when to refresh the cache, e.g.:

  • define a ReloadCapabilities request, like implemented for the WMS service;
  • define a lighter ReloadCoverage request;
  • ?

Additionally, we could add triggers in petascopedb/RASBASE that fire when metadata or spatial domains change for a coverage.
I suggest we continue the discussion on the m-list now.

Changed 3 years ago by pcampalani

GetCapabilities without BBOX/covType performance profile - 400reqs/8covs

Changed 3 years ago by pcampalani

GetCapabilities with BBOX/covType performance profile - 400reqs/8covs

comment:11 Changed 3 years ago by dmisev

changeset:1b57f46123827e40ba0975892b15511c6c477907 changes the behavior from one request per axis to one per coverage.

comment:12 Changed 3 years ago by pcampalani

  • Cc an.rossi@… added

comment:13 Changed 3 years ago by pbaumann

I see this ticket still open, so a thought here: databases are known to be inefficient when it comes to single-tuple shipping ("navigational access"). They excel when returning sets. So why do we have one access per axis (or per coverage now)? Why not collect all ids and send one request instead?

comment:14 Changed 3 years ago by dmisev

That would not be possible in rasql; you can return only a single 'column', i.e. this is not possible:

select sdom(rgb), sdom(mr2) from rgb, mr2

comment:15 Changed 3 years ago by dmisev

Maybe if we allowed a union of collections, something like:

select sdom(c) from (rgb, mr2) as c

comment:16 Changed 3 years ago by pbaumann

ah, I thought the "database calls" referred to the PS_ tables. My mistake.

comment:17 Changed 3 years ago by pcampalani

  • Priority changed from major to blocker

Due to the serious performance issues of the capabilities document in v9, I am raising the priority of this ticket to blocker: I believe we really need to fix this for the next minor release.

I will follow up on the related topic in our m-list to discuss how to fix this.

comment:18 Changed 3 years ago by pcampalani

A (configurable) bbox in the coverage summary is added in changeset:46aaa33.

Dimitar, let me know if I can close the ticket in a reasonable time, thanks!

comment:19 Changed 3 years ago by dmisev

  • Resolution set to fixed
  • Status changed from new to closed

Perhaps it would've been better to have it disabled by default?

comment:20 Changed 3 years ago by mase

  • Resolution fixed deleted
  • Status changed from closed to reopened

I have made the following timings since upgrading to v9.0.4 by running the command:

time curl -w %{size_download} --request GET "http://localhost/rasdaman/ows?service=WCS&request=GetCapabilities"

three times, ignoring the first run if it is longer because it is the first request after a restart.

I notice 3 new settings in petascope.properties that seem relevant.

metadata_in_covsummary=true
bbox_in_covsummary=true
description_in_covsummary=true

Response time 50 seconds (1 minute 5 seconds first time after restart)

metadata_in_covsummary=true
bbox_in_covsummary=false
description_in_covsummary=true

Response time 49 seconds

metadata_in_covsummary=false
bbox_in_covsummary=false
description_in_covsummary=false

Response time 50 seconds (1 minute 5 seconds first time after restart)

So this problem doesn't seem to be fixed after all. Re-opening the ticket.

comment:21 Changed 3 years ago by pcampalani

Marcus, changeset:cdc7a85 has been applied.
Set those 3 params to false and you should get a faster capabilities response.
Let us know, thank you very much.

comment:22 Changed 3 years ago by mase

I'll do that when it gets included in an RPM release.

comment:23 Changed 3 years ago by pbaumann

  • Owner changed from pcampalani to mdumitru
  • Status changed from reopened to assigned

comment:24 Changed 3 years ago by dmisev

  • Priority changed from blocker to major

The parameters are false by default now, so this is not a blocker, I'd say.

comment:25 Changed 17 months ago by bphamhuu

I've imported > 1000 coverages with wcst_import; the response is very quick (as the 3 parameters are false by default):

time curl -w %{size_download} --request GET "http://localhost:8080/rasdaman/ows?service=WCS&request=GetCapabilities"
452997
real	0m2.003s
user	0m0.001s
sys	0m0.024s

comment:26 Changed 17 months ago by dmisev

  • Resolution set to fixed
  • Status changed from assigned to closed

Yes, with these parameters off it's pretty fast. Let's close this ticket.
