export of transformed data inactive

Added by Pierre Jamagne over 2 years ago


After transforming my population distribution data (32.000 records) to StatisticalDistribution, the function "export transformed data" is inactive. When I transform a subset of 20.000, then 15.000, then 10.000 records, it remains inactive. The export function works when I use a subset of 5000 records.

How can I get the export function to work with 32.000 records?

Thank you

Pierre Jamagne

Replies (4)

RE: export of transformed data inactive - Added by Simon Templer over 2 years ago

Hi Pierre,

from the description of the problem and the fact that this is the Population Distribution schema, I can only guess that this might be a memory issue. Is the memory consumption bar in the status bar at the bottom of the window at 90 to 100% all the time?
HALE only keeps the objects in memory that are currently being transformed, and for "normal" application schemas that works quite well, because a single feature is not that big. The problem in your case may be that in the Population Distribution schema you have to fit all data into a single Statistical Distribution object. That means all the data has to fit into the memory assigned to HALE at once.

The working memory assigned to HALE by default is 800 MB. You should increase that value and test again. You can change the value in the HALE.ini file (e.g. -Xmx2048m instead of -Xmx800m). The maximum value you can use here depends on your system (e.g. whether it is 32-bit or 64-bit).
Memory management also improves if you run the transformation through the command line interface rather than from within the HALE desktop application. In that case none of the memory is required for the user interface. If you do use HALE with the user interface, it may help to close the map perspective and window.
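
As a minimal sketch of the HALE.ini change described above: the helper below rewrites the -Xmx line in the text of an ini file. Only the values -Xmx800m and -Xmx2048m come from this thread; the function name `set_max_heap` and the other lines in the sample text (`-vmargs`, `-Xms256m`) are assumptions for illustration, and your actual HALE.ini may look different. Editing the file by hand in a text editor works just as well.

```python
import re


def set_max_heap(ini_text: str, megabytes: int) -> str:
    """Return ini_text with each '-Xmx<size>' line set to '-Xmx<megabytes>m'.

    Hypothetical helper: it only touches lines that consist of a -Xmx
    option and leaves every other line, and all line endings, unchanged.
    """
    out = []
    replaced = False
    for line in ini_text.splitlines(keepends=True):
        if re.fullmatch(r"-Xmx\d+[kKmMgG]?", line.strip()):
            # Preserve the original line ending (\n or \r\n, or none).
            ending = line[len(line.rstrip("\r\n")):]
            out.append(f"-Xmx{megabytes}m{ending}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        raise ValueError("no -Xmx line found in the ini text")
    return "".join(out)


# Assumed sample content; a real HALE.ini will contain other options too.
sample = "-vmargs\n-Xms256m\n-Xmx800m\n"
updated = set_max_heap(sample, 2048)
```

To apply it to a real file, read the file's text, pass it through `set_max_heap`, and write the result back while HALE is closed. Keep a backup copy of the original file.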

Hope that helps


RE: export of transformed data inactive - Added by Pierre Jamagne over 2 years ago

Dear Simon,

1. Population distribution on the grid (32.000 records, 3 population variables): when I did the same operations on my home laptop, everything went all right. So I guess there might be a problem with my office PC.

2. On my home laptop, the same transformation of my population data for another geography (neighbourhoods or statistical districts, 20.000 records for the whole of Belgium, 6 population variables) failed. But when I took a subset of the shapefile (1000 records for the Brussels region), it worked perfectly well. So is there a capacity problem? I do not understand why it worked well with 32.000 records but fails with 20.000 records. Do you have an explanation?

Sincerely yours,
Pierre Jamagne

RE: export of transformed data inactive - Added by Pierre Jamagne over 2 years ago

Dear Simon

The transformation of my population distribution finally succeeded once I split my source file into 11 provinces. The largest resulting GML file for a province is 9.000 KB. For the provinces with a large number of records (around 2000-2500 records), I had to adjust hale.ini to raise the working memory to 8000, and I also had to reinstall HALE.
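
For readers who want to try the same workaround, here is a minimal sketch of the splitting step. It is not the tool that was actually used for the split, which is not described here. It assumes the source records have already been loaded as dictionaries with a field that identifies the province; the field name `province` and the sample values are hypothetical, and reading and writing the shapefile itself is left out.

```python
from collections import defaultdict


def split_by_key(records, key):
    """Group records (dicts) into separate lists, one per distinct key value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Hypothetical sample records; the real data would come from the source file.
records = [
    {"id": 1, "province": "Namur", "population": 120},
    {"id": 2, "province": "Liège", "population": 340},
    {"id": 3, "province": "Namur", "population": 75},
]
by_province = split_by_key(records, "province")
```

Each list in `by_province` can then be written to its own source file and transformed separately, so that only one province has to fit into memory at a time.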

For the grid population (32.000 records), the resulting GML file is 120.000 KB.

So I think it would be nice if HALE could be improved so that this kind of source file can be transformed into one single file instead of 11 files.

Thank you for your valuable help

P Jamagne

RE: export of transformed data inactive - Added by Simon Templer over 2 years ago

Hi Pierre,

thank you for your feedback. I'm glad you got it to work for your use case.

The way to improve performance for this kind of project would be to use an alternative to Merging many features at once, e.g. creating individual StatisticalValue objects (with Retype or a Merge/Join with limited scope) that are then wrapped in a StatisticalDistribution as a container.
However, StatisticalDistribution also has other information to be filled in, which is currently not possible out of the box when it is used as an output container in HALE.

Development on HALE is mostly done based on the projects we do, and currently we are not working on any projects where this is an issue. In general, with a mapping that has no type relations that require holding a lot of (or even all) objects in memory at once, memory usually isn't an issue and it is no problem creating several Gigabytes of GML. So I'm afraid this is not on our top priority list right now.

Best regards,