
Last week, Benjamin Herfort and Melanie Eckle of the HeiGIT team were invited to the MSF office in London to join a workshop on “Making sense of humanitarian geospatial data: assessing data, methods and governance practices to improve the impact of humanitarian interventions on health and wellbeing”, organized by the Centre for Interdisciplinary Methodologies of the University of Warwick in Coventry.

The Warwick team gathered participants with backgrounds and expertise in GIS, epidemiology and medical innovation, as well as tech and community leads from MSF, the British Red Cross, the Humanitarian OpenStreetMap Team, the Bangladesh Humanitarian OpenStreetMap Operations Team (BHOOT), MapAction and Healthsites, along with university representatives from Imperial College London and our group, to discuss current approaches and challenges and, most importantly, how to address them collaboratively.

On the first day, participants provided an overview of their projects, information gaps and limitations in data use. The participants then defined and prioritized emerging common themes, which were turned into direct action points and a collaborative research agenda on the second day. Please also find further information about the background of the workshop and the agenda here.

Workshop participants at MSF UK

The two-day event was perfectly rounded off with the London Missing Maps Mid-Month Mapathon, where the workshop participants and the London Missing Maps community had the chance to further discuss the workshop insights and receive direct community feedback. Tasauf A Baki Billah, community lead of the BHOOT team, furthermore provided an overview of their previous and current activities and the inspiring developments around OSM in Bangladesh.

Tasauf (BHOOT) providing insights into OSM in Bangladesh

We thank everyone for the great exchanges and discussions and are excited to jointly put our ideas into action.

The 7th International Conference on Cartography & GIS took place in Sozopol, Bulgaria, from 18 to 23 June 2018.

iccgis logo

It was organized by the Bulgarian Cartographic Association, the International Cartographic Association and the University of Architecture, Civil Engineering and Geodesy (UACEG), Sofia, together with the co-organizers Military Geographic Service, Bulgaria, and the Bulgarian Red Cross. The main topics of the conference were:

  • GIS Technologies and Related Disciplines;
  • GIS for Geology;
  • Natural Sciences and Ecosystems;
  • Web Cartography and Digital Atlases;
  • Map Design and Production;
  • Cartographic Visualization;
  • Geodetic Coordinate Systems and Map Projections;
  • 3D Cartographic Modelling;
  • Cartography and GIS in Education;
  • Geoinformation for Smart Cities;
  • Geo-Spatial Analysis and Data Mining;
  • Virtual Geographic Environment;
  • UAV Applications and New Trends;
  • Geospatial Data Acquisition by Remote Sensing Technologies for Cartographic Purposes.

Best papers in the Special Session Digital Earth for Smart Solutions

A special session “Digital Earth for Smart Solutions” was held with the support of the International Society for Digital Earth (ISDE). The papers presented in this session by Prof. Jie Shen and Dr. Alexey Noskov were recommended to the International Journal of Digital Earth (IJDE) for the peer-review procedure.

best paper awards

Two of the presented works were carried out in the frame of the WeGovNow project.

The proposed concepts and solutions will be deeply integrated with the OHSOME big spatial data analytics platform, which is being developed as part of the HeiGIT project by the GIScience research group.

The Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS) of the University of Santiago de Compostela (Prof. Rivera) is hosting a workshop on 27 September 2018 that is dedicated to current research progress in LiDAR data analysis and shall bring together academic research and industry.

Prof. Bernhard Höfle, head of the 3DGeo Group of GIScience Heidelberg and IWR, was invited to give a keynote speech on the question of whether “Big Point Clouds == Deep Information?” and to present Heidelberg’s current progress in geospatial 3D/4D point cloud analysis. Feel free to follow the talk in the online video stream: https://citius.usc.es/novidades/eventos/lecture-big-pont-clouds-deep-information-current-progress-geospatial-3d4d-point

GSIS Special Issue: Crowdsourcing for Urban Geoinformatics.

Geo-Spatial Information Science (GSIS), Volume 23, Issue 3. Taylor & Francis.

Guest Editors:
Hongchao Fan, Rene Westerholt, João Porto de Albuquerque, Bernd Resch and Alexander Zipf

TOC:

In this blog post we report on a small study of the size of the OpenStreetMap history data set, which we carried out in the context of developing the OSHDB. We estimated the size of OSM’s history in terms of raw data as it would be represented in a systems programming language such as C or Rust. We were interested in the raw amount of space the data would occupy, i.e., without the overhead of general-purpose data formats like XML and without employing data compression that would lead to a computational overhead.

This size estimate gives us

  1. an estimation of the hardware resources required to store and process the full OSM history;
  2. a base reference to which we can compare our efforts of representing the data more compactly.

Overall Methodology and Results

We first defined systems-level data structures for the OSM data objects changeset, member, nd (node reference), node, relation, tag and way. We calculated the size of these data structures and multiplied them by the counts of the corresponding XML elements in the OSM history dump. The results are presented in the following table.

Table 1: Number of XML elements, size of data structures and total data size (reported for the full history planet XML file downloaded on 2018-08-08).

XML element name | data structure size (bytes) | XML element count | OSM history size (bytes)
---------------- | --------------------------- | ----------------- | ------------------------
changeset        |                          65 |        61,149,797 |            3,974,736,805
member           |                          13 |     2,127,505,107 |           27,657,566,391
nd               |                           8 |    14,537,932,143 |          116,303,457,144
node             |                          57 |     7,024,909,604 |          400,419,847,428
relation         |                          56 |        24,631,930 |            1,379,388,080
tag              |                          16 |     4,861,971,375 |           77,791,542,000
way              |                          50 |       957,849,585 |           47,892,479,250
total size       |                             |                   |          675,419,017,098
variant          |                             |                   |          603,352,497,027
compressed XML   |                             |                   |          111,691,457,431
uncompressed XML |                             |                   |        2,001,921,015,360

In Table 1, one immediately sees that the nodes consume more space than the rest of the data, with the ways’ node references coming second. Overall, OSM’s history occupies between 600 GB and 700 GB, which is roughly 6 times the size of the compressed XML (or 10 times the .pbf) and 1/3 of the uncompressed XML. This estimate does not include the concrete string representations of the tags, which we represent by unsigned integer IDs. In our experience, these strings take less than 1% of the total size of the data and, hence, are not a major factor in our estimate.
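For readers who want to retrace the arithmetic behind Table 1, the following small Rust program (our own illustration, not part of the OSHDB code base) recomputes the per-element and total sizes from the data structure sizes and element counts:

fn main() {
    // (XML element name, data structure size in bytes, XML element count),
    // copied from Table 1
    let rows: [(&str, u64, u64); 7] = [
        ("changeset", 65, 61_149_797),
        ("member", 13, 2_127_505_107),
        ("nd", 8, 14_537_932_143),
        ("node", 57, 7_024_909_604),
        ("relation", 56, 24_631_930),
        ("tag", 16, 4_861_971_375),
        ("way", 50, 957_849_585),
    ];
    let mut total: u64 = 0;
    for (name, size, count) in rows {
        let bytes = size * count; // raw size contributed by this element type
        total += bytes;
        println!("{:<16} {:>3} B x {:>14} = {:>19} B", name, size, count, bytes);
    }
    println!("total: {} bytes (~{} GB)", total, total / 1_000_000_000);
}

Running it reproduces the total of 675,419,017,098 bytes reported in the table.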

The raw data structures have been designed to closely represent the XML data structure. That is, there is still some room for space optimizations. We could leave out the redundant user id stored at each node, way and relation because it is also stored in the corresponding changeset. Further, we could store the visibility bit in the highest bit of the version number, instead of representing it as an otherwise unused byte. With these two simple optimizations we could save 9 bytes per node, way and relation, which results in the total size reported as “variant”. With the current number of tag keys and tag values, we could also halve the size of the tag data structure, which would save another 38 GB.
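To illustrate the second optimization: since OSM version numbers never come anywhere near 2^63, the visibility bit can be packed into the most significant bit of the 64-bit version number, along the lines of the following sketch (the helper names are ours, not OSHDB API):

// Pack the visibility flag into the otherwise unused highest bit
// of the 64-bit version number.
const VISIBLE_BIT: u64 = 1 << 63;

fn pack(version: u64, visible: bool) -> u64 {
    debug_assert!(version < VISIBLE_BIT); // assumption: top bit is never used
    if visible { version | VISIBLE_BIT } else { version }
}

fn version(packed: u64) -> u64 {
    packed & !VISIBLE_BIT // mask the flag away
}

fn is_visible(packed: u64) -> bool {
    packed & VISIBLE_BIT != 0
}

fn main() {
    let packed = pack(3, true);
    assert_eq!(version(packed), 3);
    assert!(is_visible(packed));
}

This saves the otherwise unused visibility byte at the cost of two bit operations per access.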

Concerning question (1), the hardware resources, this result is quite promising. As the above numbers report only the raw data size, we would have to add some indices to make effective use of the data. However, assuming the indices are smaller than the data itself, the data would still fit into less than 1 TB. In the era of terabyte SSDs, this is not too big.

Technical Details

For the technically interested reader, the remainder of this blog post discusses the detailed layout of the data structures and how we computed the above numbers. Recall that our goal is to estimate the size of OSM’s history if we were to store it in plain old data structures. This shall serve us as a baseline for evaluating our efforts in finding more compact representations.

Data Model

We describe the data structures with a notation roughly borrowed from Rust in order to easily compute their sizes. That is, we use types like u64, denoting a 64-bit unsigned integer, or i32, denoting a 32-bit signed integer, which consume 8 and 4 bytes respectively; time64 denotes a 64-bit timestamp. A type in square brackets denotes an array of objects of the given type. The data structures closely follow OSM’s XML format, so there is not much to explain here. The comment in the first line of each structure reports the total size of the data structure not including the arrays, because the latter are counted separately.

changeset { // 65 bytes
  id: u64,
  creation_time: time64,
  closing_time: time64,
  open: bool,
  user_id: u64,
  min_lon: i32,
  min_lat: i32,
  max_lon: i32,
  max_lat: i32,
  num_changes: u32, // with current limit of 10000, u16 would suffice here
  comments_count: u32,
  tags_count: u64,
  tags: [tag],
}

node { // 57 bytes
  id: u64,
  lat: i32,
  lon: i32,
  version: u64,
  timestamp: time64,
  changeset: u64,
  user_id: u64,
  visible: bool,
  tags_count: u64,
  tags: [tag],
}

member { // 13 bytes
  type: u8,
  reference: u64,
  role: u32,
}

relation { // 56 bytes
  id: u64,
  version: u64,
  timestamp: time64,
  changeset: u64,
  user_id: u64,
  visible: bool,
  members_count: u64,
  members: [member],
  tags_count: u64,
  tags: [tag],
}

tag { // 16 bytes (maybe 2x u32 = 8 bytes are sufficient here)
  key: u64,
  value: u64,
}

way { // 50 bytes
  id: u64,
  version: u64,
  timestamp: time64,
  changeset: u64,
  user_id: u64,
  visible: bool,
  nodes_count: u16,
  nodes: [u64],     // node ids
  tags_count: u64,
  tags: [tag],
}
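
As a side note, writing these structures in actual Rust requires a packed representation to reproduce the byte counts above, because the compiler would otherwise insert alignment padding. A minimal sketch (our illustration), using the member structure as an example:

use std::mem::size_of;

// Without #[repr(C, packed)], alignment padding would inflate
// this structure from 13 to 16 bytes.
#[repr(C, packed)]
struct Member {
    r#type: u8, // `type` is a Rust keyword, hence the raw identifier
    reference: u64,
    role: u32,
}

fn main() {
    assert_eq!(size_of::<Member>(), 13); // matches the 13 bytes of Table 1
    println!("member: {} bytes", size_of::<Member>());
}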

Computing the Estimates

The next step was to compute the size of the full history planet XML file. The size of the compressed XML file was easily read from the file system. Because the uncompressed XML file would not fit on the hard disk of the employed machine, we computed the size of the uncompressed XML by running lbzcat history-latest.osm.bz2 | wc -c.

The next step was to process the XML file in order to count the number of occurrences of each of the OSM data types. To this end, we implemented a small program that counts the number of occurrences of XML tags in the file. Assuming the file is well-formed, the sum of start tags and empty-element tags corresponds to the number of XML elements. We used lbzcat to uncompress the packed XML file in parallel and piped the result into our program. In order to validate the results and to gain some experience with how suitable different programming approaches are for our purposes, we implemented this program in three different programming languages: Python, Haskell and Rust.

The Python and Haskell implementations exploit the fact that OSM’s XML dump stores each XML tag on a separate line. So counting the occurrence numbers was easily done by inspecting the first word on each line. The Rust implementation employs the full-fledged XML pull parser quick_xml, as we wanted to be sure that this line-based shortcut was correct.
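
A minimal Rust version of this line-based trick (our illustration; the actual line-based implementations were written in Python and Haskell) could read the uncompressed XML from standard input as follows:

use std::collections::HashMap;
use std::io::{self, BufRead};

fn main() -> io::Result<()> {
    // Exploit that the OSM XML dump stores each XML tag on its own line:
    // the element name is the first word after '<'.
    let mut counts: HashMap<String, u64> = HashMap::new();
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let line = line?;
        if let Some(rest) = line.trim_start().strip_prefix('<') {
            // Skip end tags, declarations and comments (</..., <?..., <!...).
            if rest.starts_with(|c: char| c == '/' || c == '?' || c == '!') {
                continue;
            }
            let name: String = rest
                .chars()
                .take_while(|c| !c.is_whitespace() && *c != '>' && *c != '/')
                .collect();
            *counts.entry(name).or_insert(0) += 1; // start or empty-element tag
        }
    }
    for (name, count) in &counts {
        println!("{}: {}", name, count);
    }
    Ok(())
}

Fed with the output of lbzcat history-latest.osm.bz2, this counts both start tags and empty-element tags, matching the element counting described above.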

Performance Considerations

We computed the above numbers on an Intel® Core™ i7-7500U CPU @ 2.70GHz × 4. The Python and Haskell implementations took about 10 hours to process the data; the Rust implementation took only 4 hours. This is particularly remarkable because the Rust implementation solves the more complex task of processing the full XML.

We also have to consider that the bz2 compression causes a significant overhead. For instance, the Rust implementation was running at 60% CPU while lbzcat was eating up the remaining 340% of the 400% CPU provided by the four cores. Limiting lbzcat to three cores made our implementation use even less CPU, which indicates that the decompression is the limiting factor here. In fact, merely decompressing the file by running lbzcat history-latest.osm.bz2 > /dev/null took 3 hours. Therefore, we measured the performance of our implementation on a smaller region that we were able to uncompress in advance. Processing the uncompressed XML directly showed a performance improvement of 30-40% compared to uncompressing the file on the fly. This roughly matches the fact that our program used 100% CPU with a priori decompression, which would take our 4 hours down to roughly 2.5 hours. By exploiting the fact that there is a fixed set of possible XML tags, we can speed up our implementation by a factor of 2 on the small data set, which would take the processing time for the global history dump down to roughly 1.25 hours.

Of course, XML is a less than optimal data format for such a task. Our main reason for working with the XML dump was the ease of implementing the processing. Both the Python and the Haskell program have fewer than 10 lines of code and were implemented in a few minutes each. With about 20 lines of code, the Rust implementation is slightly more complex, as it really processes the XML. The obvious alternative is the full history planet PBF file, which is designed to be more friendly to the machine but involves considerably more work for the programmer. For comparison, we implemented the same counting process for the PBF dump in about 50 lines of C++ using libosmium. The computation took 0.5 hours with the process running at 200% CPU. Apparently, reading the data is still a bottleneck that prevents the process from running at 400% CPU.

The major factors we identified for the performance benefit of the C++ implementation are:

  • The PBF dump is easier on disk IO: 68 GB vs 2 TB is about 1/30 of the data transfer;
  • The PBF dump is easier on processing: an estimated 700 GB of structured data vs 2 TB of unstructured data is about 1/3 of the data amount and reduces the effort of parsing and structuring the data;
  • The PBF dump is easier to parallelize due to the data being sharded.

Considering these factors, the performance of the Rust implementation is not bad.

Tonight, Robert Danziger (Mamapa) and Melanie Eckle (disastermappers heidelberg / HeiGIT) will present the Mamapa project in Ludwigshafen. They will also explain the potential of OpenStreetMap (OSM) and illustrate how OSM can be used and how everyone can contribute to it.

Thursday, 20 September 2018, 6:00 pm
Zentralbibliothek, training room

Further information here and at 0621 504-3533;
admission is free.

Anyone who wants to learn more about the Mamapa project, or get directly involved in it, has the next opportunity on 27 September 2018: the next Mamapa event in Mannheim takes place at 2:30 pm at the Mannheimer Abendakademie.

Further information is available at mamapa.org, and you are welcome to register at: https://mamapa.org/register/tandem-partner.

The MANNHEIMER MAPATHONS (“MAMAPA”) project has two main goals: first, to support the integration of migrants in Germany, and second, to contribute to international humanitarian aid.

Through a series of locally organized mapathon events, participating migrants are offered a practical introduction to methods of modern humanitarian cartography, which advances both project goals at once. A mapathon is a proven way of supporting international relief efforts wherever people are in need.

We are happy to announce that we are going to be part of INTERGEO 2018 in Frankfurt this year. Billed as the “global hub of the geospatial community”, the event has quite something to offer, with hundreds upon hundreds of innovative companies showcasing their products and visions.

We are joining up with the Federal Agency for Cartography and Geodesy (BKG) and organising a booth in hall 12.1 (12.1F.017, to be exact). It is worth mentioning that BKG will be holding one of the keynote speeches at the exhibition, reflecting the increasing digitalisation around us - don’t miss out on it!

Feel free to visit us and discover the growing ecosystem around openrouteservice.org with all the many possibilities - we are definitely looking forward to meeting you!

Call for papers

ISCRAM is a forum where researchers and practitioners from all around the world meet every year to share experiences and raise challenges in all aspects of the design, development and use of information systems to improve the management of crisis and disaster situations. The 16th International Conference on Information Systems for Crisis Response and Management will be held in Valencia (Spain) from May 19 to 22, 2019, following successful previous editions held in Brussels (2004 & 2005), Newark (2006), Delft (2007), Washington DC (2008), Gothenburg (2009), Seattle (2010), Lisbon (2011), Vancouver (2012), Baden-Baden (2013), State College (2014), Kristiansand (2015), Rio de Janeiro (2016), Albi (2017) and Rochester (2018).

ISCRAM2019 invites two categories of papers:

  • CoRe: Completed Research (from 4000 to 8000 words).
  • WiPe: Work In Progress (from 3000 to 6000 words).

We invite you to submit papers to track T6 – Geospatial Technologies and Geographic Information Science for Crisis Management (GIS).

Geospatial Technologies and Geographic Information Science for Crisis Management (GIS)

With disasters and disaster management being an “inherently spatial” problem, geospatial information and technologies have been widely employed to support disaster and crisis management. This includes SDSS and GIS architectures, VGI, spatial databases, spatial-temporal methods, as well as geovisual analytics technologies, which have a great potential to build risk maps, estimate damaged areas, define evacuation routes and plan resource distribution. Collaborative platforms like OSM have also been employed to support disaster management (e.g., near real-time mapping). Nevertheless, all these geospatial big data pose new challenges for not only geospatial data visualization, but also data modeling and analysis; existing technologies, methodologies and approaches now have to deal with data shared in various formats, at different velocities and with uncertainties. Furthermore, new issues have also been emerging in urban computing and smart cities for making communities more resilient against disasters. In line with this year’s conference theme, the GIS track particularly welcomes submissions addressing aspects of individual-centric geospatial information in disaster risk and crisis research. This includes SDSS, near-real-time mapping, situational awareness, VGI, spatiotemporal modeling, urban computing and other related aspects. We seek conceptual, theoretical, technological, methodological and empirical contributions, as well as research papers employing different methodologies, e.g., design-oriented research, case studies and action research. Solid student contributions are welcome.

TRACK TOPICS

Track topics are therefore focused on, but not limited to, the following list.

  • Geospatial data analytics for crisis management

  • Location-based services and technologies for crisis management

  • Geospatial ontology for crisis management

  • Geospatial big data in the context of disaster and crisis management

  • Geospatial linked data for crisis management

  • Urban computing and geospatial aspects of smart cities for crisis management

  • Spatial Decision Support Systems for crisis management

  • Individual-centric geospatial information

  • Remote sensing for crisis management

  • Geospatial intelligence for crisis management

  • Spatial data management for crisis management

  • Spatial data infrastructure for crisis management

  • Geovisual analytics for crisis management

  • Spatial-temporal modeling in disaster and crisis context

  • Crisis mapping and geovisualization

  • Empirical case studies

Chairs: João Porto de Albuquerque, Alexander Zipf, and Flávio E. A. Horita

https://iscram2019.webs.upv.es/submissions/call-for-papers/geospatial-technologies-and-geographic-information-science-for-crisis-management-gis/

IMPORTANT DATES

– Submission deadline CoRe papers: December 1, 2018
– Decision notification CoRe papers: January 7, 2019

– Submission deadline WiPe papers: February 8, 2019
– Decision notification WiPe papers: March 8, 2019

Please use the ISCRAM paper template for the preparation of your submission.

As previously announced, Sabrina Marx and Melanie Eckle recently visited Dar es Salaam to attend the Missing Maps Members Gathering and to join the FOSS4G and HOT Summit community for their joint annual gatherings.

The Missing Maps Members Gathering was the first side event of the conference week. As in previous years, the Missing Maps members made use of the face-to-face time to discuss last year’s activities and future plans. Please stay tuned for the full overview, which will be published soon in collaboration with the other attending Missing Maps members.

The main focus of the FOSS4G and HOT Summit that followed was, as usual, on open and open-source geodata and applications, this year with special regard to approaches that make sure “to leave no one behind”.

In this vein, Sabrina and Melanie presented the openrouteservice, openrouteservice for disaster management and the ohsome platform. In their presentations, they showed how the communities can make use of the services for their own use cases, asked for feedback and invited the community to contribute to further developments.

We sincerely thank the organizing team for this great event and its inclusive conference approach, which we will make sure to follow at our State of the Map conference 2019 in Heidelberg as well. We furthermore thank the communities for the great discussions and feedback and look forward to taking our projects further with you soon.

Missing Maps Gathering, pic by Humanitarian OSM Team

Only two days left to register for the Fachtagung Katastrophenvorsorge in Berlin, 22-23 October 2018. Join our workshop on Geoinformation & Disaster Preparedness, which we are running together with the Federal Office of Civil Protection and Disaster Assistance (BBK) and the German Aerospace Center (DLR). The symposium is organized by the German Red Cross (DRK).
