Fusion Registry 10 Use Cases

Here are details of some of the popular use cases for Fusion Registry.

Statistics Data Warehouse

Fusion Registry is fundamentally a statistical data warehouse which uses the SDMX standard as its data model. It therefore works well for any application that needs a repository for highly-structured statistical datasets that can be easily queried, searched and analysed. Typical uses are by central banks as an internal data warehouse for statisticians and analysts, and by international organisations as a searchable public data service for socioeconomic statistics. Reference metadata is supported, allowing the addition of general unstructured information such as a narrative on a dataset’s collection methodology. This can be of great benefit to the data warehouse’s end users, providing anything from simple comments and guidance on how to interpret the data to in-depth analysis and insight.

SDMX implements a form of star schema where Dataflows are equivalent to fact tables, and Codelists equivalent to dimension tables. Its multi-dimensional model natively supports the concept of 'data cubes' so will be familiar to anyone with exposure to traditional data warehousing. The key difference here is that Fusion Registry is tuned for time series. That makes it more suitable for handling variables with observations over time like interest rates or access to education, rather than cross-sectional datasets like census.
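The star-schema analogy can be illustrated with a toy model. The following is a minimal sketch only, not how Fusion Registry stores data: the codelist contents, the dimension names and the helper functions `slice_cube` and `label_key` are all hypothetical and chosen purely for illustration.

```python
# Toy illustration of the star-schema analogy (all names and values are made up).

# Codelists play the role of dimension tables: code -> human-readable label.
codelists = {
    "FREQ": {"A": "Annual", "M": "Monthly"},
    "REF_AREA": {"FR": "France", "DE": "Germany"},
}

# A dataflow plays the role of the fact table. Each series is keyed by one
# code per dimension and holds observations over time.
dimensions = ("FREQ", "REF_AREA")
series = {
    ("A", "FR"): {"2020": 1.1, "2021": 1.6},
    ("A", "DE"): {"2020": 0.4, "2021": 3.2},
}


def slice_cube(series, dimensions, **fixed):
    """Return the series whose key matches every fixed dimension code."""
    positions = {dimensions.index(name): code for name, code in fixed.items()}
    return {
        key: observations
        for key, observations in series.items()
        if all(key[i] == code for i, code in positions.items())
    }


def label_key(key):
    """Join a series key to the codelists to get readable labels."""
    return tuple(codelists[dim][code] for dim, code in zip(dimensions, key))


french = slice_cube(series, dimensions, REF_AREA="FR")
french_labels = [label_key(key) for key in french]
```

Fixing one dimension and keeping the time axis intact is the typical access pattern for time series, which is the case the paragraph above says the product is tuned for.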

Many users choose to load data into Fusion Registry’s own storage, but its virtualisation engine can also dynamically and transparently integrate data from a range of other sources including SQL databases and SDMX web services.

Data Collection and Integration

Collection of data from multiple providers is a common official statistics use case. It is well supported by Fusion Registry and SDMX through the concept of Provision Agreements, which describe who is allowed to report data for a Dataflow. Applying optional Reporting Constraints adds further rules controlling what data each provider is allowed to report. For example, a CPI Dataflow that collects data from multiple countries will have a Provision Agreement for each country, each with a Reporting Constraint enforcing a rule that the reporter can provide data only for their own country.
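The relationship between Provision Agreements and Reporting Constraints can be sketched as a simple check. This is an illustration under stated assumptions, not Fusion Registry's implementation: the `ProvisionAgreement` class, the `may_report` function and the provider identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProvisionAgreement:
    """Hypothetical model: links one data provider to one dataflow."""
    provider: str
    dataflow: str
    # Reporting constraint: dimension id -> set of codes the provider may report.
    # An empty dict means no additional restriction.
    allowed: dict = field(default_factory=dict)


def may_report(agreements, provider, dataflow, series_key):
    """True if the provider has an agreement for the dataflow and the
    series key satisfies that agreement's reporting constraint."""
    for agreement in agreements:
        if agreement.provider == provider and agreement.dataflow == dataflow:
            return all(
                series_key.get(dim) in codes
                for dim, codes in agreement.allowed.items()
            )
    return False  # no provision agreement, so reporting is not allowed


# One agreement per country, each constrained to that country's own data.
agreements = [
    ProvisionAgreement("FR_STATS", "CPI", {"REF_AREA": {"FR"}}),
    ProvisionAgreement("DE_STATS", "CPI", {"REF_AREA": {"DE"}}),
]

own_country = may_report(agreements, "FR_STATS", "CPI", {"REF_AREA": "FR"})
other_country = may_report(agreements, "FR_STATS", "CPI", {"REF_AREA": "DE"})
```

Here `own_country` is accepted and `other_country` is rejected, mirroring the CPI example in the text.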

Data can be reported using a number of alternative formats and through a range of channels including interactive upload of files using Fusion Registry’s web user interface, programmatic submission through the SDMX REST API and on-demand pull by registering the URL of an SDMX web service or file that Fusion Registry can query. Alternative formats include SDMX XML, JSON and EDI.

Excel is also supported through Fusion Registry’s Excel Reporting Template feature. This makes the data reporting process simpler and more reliable by automatically generating Excel spreadsheet forms customised for each data provider, which need only be completed with the required observation values and returned.

Fusion Registry will automatically and seamlessly integrate data from multiple providers or sources as required, irrespective of how the data is supplied and in what format. This is particularly useful for datasets which combine data from a number of regional or sub-regional suppliers.

Reporting Data in SDMX

Fusion Registry helps data reporters with the task of reporting to organisations like Eurostat who require the data in SDMX.

At the simplest level, the data collector’s SDMX structures can be easily loaded into the data reporter's Registry. These provide an unambiguous definition of how the data should be structured and classified. Fusion Registry's structural validation functions can then be used to check that the data produced is structurally correct before submission.
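To make the idea of structural validation concrete, the sketch below checks one observation against a set of dimensions and codelists. It is a simplified illustration and does not reproduce Fusion Registry's validation rules: the function `validate_observation`, the example structure, and the use of `TIME_PERIOD` and `OBS_VALUE` as required fields are assumptions made for the example.

```python
def validate_observation(observation, dimensions, codelists):
    """Return a list of structural errors; an empty list means valid."""
    errors = []
    for dim in dimensions:
        if dim not in observation:
            errors.append(f"missing dimension {dim}")
        elif observation[dim] not in codelists[dim]:
            errors.append(f"invalid code {observation[dim]!r} for {dim}")
    # Assumed here: every observation needs a time period and a value.
    for name in ("TIME_PERIOD", "OBS_VALUE"):
        if name not in observation:
            errors.append(f"missing {name}")
    return errors


# Hypothetical structure supplied by the data collector.
dimensions = ("FREQ", "REF_AREA")
codelists = {"FREQ": {"A", "M"}, "REF_AREA": {"FR", "DE"}}

good = {"FREQ": "A", "REF_AREA": "FR", "TIME_PERIOD": "2021", "OBS_VALUE": 1.6}
bad = {"FREQ": "Q", "TIME_PERIOD": "2021", "OBS_VALUE": 1.6}

good_errors = validate_observation(good, dimensions, codelists)
bad_errors = validate_observation(bad, dimensions, codelists)
```

Collecting every error rather than stopping at the first lets a reporter correct a whole file in one pass before submission.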

Often the biggest problem in data reporting is producing the data in one of the accepted SDMX formats - usually SDMX-ML (XML). Fusion Registry can help here by accepting data to load as plain Excel, but providing the option to export in a variety of SDMX formats. Crucially, this avoids data reporters having to generate SDMX content directly from data production systems.
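Part of what makes this conversion awkward to do by hand is that spreadsheets are usually laid out wide (one column per period), while SDMX data is organised as series and observations. The sketch below shows only that reshaping step, using a CSV export of a sheet as a stand-in for Excel. It is not Fusion Registry's converter and does not produce any SDMX format; the sheet layout, the column names and the function `wide_to_long` are assumptions.

```python
import csv
import io

# Stand-in for a worksheet exported as CSV: dimensions first, then one
# column per time period (layout assumed for illustration).
SHEET = """REF_AREA,INDICATOR,2020,2021
FR,CPI,1.1,1.6
DE,CPI,0.4,
"""


def wide_to_long(text, dimension_columns):
    """Turn each non-empty period cell into one observation record."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        key = {dim: row[dim] for dim in dimension_columns}
        for column, value in row.items():
            if column in dimension_columns or value in ("", None):
                continue  # skip dimension columns and blank cells
            records.append(
                {**key, "TIME_PERIOD": column, "OBS_VALUE": float(value)}
            )
    return records


records = wide_to_long(SHEET, ["REF_AREA", "INDICATOR"])
```

Blank cells are skipped rather than reported as zero, so a missing value in the sheet does not become a spurious observation.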

Fusion Registry’s SDMX REST API can be used to submit data by publishing a URL for the data collector to pull from, as in the case of IMF SDDS Plus subscribers. But for many use cases, loading data into Fusion Registry as Excel and exporting it as SDMX files is sufficient. Running their own Fusion Registry installation gives data reporters the additional benefits of its statistical data warehousing, data collection and data dissemination functionality.

Data Dissemination

Fusion Registry provides good support for data dissemination use cases, with a wide range of services including web portals, data APIs, and web and desktop analysis tools.

All dissemination services are driven from the SDMX REST API which can either be exposed directly as a data API service or used as the server-side interface for data-driven web applications. The API supports all of the standard SDMX formats, including SDMX-JSON 1.0 which was primarily designed as a dissemination format (rather than data exchange) for use with web applications like data portals because of the relative ease with which it can be handled with standard JavaScript.
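As an illustration of what a client of such a data API does, the sketch below builds a data query URL following the general pattern of the SDMX REST standard: a `data` resource, a dataflow reference, a dot-separated series key, and optional `startPeriod` and `endPeriod` parameters. The base URL, the dataflow reference and the function `data_query_url` are hypothetical; the actual entry point of a given Fusion Registry installation should be taken from its own documentation.

```python
from urllib.parse import quote, urlencode


def data_query_url(base_url, flow, key_parts, start_period=None, end_period=None):
    """Build an SDMX-REST-style data query URL.

    key_parts holds one list of codes per dimension, in dimension order.
    Several codes in one position are joined with '+', and an empty list
    leaves that position blank, which acts as a wildcard.
    """
    key = ".".join("+".join(codes) for codes in key_parts) or "all"
    path = (
        f"{base_url.rstrip('/')}/data/"
        f"{quote(flow, safe=',')}/{quote(key, safe='.+')}"
    )
    params = {
        name: value
        for name, value in (("startPeriod", start_period), ("endPeriod", end_period))
        if value is not None
    }
    return f"{path}?{urlencode(params)}" if params else path


# Hypothetical server and dataflow: annual data for France or Germany,
# any value in the third dimension, from 2020 onwards.
url = data_query_url(
    "https://registry.example.org/rest/",
    "ESTAT,CPI,1.0",
    [["A"], ["FR", "DE"], []],
    start_period="2020",
)
```

The function only constructs the URL. The response format would normally be selected separately, for example through HTTP content negotiation.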

For web application developers who aren’t familiar with SDMX-JSON, Metadata Technology’s IM-JS JavaScript framework, which comes with Fusion Registry Enterprise Edition, provides a simpler client-side interface to the REST API.

Many dissemination use cases require an interactive tool for users to search the data catalogue, visualise and analyse selected datasets and series, and retrieve data in useful formats. Fusion Data Browser meets this requirement. As a standalone web application, it can be either used internally within organisations or embedded into public-facing sites or web data portals.

For public dissemination it's best to use a cluster of Fusion Edge Servers to drive external dissemination services. Fusion Edge Server is a lightweight read-only data server providing only an SDMX REST API. It’s designed to be deployed in the DMZ or other exposed location and allows the master Fusion Registry service to be kept private. Data is refreshed when required using the 'push' method from the private Fusion Registry, which avoids the need for the Edge Servers to maintain any active connections back through the firewall, as these would represent a security risk. Embargo is supported, making it possible to push an update to the Edge Servers for publication at a specified later point in time. When the embargo time arrives, the Edge Servers perform an atomic switch from the current to the new set of data for publication, typically taking less than 1 second, which is important for time-sensitive publications.
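The embargo behaviour can be pictured as a staged dataset that replaces the live one in a single step once the release time has passed. The following is a conceptual sketch only and says nothing about how Fusion Edge Server is actually implemented: the `EdgeStore` class, its methods and the example data are hypothetical.

```python
import threading
from datetime import datetime, timedelta, timezone


class EdgeStore:
    """Hypothetical read-only store with one embargoed update slot."""

    def __init__(self, dataset):
        self._live = dataset
        self._pending = None  # (release_at, dataset) or None
        self._lock = threading.Lock()

    def push(self, dataset, release_at):
        """Stage a new dataset; it stays invisible until release_at."""
        with self._lock:
            self._pending = (release_at, dataset)

    def read(self, now=None):
        """Return the live dataset, switching first if the embargo has lifted."""
        if now is None:
            now = datetime.now(timezone.utc)
        with self._lock:
            if self._pending is not None and now >= self._pending[0]:
                # The switch is a single reference swap under the lock, so a
                # reader sees either the old data or the new, never a mixture.
                self._live, self._pending = self._pending[1], None
            return self._live


release = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
store = EdgeStore({"CPI": 1.1})
store.push({"CPI": 1.6}, release_at=release)

before = store.read(now=release - timedelta(seconds=1))
after = store.read(now=release)
```

Because the new dataset is fully staged before the release time, the switch itself involves no loading work, which is one plausible way to keep the changeover short.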

Dissemination using the IMF's eGDDS, SDDS and SDDS Plus Standards

For countries subscribing to the IMF's eGDDS, SDDS and SDDS Plus dissemination standards, Fusion Registry will automatically generate the required National Summary Data Page (NSDP). The only prerequisites are to maintain an up-to-date copy of the relevant IMF structures in Fusion Registry and to load the required data. Both Excel and SDMX are accepted.

The IMF's structures can be retrieved from SDMX Central by exporting to file and loading using the web user interface. More efficiently though, Fusion Registry's Environment Synchronisation tool can be used to pull selected structures directly from SDMX Central's REST API. And from Fusion Registry 10.1, the Data Portal tool will allow selected structures to be synchronised on a defined schedule.