Generating Physical Schemas From a PIM

A while back, my colleague Barry Schimpf touched upon some of the tools that we use in conjunction with the Platform Independent Model (PIM). Today, I will delve into one of the tools we use to generate physical schemas from the PIM. Before, I jump in, let’s review what a PIM is and what it does.

The PIM is an approach we have developed to enable proper configuration management of geospatial data models. We have used it successfully for federal customers to track multiple versions of complex data models, validate physical implementations of those models, and support profiling and adaptation of the models across user communities. The focus of a PIM is on the data model as opposed to the actual geospatial data so a PIM itself doesn’t store any geospatial feature data. It is merely a representation of the logical model; defining the feature types, attributes, relationships, and constraints necessary to build a geospatial data set that is in compliance with a particular data standard.

So what does a PIM look like? It is actually just a relational database that contains all of the information described above, acting as a quasi-logical data model. It is designed to bridge the gap between a purely logical data model and the physical models we deal with on a daily basis in the form of data sets such as Esri geodatabases, SpatiaLite, or PostGIS.

In order to make the PIM more useful, we have developed a suite of tools to perform key functions. In this post, I will discuss our script generator tool, which uses the PIM to generate the data definition language (DDL) scripts necessary to create a physical schema in either SpatiaLite, PostGIS, or Geography Markup Language (GML). The first step is to connect to a PIM. By design, an instance of a PIM database is intended to manage one data model. We are examining strategies for removing this restriction but the current method of identifying the root version of a data model prevents this at the moment. Currently a PIM can be stored in Microsoft SQL Server (the original platform) or SQLite. We are in the process of building support for PostgreSQL and support for Microsoft Access has been deprecated.

Once a connection has been made to the PIM, we must simply choose the version and configuration from which we want to generate the physical schema. Configurations are how we store profiles of a version, enabling us to tailor data content (features, geometry types, attributes) for uses cases such as differing resolutions. Once these are selected, we can choose the feature types we want to generate. PIM feature types will represent individual tables the resulting spatial database.

Once the desired feature types have been selected, we can move on to the options tab to specify the details of the generation process. In this case, we will be generating a script for PostGIS. We are also selecting all of the generation options in order to apply default values, artificial keys, and constraints. Default values are easily understood but the other options require a bit of explanation.

By choosing artificial keys, we are adding the DDL to add a serial integer (autonumber) to each table and define that column as the primary key. By building constraints and enumerations,we are doing two things: 1) for attributes that allow only one value, such as a feature type code, or a range of values, we generate the DDL for a check constraint and 2) for attributes that allow a list of values, we add the DDL to create a lookup table with the values and add a BEFORE INSERT/BEFORE UPDATE trigger to validate data in the feature table against the lookup. In the case of PostGIS, this means we generate a trigger function and a trigger.

Once we have specified our options, we can run the generation, which places the output in a separate window.

From here, we can save the script to disk or copy/paste it into another tool such as pgAdmin. This gives the user the ability to fine tune the script before execution. For example, the user may want to make the automatically generated function names conform to an organizational naming convention. When ready, we can execute the script to generate the database objects.

If we take a close look at the ADMINISTRATIVE_BOUNDARY_C table, we can see that a check constraint and a trigger were created as part of the generation.

The script generator tool described here is one of two tools we have developed for producing physical models from the logical representation in a PIM. The other tool, the geodatabase generator, exclusively produces Esri geodatabases and will be the topic of my next post. As can be seen, using tools such as those described here can facilitate the consistent creation of reference implementations of data standards across various platforms. For organizations charge with managing such standards, these tools can streamline the interaction with physical databases as they are populated and as they grow and evolve.

The script generated in this example can be found here.

This post was written by:
Bill Dollins
Senior Vice President

See all the posts in this series

For more information on this post, the PIM, Zekiah's geospatial standards support, or our data configuration management capabilities, please e-mail us at contact@zekiah.com