Using OS AddressBase for Address Search in QGIS

In this blog post we’ll learn how to use Ordnance Survey AddressBase data with the QGIS Discovery plugin for searching addresses.

Discovery Plugin for QGIS

Before we start

The AddressBase data will be loaded into a PostGIS table for Discovery to query. At this stage we should already have a functioning PostgreSQL / PostGIS installation.

A previous blog post describes how to quickly set up such an environment.

Creating the addressbase table

Let’s now create a table for storing the addressbase data. In the example below we’ll create a table called addressbase in the os_address schema.

The script below can be executed through pgAdminIII.

To run the script:

  1. Open pgAdminIII
  2. Connect to your destination database
  3. Select Query tool from the Tools menu
  4. Paste the code below into the Query tool
  5. Press F5 to execute the query (it may take a few seconds to complete)

When the query has finished you should see Query returned successfully with no result in … seconds. in the Messages panel:

pgAdminIII Messages Panel

At this point we should be able to locate the new addressbase table within the os_address schema:

addressbase Table

If you can’t see the schema / table you probably need to refresh the schemas / tables views in pgAdminIII’s Object browser panel by hitting F5.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
-- Create the destination schema if required
CREATE SCHEMA IF NOT EXISTS os_address;

-- Create a function which will populate the full_address and geom columns as 
-- data are imported
CREATE OR REPLACE FUNCTION create_geom_and_address()
RETURNS trigger AS $$
BEGIN
  -- The geometry
  -- Set it based on the x_coord and y_coord fields
  NEW.geom = ST_SetSRID(ST_MakePoint(NEW.x_coordinate, NEW.y_coordinate), 27700);
  -- The full address
  -- Initialise it
  NEW.full_address = '';
  -- Build the full address by only including optional address components if they 
  -- exist
  IF NEW.organisation_name IS NOT NULL AND length(NEW.organisation_name) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.organisation_name || ', ';
  END IF;
  IF NEW.department_name IS NOT NULL AND length(NEW.department_name) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.department_name || ', ';
  END IF;
  IF NEW.po_box_number IS NOT NULL AND length(NEW.po_box_number) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.po_box_number || ', ';
  END IF;
  IF NEW.sub_building_name IS NOT NULL AND length(NEW.sub_building_name) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.sub_building_name || ', ';
  END IF;
  IF NEW.building_name IS NOT NULL AND length(NEW.building_name) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.building_name || ', ';
  END IF;
  IF NEW.building_number IS NOT NULL THEN
  NEW.full_address = NEW.full_address || NEW.building_number || ', ';
  END IF;
  IF NEW.dependent_thoroughfare IS NOT NULL AND length(NEW.dependent_thoroughfare) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.dependent_thoroughfare || ', ';
  END IF;
  IF NEW.thoroughfare IS NOT NULL AND length(NEW.thoroughfare) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.thoroughfare || ', ';
  END IF;

  NEW.full_address = NEW.full_address || NEW.post_town || ', ';

  IF NEW.double_dependent_locality IS NOT NULL AND length(NEW.double_dependent_locality) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.double_dependent_locality || ', ';
  END IF;
  IF NEW.dependent_locality IS NOT NULL AND length(NEW.dependent_locality) > 0 THEN
  NEW.full_address = NEW.full_address || NEW.dependent_locality || ', ';
  END IF;

  NEW.full_address = NEW.full_address || NEW.postcode;

  RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

-- Drop any existing addressbase table
DROP TABLE IF EXISTS os_address.addressbase CASCADE;
CREATE TABLE os_address.addressbase
(
  -- id will be the primary key, populated automatically
  id serial NOT NULL,
  uprn bigint NOT NULL,
  os_address_toid varchar(24) NOT NULL,
  -- os_address_toid bigint NOT NULL,
  udprn integer NOT NULL,
  organisation_name varchar(60),
  department_name varchar(60),
  po_box_number varchar(6),
  sub_building_name varchar(30),
  building_name varchar(50),
  building_number smallint,
  dependent_thoroughfare varchar(80),
  thoroughfare varchar(80),
  post_town varchar(30) NOT NULL,
  double_dependent_locality varchar(35),
  dependent_locality varchar(35),
  postcode varchar(8) NOT NULL,
  postcode_type char(1) NOT NULL,
  x_coordinate numeric(8,2) NOT NULL,
  y_coordinate numeric(9,2) NOT NULL,
  latitude numeric(9,7) NOT NULL,
  longitude numeric(8,7) NOT NULL,
  rpc char(1) NOT NULL,
  country char(1) NOT NULL,
  change_type char(1) NOT NULL,
  la_start_date date NOT NULL,
  rm_start_date date NOT NULL,
  last_update_date date NOT NULL,
  class char(1) NOT NULL,
  -- the next two fields are populated automatically on insert
  full_address text NOT NULL,
  geom geometry(Point,27700) NOT NULL,
  CONSTRAINT addressbase_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);

-- Create a pg_trgm index on the full_address column
-- This will allow super-fast, case-insensitive search on the column
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX addressbase_full_address_gin_trgm
  ON os_address.addressbase
  USING gin
  ("full_address" gin_trgm_ops);

-- Spatial index for the geometry column 
CREATE INDEX addressbase_geom_gist
  ON os_address.addressbase
  USING gist
  (geom);

-- trigger to create points and addresses
-- This trigger will be executed on each row inserted, calling the function defined above
CREATE TRIGGER tr_create_geom_and_address BEFORE INSERT
  ON os_address.addressbase
  FOR EACH ROW
  EXECUTE PROCEDURE create_geom_and_address();

The script above has:

  • Created a table
  • Added any necessary indices
  • Created two additional, derived columns, full_address and geom

full_address will be used to store various address components into a sensible, human readable address. geom will be used to store point geometry based on address eastings/northings.

See the script comments for more information / detail.

Loading AddressBase

At this point we have an empty table ready to accept our AddressBase data. We will now import the data using pgAdminIII. Extract the CSV files for the addresses, you should end up seeing one or more CSV files, for example AddressBase_FULL_2016-03-19_001.csv

In pgAdminIII:

  1. Locate the addressbase table
  2. Right click it, select Import

An import dialog should appear. Select the first CSV file and set the settings in the File Options tab as shown here:

AddressBase Import Options 1

Uncheck the id, full_address and geom columns in the Columns tab as shown here:

AddressBase Import Options 2

Click Import. After a few seconds the dialog may report (Not Responding). This is nothing to worry about, be patient.

When the import process completes, close the import dialog and repeat the above steps with any remaining CSV files.

At this stage the data has been imported and the full_address field should contain sensible, human-readable addresses.

Configuring Discovery

With the data loaded in QGIS, we can now configure Discovery to make use of it.

  1. Install the Discovery plugin if not already installed
  2. Open Discovery’s settings using the button
  3. Set the settings as follows, changing the Scale Expression if required

Discovery Settings for OS AddressBase

Congratulations! QGIS should now be set up to search your AddressBase data.