PostGIS and the Hard Rock Cafe collection

2024/05/27

In 2007, I went to the west coast of the United States where I visited a Hard Rock Cafe store in Hollywood CA and bought the first T-Shirt of my collection. Now, I’m an open-source DBA and my number one database is PostgreSQL. So I’ve decided to use the PostGIS extension to plan my next vacations or conferences to buy more T-Shirts.

The collection

Inventory

Let’s do the inventory of my collected T-Shirts by looking at my closet.

The titles represent a location, generally a city but it could be a known place like a stadium. This will be enough to find the related shop in a database. Titles can be written into a file:

Dublin
Los Angeles
Prague
London

(collection.csv)

Coordinates

The next step is to add coordinates. This information can be found by querying the Nominatim API based on the OpenStreetMap community-driven project.

#!/usr/bin/env python3
import requests
import csv
import time


if __name__ == "__main__":
    headers = {"User-Agent": "Hard Rock Cafe Blog Post From Julien Riou"}
    session = requests.Session()
    session.headers.update(headers)

    with open("collection_with_coordinates.csv", "w") as dest:
        writer = csv.writer(dest)
        with open("collection.csv", "r") as source:
            for row in csv.reader(source):
                name = row[0]
                r = session.get(f"https://nominatim.openstreetmap.org/search?q={name}&limit=1&format=json")
                time.sleep(1)
                r.raise_for_status()
                data = r.json()
                if len(data) == 1:
                    data = data[0]
                    writer.writerow([name, data["lon"], data["lat"]])
                else:
                    print(f"Location not found for {title}, skipping")

(collection.py)

The Python script iterates over the collection.csv file to query the Nominatim API to find the most relevent OpenStreetMap node then writes coordinates in the collection_with_coordinates.csv file using the coma-separated values (CSV) format.

Dublin,53.3493795,-6.2605593
Los Angeles,34.0536909,-118.242766
Prague,50.0596288,14.446459273258009
London,51.4893335,-0.14405508452768728

(collection_with_coordinates.csv)

The shops (or “nodes”)

Now we need a complete list of Hard Rock Cafe locations with their coordinates to match the collection.

OpenStreetMap

My first idea was to use OpenStreetMap that should provide the needed dataset. I tried to use the Nominatim API but the queries are limited to 40 results. I could download the entire dataset, import it in a PostgreSQL instance locally but it would have been space and time consuming. So, I used the Overpass API with this query (thanks Andreas). In the end, the quality of the data was not satisfying. The amenity was restaurant, bar, pub, cafe, nightclub or shop. The name had an accent (“é”) or not (“e”). Sometimes, the brand was not reported. Even with all those filters, there was a node that was not even a Hard Rock Cafe. The more the query grew, the more I wanted to use another method.

Website

I decided to parse the official website. By using a well-known library like Selenium or ferret? Given the personal time I had for this project, I’ve chosen the quick and dirty path. Let me present you the ugly but functional one-liner to parse the official Hard Rock Cafe website:

curl -sL https://cafe.hardrock.com/locations.aspx | \
    grep 'var currentMapPoint=' | \
    sed "s/.*{'title':/{'title':/g;s/,'description.*/}/g;s/'/\"/g" | \
    sed 's/{"title"://g;s/"lat":/"/g;s/,"lng":/","/g;s/}/"/g' \
    > nodes.csv

Very ugly, not future-proof, but it did the job.

"Hard Rock Cafe Amsterdam","52.36211","4.88298"
"Hard Rock Cafe Andorra","42.507707","1.531977"
"Hard Rock Cafe Angkor","13.35314","103.85676"
"Hard Rock Cafe Asuncion","-25.2896910","-57.5737599"

(nodes.csv)

Data exploration

The tool of choice to import and analyze this data is PostgreSQL and its PostGIS extension. I’ve used Docker to have a disposable local instance to perform quick analysis.

docker run -d --name postgres -v "$(pwd):/mnt:ro" \
    -e POSTGRES_USER=hrc -e POSTGRES_PASSWORD=insecurepassword \
    -e POSTGRES_DB=hrc \
    postgis/postgis:16-3.4
docker exec -ti postgres psql -U hrc -W

The instance is now started and we are connected.

Import

The COPY command on PostgreSQL can import CSV lines easily. We’ll use the psql alias (\copy) to send data directly through the client.

create table collection (
    name text primary key,
    lat numeric,
    lon numeric
);
\copy collection (name, lat, lon) from '/mnt/collection_with_coordinates.csv' csv;

create table nodes (
    name text primary key,
    lat numeric,
    lon numeric
);
\copy nodes (name, lat, lon) from '/mnt/nodes.csv' delimiter ',' csv;

Correlation

The SQL query takes all the rows from the collection table and try to find a row in the nodes table within 50 km based on coordinates.

 select c.name as tshirt, n.name as restaurant,
        round((ST_Distance(ST_Point(c.lon, c.lat), ST_Point(n.lon, n.lat), true)/1000)::numeric, 2)
            as distance_km
   from collection c
        left join nodes n
            on ST_DWithin(ST_Point(c.lon, c.lat), ST_Point(n.lon, n.lat), 50000, true)
order by c.name, distance_km;

The PostGIS functions used are:

Result:

      tshirt      |                 restaurant                 | distance_km 
------------------+--------------------------------------------+-------------
 Amsterdam        | Hard Rock Cafe Amsterdam                   |        1.38
 Angkor           | Hard Rock Cafe Phnom Penh                  |        0.75
 Antwerp          | Hard Rock Cafe Brussels                    |       41.84
 Barcelona        | Hard Rock Cafe Barcelona                   |        0.64
 Berlin           | Hard Rock Cafe Berlin                      |        4.32
 Boston           | (null)                                     |      (null)
 Brussels         | Hard Rock Cafe Brussels                    |        0.10
 Detroit          | (null)                                     |      (null)
 Dublin           | Hard Rock Cafe Dublin                      |        0.40
 Hamburg          | Hard Rock Cafe Hamburg                     |        2.36
 Ho Chi Minh City | (null)                                     |      (null)
 Hollywood        | Hard Rock Cafe Hollywood on Hollywood Blvd |        1.07
 Lisbon           | Hard Rock Cafe Lisbon                      |        1.07
 London           | Hard Rock Cafe London                      |        1.66
 London           | Hard Rock Cafe London Piccadilly Circus    |        2.40
 Los Angeles      | Hard Rock Cafe Hollywood on Hollywood Blvd |       10.46
 Miami            | Hard Rock Cafe Miami                       |        0.97
 Miami            | Hard Rock Cafe Hollywood FL                |       30.83
 Miami Gardens    | Hard Rock Cafe Hollywood FL                |       12.62
 Miami Gardens    | Hard Rock Cafe Miami                       |       19.26
 New York         | Hard Rock Cafe New York Times Square       |        5.18
 New York         | Hard Rock Cafe Yankee Stadium              |       14.53
 Orlando          | Hard Rock Cafe Orlando                     |       11.50
 Oslo             | (null)                                     |      (null)
 Paris            | Hard Rock Cafe Paris                       |        2.14
 Prague           | Hard Rock Cafe Prague                      |        3.59
 San Francisco    | Hard Rock Cafe San Francisco               |        3.36
 Singapore        | Hard Rock Cafe Singapore                   |        5.74
 Singapore        | Hard Rock Cafe Changi Airport Singapore    |       18.88
 Singapore        | Hard Rock Cafe Puteri Harbour              |       19.33
 Yankee Stadium   | Hard Rock Cafe Yankee Stadium              |        0.14
 Yankee Stadium   | Hard Rock Cafe New York Times Square       |        9.53

We can identify multiple patterns here:

I’ve created a script to update the names in the collection table to match the names in the nodes tables to join them by names instead of the location.

Next locations

The last step of the exploration is to find Hard Rock Cafe locations within a reasonable distance from home (1000 Km). As I don’t want to disclose the exact position, we’ll search for “Belgium”. The country is quite small so that should not be an issue.

$ curl -A "Hard Rock Cafe Blog Post From Julien Riou" \
    -sL "https://nominatim.openstreetmap.org/search?limit=1&format=json&q=Belgium" | \
    jq -r '.[0].name,.[0].lon,.[0].lat'
België / Belgique / Belgien
4.6667145
50.6402809

The query to search for shops that I don’t have already visited looks like this:

  select n.name,
         round((ST_Distance(ST_Point(n.lon, n.lat), ST_Point(4.6667145, 50.6402809), true)/1000)::numeric, 2)
           as distance_km
    from nodes n
        left join collection c
            on n.name = c.name
   where c.name is null
     and ST_Distance(ST_Point(n.lon, n.lat), ST_Point(4.6667145, 50.6402809), true)/1000 < 1000
order by distance_km;

The final result:

                  name                   | distance_km 
-----------------------------------------+-------------
 Hard Rock Cafe Cologne                  |      164.87
 Hard Rock Cafe London Piccadilly Circus |      349.99
 Hard Rock Cafe Manchester               |      569.37
 Hard Rock Cafe Munich                   |      573.42
 Hard Rock Cafe Innsbruck                |      618.90
 Hard Rock Cafe Newcastle                |      640.64
 Hard Rock Cafe Milan                    |      666.41
 Hard Rock Cafe Copenhagen               |      769.51
 Hard Rock Cafe Edinburgh                |      789.31
 Hard Rock Cafe Venice                   |      812.96
 Hard Rock Cafe Wroclaw                  |      870.89
 Hard Rock Cafe Vienna                   |      890.23
 Hard Rock Cafe Florence                 |      911.37
 Hard Rock Cafe Gothenburg               |      918.32
 Hard Rock Cafe Andorra                  |      935.20

Conclusion

According to this study, my next vacations or conferences should take place in Germany or UK. A perfect opportunity to go to PGConf.DE and PGDay UK!