Philly Center City District Sips 2022: An Interactive Map
An interactive map showing restaurants participating in CCD Sips 2022 & a companion R tutorial on webscraping, geocoding, and map-making
By Silvia Canelón in R Tutorial
May 31, 2022
Philly’s Center City District posted a list of restaurants and bars participating in Philly’s 2022 CCD Sips. CCD Sips is a series of summer Wednesday evenings (4:30-7pm) filled with happy hour specials, between June 1st and August 31st.
I prefer to take in this information as a map instead of a list, so I scraped some information from the website and made one! You can click or tap on the circle map markers to see information about each restaurant/bar along with a direct link to their posted happy hour specials.
Check out the link at the top of this post for a larger version of the interactive map below. And jump down to the tutorial if you’d like to learn how I used R to build the interactive map!
Tutorial start
Aside from the tidyverse
and here
packages, I used a handful of R packages to bring this map project together.
Package | Purpose | Version |
---|---|---|
robotstxt |
Check website for scraping permissions | 0.7.13 |
rvest |
Scrape the information off of the website | 1.0.1 |
ggmap |
Geocode the restaurant addresses | 3.0.0 |
leaflet |
Build the interactive map | 2.0.4.1 |
leaflet.extras |
Add extra functionality to map | 1.0.0 |
Scraping the data
Checking site permissions
Check the site’s terms of service using the robotstxt package, which downloads and parses the site’s robots.txt file.
What I wanted to look for was whether any pages are not allowed to be crawled by bots/scrapers. In my case there weren’t any, indicated by Allow: /
.
get_robotstxt("https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view")
Output
[robots.txt]
--------------------------------------
# robots.txt overwrite by: on_suspect_content
User-agent: *
Allow: /
[events]
--------------------------------------
requested: https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view/robots.txt
downloaded: https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view/robots.txt
$on_not_found
$on_not_found$status_code
[1] 404
$on_file_type_mismatch
$on_file_type_mismatch$content_type
[1] "text/html; charset=utf-8"
$on_suspect_content
$on_suspect_content$parsable
[1] FALSE
$on_suspect_content$content_suspect
[1] TRUE
[attributes]
--------------------------------------
problems, cached, request, class
Harvesting data from the first page
Then I used the rvest package to scrape the information from the tables of restaurants/bars participating in CCD Sips.
I’ve learned that ideally you would only scrape each page once, so I checked my approach with the first page before I wrote a function to scrape the remaining pages.
# define the page
url <- "https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1"
# read the page html
html1 <- read_html(url)
# extract table info
table1 <-
html1 |>
html_node("table") |>
html_table()
table1 |> head(3) |> kableExtra::kable()
Name | Address | Phone | CCD SIPS Specials |
---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | CCD SIPS Specials |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | CCD SIPS Specials |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | CCD SIPS Specials |
# extract hyperlinks to specific restaurant/bar specials
links <-
html1 |>
html_elements(".o-table__tag.ccd-text-link") |>
html_attr("href") |>
as_tibble()
links |> head(3) |> kableExtra::kable()
value |
---|
#1028-yamitsuki-sushi-ramen |
#1225-raw-sushi-and-sake-lounge |
#1518-bar-and-grill |
# add full hyperlinks to the table info
table1Mod <-
bind_cols(table1, links) |>
mutate(Specials = paste0(url, value)) |>
select(-c(`CCD SIPS Specials`, value))
table1Mod |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials |
---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1028-yamitsuki-sushi-ramen |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill |
Harvesting data from the remaining pages
Once I could confirm that the above approach harvested the information I needed, I adapted the code into a function that I could apply to pages 2-3 of the site.
getTables <- function(pageNumber) {
Sys.sleep(2)
url <- paste0("https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=", pageNumber)
html <- read_html(url)
table <-
html |>
html_node("table") |>
html_table()
links <-
html |>
html_elements(".o-table__tag.ccd-text-link") |>
html_attr("href") |>
as_tibble()
tableSpecials <<-
bind_cols(table, links) |>
mutate(Specials = paste0(url, value)) |>
select(-c(`CCD SIPS Specials`, value))
}
I used my getTable()
function and the purrr::map_df()
function to harvest the table of restaurants/bars from pages 2 and 3. Then I combined all the data frames together and saved the complete data frame as an .Rds
object so that I wouldn’t have to scrape the data again.
# get remaining tables
table2 <- map_df(2:3, getTables)
# combine all tables
table <- bind_rows(table1Mod, table2)
table |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials |
---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1028-yamitsuki-sushi-ramen |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill |
# save full table to file
write_rds(
table,
file = here("content/blog/2022-05-31-ccd-sips/specialsScraped.Rds")
)
Geocoding addresses
The next step was to use geocoding to convert the restaurant/bar addresses to geographical coordinates (longitude and latitude) that I could map. I used the ggmap package and the Google Geocoding API service because this was a small project (59 addresses/requests) which wouldn’t make a dent in the free credit available on the platform.
The last time I geocoded addresses was for an almost identical project in 2019 and I had issues using the same API key from back then, so I made a new one. I restricted my new key to the Geocoding and Geolocation APIs.
# register my API key
# ggmap::register_google(key = "[your key]")
# geocode addresses
specials_ggmap <-
table |>
mutate_geocode(Address)
# rename new variables
specials <-
specials_ggmap |>
rename(Longitude = lon,
Latitude = lat)
specials |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials | Longitude | Latitude |
---|---|---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1028-yamitsuki-sushi-ramen | -75.15746 | 39.95354 |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge | -75.16149 | 39.95004 |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill | -75.16665 | 39.95020 |
I made sure to save the new data frame with geographical coordinates as an .Rds
object so I wouldn’t have to geocode the data again! This would be particularly important if I was working on a large project.
# save table with geocoded addresses to file
write_rds(
specials,
file = here("content/blog/2022-05-31-ccd-sips/specialsGeocoded.Rds"))
Building the map
To build the map, I used the leaflet package. Some of the resources I found helpful, in addition to the package documentation:
- Scrape website data with the new R package rvest (+ a postscript on interacting with web pages with RSelenium) · Hollie at ZevRoss – how to style pop-ups
- Leaflet Map Markers in R · Jindra Lacko – how to customize marker icons
- A guide to basic Leaflet accessibility · Leaflet – accessibility considerations. Though it’s unclear to me how these features built into the Leaflet library translate over to the leaflet R package. For example, I couldn’t find an option for adding alt-text or a title to each marker, but maybe I wasn’t looking in the right place within the documentation.
Customizing map markers
# style pop-ups for the map with inline css styling
# marker for the restaurants/bars
popInfoCircles <- paste("<h2 style='font-family: Red Hat Text, sans-serif; font-size: 1.6em; color:#43464C;'>", "<a style='color: #00857A;' href=", specials$Specials, ">", specials$Name, "</a></h2>","<p style='font-family: Red Hat Text, sans-serif; font-weight: normal; font-size: 1.5em; color:#9197A6;'>", specials$Address, "</p>")
# marker for the center of the map
popInfoMarker<-paste("<h1 style='padding-top: 0.5em; margin-top: 1em; margin-bottom: 0.5em; font-family: Red Hat Text, sans-serif; font-size: 1.8em; color:#43464C;'>", "<a style='color: #00857A;' href='https://centercityphila.org/explore-center-city/ccdsips'>", "Center City District Sips 2022", "</a></h1><p style='color:#9197A6; font-family: Red Hat Text, sans-serif; font-size: 1.5em; padding-bottom: 1em;'>", "Philadelphia, PA", "</p>")
# custom icon for the center of the map
awesome <-
makeAwesomeIcon(
icon = "map-pin",
iconColor = "#FFFFFF",
markerColor = "darkblue",
library = "fa"
)
Plotting the restaurants/bars
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
))
Adding the map background
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron)
Setting the map view
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16)
Adding a marker at the center
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16) |>
# add marker at the center ----
addAwesomeMarkers(
icon = awesome,
lng = mean(specials$Longitude),
lat = mean(specials$Latitude),
label = "Center City District Sips 2022",
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
),
popup = popInfoMarker,
popupOptions = popupOptions(maxWidth = 250))
Adding fullscreen control
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16) |>
# add marker at the center ----
addAwesomeMarkers(
icon = awesome,
lng = mean(specials$Longitude),
lat = mean(specials$Latitude),
label = "Center City District Sips 2022",
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
),
popup = popInfoMarker,
popupOptions = popupOptions(maxWidth = 250)) |>
# add fullscreen control button ----
leaflet.extras::addFullscreenControl()
Creating the map with Quarto
The first time around, I created a standalone map by first running an R script with the necessary code, and then exporting the HTML output as a webpage. This worked well enough, except that I realized:
- The title of the map webpage (the name that is displayed on a browser tab) was just “map” because the name of the HTML file was
map.html
. I wanted something more descriptive. - The map wasn’t mobile-responsive. In other words, the map markers and text looked too small when viewed on a mobile device.
Changing the webpage title
The webpage title was a quick one to fix thanks to a Stack Overflow response to a
question about turning off the title in an R Markdown document. The pagetitle
YAML option lets you set the HTML’s title tag independently of the document title:
pagetitle: "Philly CCD Sips 2022 Map"
Fixing the mobile-responsiveness
The mobile-responsiveness issue could be solved by adding metadata to the map HTML, but I would need to be able to blend HTML with R code. I have been practicing using
Quarto and figured I could make a standalone map from a Quarto document (.qmd
) rather than an R Markdown one (.Rmd
or .Rmarkdown
). You can find the map’s Quarto document
alongside this blog post.
According to the Leaflet library documentation and this Stack Overflow answer, fixing the map to be mobile-responsive required adding the following metadata to the HTML code:
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
I used the
metathis R package to add this metadata to an R code chunk in my Quarto document using the meta_viewport()
function:
# make mobile-responsive
meta_viewport(
width = "device-width",
initial_scale = "1.0",
maximum_scale = "1.0",
user_scalable = "no"
)
Update: In the process of updating this post I’m noticing that specifying the viewport metadata tag doesn’t seem to be necessary anymore, and I don’t understand why 🤔 …so I’ll leave the step as is, just in case it’s helpful to anyone 🤷🏽♀️
Adding social media tags
Then I added more metadata. I was particularly interested in adding social media tags so that if I (or anyone else) shared this map webpage, an informative preview would display as a social card.
I used the meta_social()
function to add these tags:
# tags for social media
meta_social(
title = "Philly CCD Sips 2022 Interactive Map",
url = "https://www.silviacanelon.com/blog/2022-ccd-sips/map.html",
image = "https://github.com/spcanelon/silvia/blob/main/content/blog/2022-05-31-ccd-sips/featured.png?raw=true",
image_alt = "Map of Philly's Center City with a pop-up saying Center City District Sips 2022",
og_type = "website",
og_author = "Silvia Canelón",
twitter_card_type = "summary_large_image",
twitter_creator = "@spcanelon"
)
Great, I had added all of the metadata I was interested in! Except that because I was using Quarto, and not one of the more common outputs I had a couple of extra steps to take:
-
Write my metadata tags to an HTML file, using the
write_meta()
function:# write meta tags to file write_meta(path = "meta-map.html")
-
Manually include this HTML in my webpage via the Quarto file. The
include-in-header
Quarto YAML option helped me here:include-in-header: meta-map.html
Making the map fullscreen
A side effect of creating the map from a Quarto (or R Markdown) document is that the output is styled by default to fit within the width of an article (in this case 900 pixels). I wanted the map to take up the whole width of the page, so I made use of the
page-layout
Quarto YAML option:
format:
html:
page-layout: custom
Another option that worked pretty well was to use the column: screen
code chunk option built into Quarto. The Quarto documentation even shows an
example to display a Leaflet map I but it left a thin margin at the top margin, and I wanted the map to be flush against the top edge of the webpage.
Rendering the standalone map
Lastly, I added one more option to the YAML that would render the Quarto document into a self-contained HTML with all of the content needed to create the map.
format:
html:
page-layout: custom
self-contained: true