Bakken Spill Mapr
It’s a website I made that scrapes spill data from the North Dakota Department of Health and presents it as an interactive map. If you want to check it out before reading the rest of this post, feel free; just click below:
Now how did I come up with this project? I lived in North Dakota for three years, from 2013 to 2016, working as an environmental scientist. During that time North Dakota saw a huge boom in oil production, and environmental scientists were hard at work cleaning up after the spills that came with it. In 2015, the North Dakota Department of Health decided (or was forced to, by that pesky FOIA) to make its environmental incident reports public. They posted the spill reports in list format, as shown below:
Now this is fine. Plenty of data there, the job’s done, right? Well, in this day and age people don’t like looking at spreadsheets; it’s all about the user experience. Luckily, since each record is a spill with a latitude and longitude, an interactive map was the perfect way to get more people to view and engage with this public information: something where people could actually see where each spill occurred, not just a spreadsheet.
There were two main problems to solve here: 1) I needed to write code that would scrape this data from the NDDH website and save it directly into my app’s database, and 2) I needed a mapping API to plot the data. Of course, plenty of other challenges came along with building this app, but these were the main ones.
The code for scraping the data went in the seed file of my Rails app. Several gems were required, including selenium-webdriver, phantomjs, open-uri, and nokogiri. From there it’s a matter of picking through the HTML of the page you are scraping and pulling out the data you need. The code ends up looking like this:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'selenium-webdriver'

# Clear existing records so re-running the seed does not duplicate spills.
Spill.delete_all

# Parse the results table on the current page and save one Spill per data row.
def get_data(browser)
  doc = Nokogiri::HTML(browser.page_source)
  keys = doc.css("table#GridView1 th").map { |item| item.text }
  data = []
  doc.css("table#GridView1 tr").each do |row|
    row_data = {}
    row.css("td").each_with_index do |item, index|
      row_data[keys[index]] = item.text.strip
    end
    # Rows without <td> cells (such as the header row) yield an empty hash.
    if row_data != {}
      s = Spill.new
      s.incident_url = row_data["Incident ID"]
      # Swap the first two fields of each slash-separated date before saving.
      raw_date_reported = row_data["Date Reported"].split('/')
      s.date_reported = "#{raw_date_reported[1]}/#{raw_date_reported[0]}/#{raw_date_reported[2]}"
      raw_date_incident = row_data["Date Incident6"].split('/')
      s.date_incident = "#{raw_date_incident[1]}/#{raw_date_incident[0]}/#{raw_date_incident[2]}"
      s.county = row_data["County"]
      s.latitude = row_data["Latitude"]
      s.longitude = row_data["Longitude"]
      s.contaminant = row_data["Contaminant"]
      s.volume = row_data["Volume"]
      s.units = row_data["Units"]
      s.contained = row_data["Contained"]
      s.save
    end
  end
  data
end

# First pass: the archived spill reports.
url = 'https://deq.nd.gov/FOIA/Spills/defaultarc.aspx/'

puts "Creating phantom browser..."
browser = Selenium::WebDriver.for :phantomjs
puts "Opening url..."
browser.get url

puts "Reading data page 1"
data = get_data(browser)
230.times do |index|
  puts "Reading data page #{index + 2}"
  browser.find_element(css: 'input[value="Next"]').click
  data += get_data(browser)
end

# Shut down the first browser before starting a second one.
browser.quit

# Second pass: the current spill reports.
url = 'https://deq.nd.gov/FOIA/Spills/default.aspx/'

puts "Creating phantom browser..."
browser = Selenium::WebDriver.for :phantomjs
puts "Opening url..."
browser.get url

puts "Reading data page 1"
data = get_data(browser)
19.times do |index|
  puts "Reading data page #{index + 2}"
  browser.find_element(css: 'input[value="Next"]').click
  data += get_data(browser)
end

browser.close

puts "finished!"
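The seed file reorders the date fields by splitting on the slash and stitching the pieces back together as a string. A slightly more defensive alternative is to parse the text into a real Date. The sketch below is not from the original app: `parse_us_date` is a hypothetical helper, and it assumes the site reports dates as month/day/year, which the swap in the seed file suggests but which I have not verified against the live page.

```ruby
require 'date'

# Hypothetical helper (not part of the original seed file): parse a
# month/day/year string into a Date, returning nil for blank or
# malformed input instead of raising.
def parse_us_date(text)
  return nil if text.nil? || text.strip.empty?
  Date.strptime(text.strip, '%m/%d/%Y')
rescue ArgumentError
  # Date.strptime raises ArgumentError (or a subclass) on input it cannot parse.
  nil
end

puts parse_us_date('7/4/2015')        # prints 2015-07-04
puts parse_us_date('').inspect        # prints nil
```

With a helper like this, a row with a missing date would be saved with a nil date instead of stopping the whole seed run.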
The code for building the map uses the Google Maps API. The Google Maps API runs on JavaScript, so my data, stored in my Rails database, needed to be served from a JSON API built into the app. The API controller ended up looking like this (note there is a lot of extra code in here pertaining to querying data, but the basic code is there):
class Api::V1::SearchController < ApplicationController
  def index
    # Start with every spill and narrow the set with each filter that is present.
    rows = Spill.all
    if !params[:contaminant].nil?
      rows = rows.where(contaminant: params[:contaminant])
    end
    if !params[:county].nil?
      rows = rows.where(county: params[:county])
    end
    if !params[:start_date].nil? && !params[:end_date].nil?
      start_date = DateTime.new(params[:start_date].to_i)
      end_date = DateTime.new(params[:end_date].to_i)
      rows = rows.where("date_incident > ? AND date_incident < ?", start_date, end_date)
    end
    @spills = rows
    render 'spills.json.jbuilder'
  end
end
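One detail in the date filter is easy to miss: `DateTime.new` with a single integer argument treats that integer as a year. The sketch below shows what the two boundaries become, assuming the query string carries year values such as `start_date=2015` (the sample values are mine, not from the app).

```ruby
require 'date'

# Query-string values arrive as strings, e.g. ?start_date=2015&end_date=2016
start_date = DateTime.new('2015'.to_i)
end_date   = DateTime.new('2016'.to_i)

puts start_date.iso8601  # prints 2015-01-01T00:00:00+00:00
puts end_date.iso8601    # prints 2016-01-01T00:00:00+00:00

# String#to_i stops at the first non-digit, so a full date string keeps only its year.
puts DateTime.new('2015-06-01'.to_i).year  # prints 2015

# A non-numeric value converts to 0, which yields year 0 rather than an error.
puts DateTime.new('abc'.to_i).year         # prints 0
```

So a request for 2015 through 2016 filters between midnight on January 1 of each year, and bad input fails quietly instead of raising.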
And the API view looks like this:
json.array! @spills do |spill|
  json.partial! 'spill.json.jbuilder', spill: spill
end
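The partial itself is not shown in this post, so here is a rough sketch of the shape of JSON the map code expects. It uses a plain Struct as a stand-in for the Spill model, the field names are the ones the map JavaScript reads, and the values are made-up placeholders (the coordinates are just the map's center point), not a real incident.

```ruby
require 'json'

# Stand-in for the Spill model (hypothetical); field names mirror what the map code reads.
SpillRow = Struct.new(:id, :date_incident, :contaminant, :volume, :units,
                      :latitude, :longitude, keyword_init: true)

# Placeholder values for illustration only, not a real spill record.
spills = [
  SpillRow.new(id: 1, date_incident: '2015-07-04', contaminant: 'Oil',
               volume: '10', units: 'barrels',
               latitude: '47.361893', longitude: '-100.465128')
]

# An array of flat objects, one per spill, like json.array! would produce.
payload = JSON.generate(spills.map(&:to_h))
puts payload
```

The real partial may expose more or fewer fields; the point is that the endpoint returns a flat JSON array the browser can loop over.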
Then the Google Maps JavaScript code ended up looking like this:
<body>
  <h3>My Google Maps Demo</h3>
  <!--The div element for the map -->
  <div id="map"></div>
  <script>
    var markers = [];
    var map;
    var markerCluster;
    var useMarkerCluser = true;

    function initMap() {
      map = new google.maps.Map(document.getElementById('map'), {
        zoom: 7,
        mapTypeControl: true,
        center: new google.maps.LatLng(47.361893, -100.465128),
        mapTypeControlOptions: {
          style: google.maps.MapTypeControlStyle.DROPDOWN_MENU,
          mapTypeIds: ['hybrid', 'roadmap', 'terrain']
        }
      });

      $.getJSON('/json/all', createMarkers);
      var bakkenLayer = new google.maps.KmlLayer({
        url: 'https://sites.google.com/site/jskordkmlfiles/kml_files/Bakken.kml?revision=2',
        preserveViewport: true,
        map: map
      });
    }

    function createMarkers(data) {
      data.forEach(function(spill) {
        var type = spill.contaminant;
        var volume = spill.volume;
        var date = spill.date_incident;
        var units = spill.units;
        var id = spill.id.toString();
        var str = "View Spill Page";
        var infowindow = new google.maps.InfoWindow({
          content: "Date:" + " " + date + " Contaminant: " + type + " " + "Volume:" + volume + units + " " + str.link("/spills/" + id)
        });
        var marker = new google.maps.Marker({
          position: new google.maps.LatLng(spill.latitude, spill.longitude),
          map: map
        });
        marker.addListener('click', function() {
          infowindow.open(map, marker);
        });

        markers.push(marker);
      });

      if (useMarkerCluser) {
        markerCluster = new MarkerClusterer(map, markers, {
          imagePath: 'https://developers.google.com/maps/documentation/javascript/examples/markerclusterer/m'
        });
      }
    }
  </script>
  <!--Load the API from the specified URL
  * The async attribute allows the browser to render the page while the API loads
  * The key parameter will contain your own API key (which is not needed for this tutorial)
  * The callback parameter executes the initMap() function
  -->
  <script async defer
  src="https://maps.googleapis.com/maps/api/js?&callback=initMap">
  </script>
  <%= @spills[0].latitude %>
  <%= @total_spills %>
</body>
These were the major building blocks for Bakken SpillMapr. Take a look at the final product and see if it’s not an improvement on the State’s data format.