{"id":4446,"date":"2023-06-15T14:50:14","date_gmt":"2023-06-15T21:50:14","guid":{"rendered":"https:\/\/parkingreform.org\/?p=4446"},"modified":"2023-06-26T07:44:29","modified_gmt":"2023-06-26T14:44:29","slug":"building-a-u-s-parking-minimums-database","status":"publish","type":"post","link":"https:\/\/parkingreform.org\/2023\/06\/15\/building-a-u-s-parking-minimums-database\/","title":{"rendered":"Building a U.S. Parking Minimums Database"},"content":{"rendered":"\n<p>Using data to inform and reform policies isn\u2019t new. However, in order to analyze data, you need to be able to access the data \u2014 even better if it\u2019s already processed and formatted. There\u2019s no unified database for parking minimums in the United States, so we were tasked to build one.&nbsp;<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">What are parking minimums?<\/h5>\n\n\n\n<p>Parking minimums are a set of parking requirements for new developments, ranging from residential to commercial buildings. Developers have to, at a minimum, build a certain number of parking spaces, which can be based on characteristics from square footage to number of employees.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=778%2C703&#038;ssl=1\" alt=\"\" class=\"wp-image-4479\" width=\"778\" height=\"703\" srcset=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=1024%2C927&amp;ssl=1 1024w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=300%2C272&amp;ssl=1 300w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=768%2C695&amp;ssl=1 768w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=2048%2C1854&amp;ssl=1 2048w, 
https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/la_parking_minimums_ex.jpg?resize=600%2C543&amp;ssl=1 600w\" sizes=\"(max-width: 778px) 100vw, 778px\" data-recalc-dims=\"1\" \/><figcaption class=\"wp-element-caption\">Los Angeles County: Required parking spaces<\/figcaption><\/figure>\n\n\n\n<p>Parking minimums are also known as parking requirements, parking ratios, or parking schedules. These tables are scattered across municipalities\u2019 codes of ordinances or unified development ordinances (UDOs).<\/p>\n\n\n\n<p>The main goal of our ten-week project was to build and populate a database holding parking minimum requirements from across the United States, using automation to speed up the data parsing.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Developed workflow<\/h5>\n\n\n\n<p>We mainly scraped the requirements from <a href=\"https:\/\/library.municode.com\/\">Municode<\/a>, a digital library of codes of ordinances from across the United States. A <a href=\"https:\/\/library.municode.com\/ca\/los_angeles_county\/codes\/code_of_ordinances?nodeId=TIT22PLZO_DIV6DEST_CH22.112PA_22.112.070REPASP\">parking section URL<\/a> is passed to our program for web scraping and data processing. 
The data entries are then inserted into the database.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=1024%2C576&#038;ssl=1\" alt=\"\" class=\"wp-image-4481\" srcset=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=1024%2C576&amp;ssl=1 1024w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=300%2C169&amp;ssl=1 300w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=768%2C432&amp;ssl=1 768w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=1536%2C864&amp;ssl=1 1536w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=600%2C338&amp;ssl=1 600w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?w=1920&amp;ssl=1 1920w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" data-recalc-dims=\"1\" \/><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">The database<\/h5>\n\n\n\n<p>The data lives in a PostgreSQL database hosted on Supabase. Each entry records the state, region, and use, which together map to the raw requirement.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"237\" src=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?resize=1024%2C237&#038;ssl=1\" alt=\"Figure showing an example of a data entry in the table. There are 4 columns: state, region, use, raw. One example row: NC, Burlington, Landfill, 2+1 per employee on largest shift. The figure notes that the columns are the primary key, which is how the database ensures no duplicates. 
\" class=\"wp-image-4451\" srcset=\"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?resize=1024%2C237&amp;ssl=1 1024w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?resize=300%2C69&amp;ssl=1 300w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?resize=768%2C178&amp;ssl=1 768w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?resize=600%2C139&amp;ssl=1 600w, https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/table-ratios.jpeg?w=1244&amp;ssl=1 1244w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" data-recalc-dims=\"1\" \/><figcaption class=\"wp-element-caption\">A note about the interaction with Supabase: Our pipeline interacts with the database through SQLAlchemy rather than Supabase\u2019s Python client. This keeps the door open to switching hosting providers in the future.<\/figcaption><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">Web-scraping parking requirements<\/h5>\n\n\n\n<p>To extract data from the parking tables efficiently, we developed a web scraper tailored to our purpose and to a website like Municode.<\/p>\n\n\n\n<p>We used Selenium WebDriver to obtain the HTML of a webpage. We then parse the HTML according to how the data is structured. In our case, most of the information is presented in tables and bullet points, so we have functions that extract data from both structures.<\/p>\n\n\n\n<p>We encountered one difficulty with our current table parsing function. When scraping tabular data, the function acquires all of the tables on a webpage, including the ones we don\u2019t need. So, we tried to write a function that would automatically select the tables needed. We attempted to resolve this problem with a natural language processing (NLP) machine learning approach. 
Each table is assessed by its column names and categorized as \u201cuseful\u201d or \u201cuseless.\u201d<\/p>\n\n\n\n<p>Ideally, the model would exceed 95% accuracy before we put it to use. However, our trained model reached only 80%, so we still had to double-check the results and manually select some tables from its predictions. The model didn\u2019t make adding data to the database any faster, so we ended up abandoning it.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Web crawling<\/h5>\n\n\n\n<p>For most of the project, we found the parking code URLs manually, which is tedious. There are almost 3,500 municipalities in the United States. At a generous estimate of five minutes to find each parking code on Municode, locating all of them would take about 290 hours (3,500 \u00d7 5 minutes = 17,500 minutes). We don\u2019t expect a crawler to find every single parking code, but if it found even one out of three, we\u2019d save roughly 100 hours of manual work.<\/p>\n\n\n\n<p>What if we could write a script that searches for keywords and finds the parking code for us? We experimented with Python\u2019s Scrapy library (which also has scraping functions) and focused on its \u201ccrawling\u201d ability to find the pages we need.<\/p>\n\n\n\n<p>However, Municode relies on dynamic elements, like a search bar and drop-down tabs, that load only once a browser has interpreted the page\u2019s code. 
We settled on scrapy-playwright, a library that integrates the Playwright browser automation tool with Scrapy.<\/p>\n\n\n\n<p>Our crawler is able to search for keywords in the search bar, wait for its results, and extract the first link.<\/p>\n\n\n\n<p>This feature isn\u2019t fully implemented: the spider can search for only one keyword and scrapes only the first link, which isn\u2019t always the one containing the parking code.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">What we would have done differently<\/h5>\n\n\n\n<p>We wanted to automate as much of the process as possible, from trying to parse every single format of parking minimums to using machine learning to find the right table. It\u2019s difficult to get computers to work 100% correctly, but comparatively easy to speed up a human\u2019s workflow. Rather than approaching this project as an end-to-end system with no humans involved, we realized our time would have been better spent building tools to make the manual work easier.<\/p>\n\n\n\n<p>We were also torn between inserting as many entries as possible and building a viable system to onboard other volunteers. As a result, we developed our personal scraping and parsing pipeline without fully considering how other people could use it.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">What comes next<\/h5>\n\n\n\n<p>There are a few more core and auxiliary features that can be implemented:<\/p>\n\n\n\n<ul>\n<li>Build an interface, like a website, for the general public to view the database and contribute to some of the manual steps of the workflow<\/li>\n\n\n\n<li>Parse the raw requirements (text) into numbers, so that the database is useful for data analysis (like aggregations and comparisons)<\/li>\n\n\n\n<li>Standardize data entries (e.g., 
\u201claundromat\u201d and \u201claundry and cleaning service\u201d can be processed as one category, like \u201claundry service\u201d)&nbsp;<\/li>\n\n\n\n<li>Combine our crawler and scraper into one unit<\/li>\n<\/ul>\n\n\n\n<p>Our codebase can be publicly accessed on <a href=\"https:\/\/github.com\/ParkingReformNetwork\/parking-requirement-database\">GitHub<\/a>. Contributors are always welcome! If you\u2019re interested in contributing in a non-coding fashion, please open a GitHub issue at <a href=\"https:\/\/github.com\/ParkingReformNetwork\/parking-requirement-database\">https:\/\/github.com\/ParkingReformNetwork\/parking-requirement-database<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Using data to inform and reform policies isn\u2019t new. However, in order to analyze data, you need to be able to access the data \u2014 even better if it\u2019s already processed and formatted. There\u2019s no unified database for parking minimums in the United States, so we were tasked to build one using automation to speed up the data 
parsing.<\/p>\n","protected":false},"author":6090,"featured_media":4481,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","ub_ctt_via":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"mc4wp_mailchimp_campaign":[],"footnotes":""},"categories":[16,19,68],"tags":[146,147,150,148],"featured_image_src":"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1920%2C1080&ssl=1","author_info":{"display_name":"Tung 
Lin","author_link":"https:\/\/parkingreform.org\/author\/tunglinn\/?mab_v3=4446"},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1920%2C1080&ssl=1","uagb_featured_image_src":{"full":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1920%2C1080&ssl=1",1920,1080,false],"thumbnail":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=150%2C150&ssl=1",150,150,true],"medium":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=300%2C169&ssl=1",300,169,true],"medium_large":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=768%2C432&ssl=1",768,432,true],"large":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1024%2C576&ssl=1",1024,576,true],"1536x1536":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1536%2C864&ssl=1",1536,864,true],"2048x2048":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=1920%2C1080&ssl=1",1920,1080,true],"authorship-box-avatar":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=150%2C150&ssl=1",150,150,true],"authorship-box-related":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=70%2C70&ssl=1",70,70,true],"woocommerce_thumbnail":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=300%2C300&ssl=1",300,300,true],"woocommerce_single":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?fit=600%2C338&ssl=1",600,338,true],"woocommerce_gallery_thumbnail":["https:\/\/i0.wp.com\/parkingreform.org\/wp-content\/uploads\/2023\/06\/municode.jpg?resize=100%2C100&ssl=1",100,100,true]},"uagb_author_info":{"display_name":"Tung 
Lin","author_link":"https:\/\/parkingreform.org\/author\/tunglinn\/?mab_v3=4446"},"uagb_comment_info":0,"uagb_excerpt":"Using data to inform and reform policies isn\u2019t new. However, in order to analyze data, you need to be able to access the data \u2014 even better if it\u2019s already processed and formatted. There\u2019s no unified database for parking minimums in the United States, so we were tasked to build one using automation to speed&hellip;","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/posts\/4446"}],"collection":[{"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/users\/6090"}],"replies":[{"embeddable":true,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/comments?post=4446"}],"version-history":[{"count":16,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/posts\/4446\/revisions"}],"predecessor-version":[{"id":4484,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/posts\/4446\/revisions\/4484"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/media\/4481"}],"wp:attachment":[{"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/media?parent=4446"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/categories?post=4446"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/parkingreform.org\/wp-json\/wp\/v2\/tags?post=4446"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}