GeoJSON and KML data for the United States
I had a devil of a time finding simple GeoJSON and KML boundary files for US counties and states. Eventually I realized that I could get shapefiles from the United States Census Cartographic Boundary Files and convert them to GeoJSON and KML formats using the MyGeoData vector converter.
The result is the following set of boundary files. Since copyright protection is not available for any work of the United States Government, these files should all be free to use for any purpose. The Census Bureau does request to be cited as the source.
These files are available in various resolutions and are all derived from the 2010 census. The 500k files are the most detailed, but also the largest. The 20m files are the smallest, but at the cost of some dramatic simplification. The 5m files fall somewhere between the other two.
Features | 500k | 5m | 20m |
---|---|---|---|
US Outline | SHP, KML, GeoJSON | SHP, KML, GeoJSON | SHP, KML, GeoJSON |
US States | SHP, KML, GeoJSON | SHP, KML, GeoJSON | SHP, KML, GeoJSON |
US Counties | SHP, KML, GeoJSON | SHP, KML, GeoJSON | SHP, KML, GeoJSON |
US Congressional (see note) | SHP, KML, GeoJSON | SHP, KML, GeoJSON |
If the files you need are not here, don’t be afraid of going to the source and converting them yourself!
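If you do convert them yourself, here is a minimal sketch of doing the conversion locally with geopandas instead of the MyGeoData web converter. The input filename is only an example of a Census cartographic boundary shapefile, and the KML driver toggle may not be needed on every fiona build.

```python
# Minimal sketch: convert a Census cartographic boundary shapefile to GeoJSON
# and KML locally. "cb_2018_us_county_20m.shp" is just an example filename.
import fiona
import geopandas as gpd

# KML output is not always enabled in fiona's default driver list; this is a
# common workaround, though it may vary by fiona/GDAL version.
fiona.supported_drivers["KML"] = "rw"

counties = gpd.read_file("cb_2018_us_county_20m.shp")
counties.to_file("us_counties_20m.geojson", driver="GeoJSON")
counties.to_file("us_counties_20m.kml", driver="KML")
```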
You can also look at this example of how to use the files.
I received the following comment from Mindy McAdams in April 2022:
I discovered today that the newest map files (2019 and 2020) are at this URL:
https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html
I received the following comment from Charles Wright in August 2021:
Apparently there are some new places in Alaska and DC that have been given FIPS codes but for which the shapefiles are either non-existent or just plain hard to find. This wouldn’t be the first weird thing I’ve encountered while digging into the data provided by NYT.
Here are a few examples:
- 11007 Ward 7, Washington, DC, United States
- 11008 Ward 8, Washington, DC, United States
- 11005 Ward 5, Washington, DC, United States
- 11004 Ward 4, Washington, DC, United States
- 24510 Homeless Encampment, Baltimore-Washington Parkway, Westport, Baltimore, Maryland, 21240, United States
I wasn’t aware that locations designated by FIPS could materialize from one day to the next like they seem to have done, but this is my first such project so WDIK?
Anyway, if I can’t find the shapes data for these I suppose I will have to aggregate them under a general FIPS for which I do have data. But if you have any ideas on where to obtain the data, I’d appreciate hearing about it.
If anyone has any insights to share with Charles, please let me know and I will pass them along.
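In the meantime, the workaround Charles suggests would look something like the sketch below: roll any code that has no boundary file up to a parent FIPS that does. The mapping here is purely illustrative, not an authoritative crosswalk.

```python
# Illustrative only: aggregate codes with no shapefile under a parent FIPS that
# the boundary files do cover (DC wards -> 11001, the District of Columbia).
FALLBACK_FIPS = {
    "11004": "11001",
    "11005": "11001",
    "11007": "11001",
    "11008": "11001",
}

def normalize_fips(code: str) -> str:
    """Return a FIPS code that the boundary files cover, when we know one."""
    return FALLBACK_FIPS.get(code, code)
```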
I received the following warning from Carole MacDonald in October 2020:
The datasets are slightly out of date. Wade Hampton Census Area, AK is now Kusilvak Census Area (0500000US02270) and Shannon County, SD is now Oglala Lakota County (0500000US46102).
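A hedged sketch of patching those renames in data loaded from the 2010 files is below. The old/new code pairs and the attribute names (STATE, COUNTY, NAME) are my own assumptions about the 2010 schema, so double-check them against current Census documentation.

```python
# Hedged sketch: relabel the renamed county-equivalents in a GeoDataFrame read
# from the 2010 files. Codes, names, and column names are assumptions to verify.
import geopandas as gpd

RENAMES = {
    "02270": ("02158", "Kusilvak Census Area"),  # was Wade Hampton Census Area, AK
    "46113": ("46102", "Oglala Lakota County"),  # was Shannon County, SD
}

counties = gpd.read_file("gz_2010_us_050_00_20m.json")  # example filename
fips = counties["STATE"] + counties["COUNTY"]           # assumes 2010 attributes
for old_code, (new_code, new_name) in RENAMES.items():
    counties.loc[fips == old_code, "NAME"] = new_name
```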
I received the following notebook from Ali Ebrahim in April 2020:
Thanks for your helpful guide. Unfortunately, I wasn’t able to parse the counties file due to the encoding issue mentioned below, so I wrote a Colab notebook that also does the parsing.
I received an update of the link to the US Census Bureau files from Dan Raney in January 2020. The original link (https://www.census.gov/geo/maps-data/data/tiger-cart-boundary.html) now leads to a 404. Thanks, Dan!
I received the following note from Nick C in November 2019:
First of all, thank you! Thank you for providing the GeoJSON files for the US (states/counties). I am using them for my course project and they have saved me a ton of time!
As a courtesy, I did want to let you know that it appears the “US Counties” 500k GeoJSON file might have an incorrect encoding. When trying to load the file using the geopandas “read_file()” function, I get the following error:
    Traceback (most recent call last):
      File "shapely_scratch.py", line 25, in <module>
        us_county = gpd.read_file(us_county_path, driver='GeoJSON')
      File "/home/christnp/.local/lib/python3.6/site-packages/geopandas/io/file.py", line 95, in read_file
        gdf = GeoDataFrame.from_features(f_filt, crs=crs, columns=columns)
      File "/home/christnp/.local/lib/python3.6/site-packages/geopandas/geodataframe.py", line 294, in from_features
        for f in features_lst:
      File "fiona/ogrext.pyx", line 1369, in fiona.ogrext.Iterator.__next__
      File "fiona/ogrext.pyx", line 232, in fiona.ogrext.FeatureBuilder.build
    TypeError: startswith first arg must be bytes or a tuple of bytes, not str
Upon further investigation (i.e., using the json package to load the json file), we can see that it is an encoding issue:
    json.load(open(us_county_path))
    Traceback (most recent call last):
      File "shapely_scratch.py", line 24, in <module>
        json.load(open(us_county_path))
      File "/usr/lib/python3.6/json/__init__.py", line 296, in load
        return loads(fp.read(),
      File "/usr/lib/python3.6/codecs.py", line 321, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 13337350: invalid continuation byte
For now, to overcome this error, I simply added the following to a try/except:

    import json
    import os
    import geopandas as gpd

    # Re-read the file as ISO-8859-1 and write it back out as UTF-8 so geopandas can parse it.
    cur_json = json.load(open(us_county_path, encoding='ISO-8859-1'))
    path, ext = os.path.splitext(us_county_path)
    new_path = path + "_new" + ext
    with open(new_path, "w", encoding='utf-8') as jsonfile:
        json.dump(cur_json, jsonfile, ensure_ascii=False)
    us_county = gpd.read_file(new_path, driver='GeoJSON')
Hope this is helpful :-)
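Depending on your geopandas and fiona versions, you may also be able to pass the encoding straight through to read_file and skip the rewrite step; I have not verified this against the GeoJSON driver, so treat it as a guess.

```python
# Untested alternative: fiona accepts an encoding argument, which geopandas
# forwards, so this may avoid re-encoding the file by hand.
import geopandas as gpd

us_county_path = "gz_2010_us_050_00_20m.json"  # example path to the counties file
us_county = gpd.read_file(us_county_path, encoding='ISO-8859-1')
```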
I received the following note from Peter T in February 2016:
I had to do a little housekeeping before JSON.parse() would correctly parse the data I’m using (with up-to-date Safari), even though the file passed a JSON validator. I want to pass on what I did. I’m using only data for the state of Georgia from the 20m file. I did two things:
- removed new-line characters, \n, between each of the counties, and
- removed several (around twenty) extra sets of square brackets, [ ], within the county coordinates vectors.
JSON.parse() now seems to work fine on the Georgia numbers. The attached text file (CleanGeorgia.txt) is the cleaned version that I am using.
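If, like Peter, you only need one state, a hedged sketch of pulling Georgia out of the 20m counties file and re-serializing it compactly follows; compact output also strips the embedded newlines he mentions, though it does nothing about the extra brackets. The filename and the STATE property name are assumptions about the 2010 files.

```python
# Hedged sketch: keep only Georgia (state FIPS "13") from the 20m counties file
# and write it back out as compact UTF-8 JSON with no embedded newlines.
import json

with open("gz_2010_us_050_00_20m.json", encoding="ISO-8859-1") as src:
    counties = json.load(src)

counties["features"] = [
    feat for feat in counties["features"]
    if feat["properties"].get("STATE") == "13"
]

with open("georgia_counties_20m.json", "w", encoding="utf-8") as dst:
    json.dump(counties, dst, ensure_ascii=False, separators=(",", ":"))
```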
I received the following pointer to an exciting new tool in July 2015:
Your page helped me. Also, Matthew Bloch is putting a lot of effort into his Mapshaper program. I was able to use that to extract State boundaries from the USGS data for US and also to drop all the islands I was not interested in.
I received the following observation from MR in November 2013, so stay alert: I make no promises about the accuracy of these files; I just used the conversion tools listed above.
Was using these, gratefully, but just noticed that California districts are not accurate for the 113th Congress. For instance, in gz_2010_us_500_11_20m.json look in the northern part of the state. Not sure of the accuracy for other states.
I believe the issue is that congressional redistricting following the census is not fully reflected in the 2010 files. If you are depending on congressional boundaries, be warned!