I guess we can call this part two of the global connectivity by flight experiment, which I have been working on the past couple of days. Once again, the data comes from openflights.org/data.html and is available for free to the public. It is a very detailed set of data that requires just a little bit of massaging to be usable in a desktop work environment. In this blog post I would like to take you through some quick helpful tips that should get you from downloading the data and cleaning it up to mapping it in ArcMap. In all honesty, I would have loved to completed this process in entirely open source applications but I had to resort to ArcMap’s “XY to Line” tool. I wish I could have replicated the result I’ve achieved using QGIS but alas, I am more familiar with ArcMap’s geoprocessing capabilities and so I fall back to my student licensed Arc product when I need to.
Fifty-eight thousand, two-hundred and eighty-eight flight routes
Yup! 58,288 flight routes… 1 Earth. If that’s not an impressive number I’m not sure what is! The number is almost as impressive as the map created by connecting all of the flight origins and destinations:
Without a basemap of the continents, we can clearly make out the shape of many major regions of the world. Also note that we can identify many unique spatial patterns of settlement and urbanization. Flights that leave or arrive in South America, Africa or Australia generally tend to fly to/from the coast. This is very reflective of the settlement patterns in these regions. The United States has flights arriving and departing all over the country, where as Canada has most flights located along the southern Canada-USA border (with a few continental flights travelling into the North). We can see that Australia has major urban centres spread out over its large land mass and that continental flights are very popular in Australia, connecting almost every coastal region to each other with very few flights landing in central Australia (what would be the point, right?).
We could talk about how flight routes reflect the settlement patterns all day but then I would never get to showing you how to make a map similar to this!
I will assume you have the following two applications installed on your computer:
- ArcMap 9.3x+ or 10.x+
- OpenOffice.org (OOo) or your favourite spreadsheet editor
Let's get started!
- download both airports.dat and routes.dat from openflights.org/data.html
- change the extension from ‘.dat’ to ‘.csv’ on both files
- create a new spreadsheet in OOo with two sheets/tabs
- open airports.csv, copy and paste the data into sheet 1
- re-name sheet 1 to airports
- open routes.csv and paste the data into sheet 2
- re-name sheet 2 to routes
Now to clean up the data:
- in sheet.airports: delete all columns BUT those containing the 3-letter IATA/FAA codes or 4-letter ICAO codes & latitude and longitude (columns 5–8)
- in sheet.routes: delete all columns BUT those containing origin and destination airport codes (columns 3 and 5)
- insert a new row above all existing data and input appropriate field names. ie sheet.airports{IATA_FAA, ICAO, Lat, Long}, sheet.routes{Orig, Dest}
Querying Route Origin and Destination XY Coordinates
What we are going to do is query sheet.airports from sheet.routes to obtain latitude and longitude values for each origin and destination of every route (wow, mouthful).
Working in sheet.routes:
- create 4 column field names {Orig_Lat, Orig_Long, Dest_Lat, Dest_Long} *should be cells C1:F:1*
- in cell C2, insert function:
=VLOOKUP($A2;airports.$A$2:$D$6630;3;0) - ‘fill’ the equation down the table by double-clicking in the bottom right hand of the cell
- in cell D2, insert function:
=VLOOKUP($A2;airports.$A$2:$D$6630;4;0) - ‘fill’ the equation down
- in cell E2, insert function:
=VLOOKUP($B2;airports.$A$2:$D$6630;3;0) - ‘fill’ the equation down
- in cell F2, insert function:
=VLOOKUP($B2;airports.$A$2:$D$6630;4;0) - ‘fill’ the equation down
The function VLOOKUP, allows us to ‘lookup’ cell values in arrays located inside your spreadsheet. In words =VLOOKUP($A2;aiports.$A$2:$D$6630;3;0) means, identify the value in cell A2 and locate it in sheet.aiports within the array A2:D6630. If found, the expression returns the value located in the 3rd column (column C; where column 1 = A; the first column in your array). What we are attempting to do is locate the route origin code in A2 and find it in sheet.aiports array A2:D6630. If it finds a match, it will take the value located in the same row albeit the 3rd column over as the expressions value. A slight modification to the function is needed to calculate the latitude of route origins and lat/long of route destinations.
More info on VLOOKUP here.
Note:
There a few things we must do to combat a few issues that may arise later on when we begin geoprocessing. There are approx 500 entries which returned no lat/long values for route origin, destination or both. These need to be deleted before geoprocessing so go ahead and sort your columns numerically, find those #N/A entries and delete the entire row. Also, we will have some difficulty geoprocessing 58K entries at once, so its best we split up our table into 6 separate tables. I split my table up into 10,000 entry chunks. Make sure when you do this that you don’t leave the field headings (row 1) out by accident on each subsequent table! Saving the tables as .csv allows us to open them up in ArcMap and perform geoprocessing operations on them.
In ArcMap:
- create a new filegeodatabase and name it appropriately (I’ll call it openflights)
- add all 6 tables to a new project
- search for the tool “XY to Line”, located in the data management toolbox, open it up and fill it out like so:
- input table > table 1
- output file > ‘openflights.gdb/routes_1’
- Start X > Orig_Long
- Start Y > Orig_Lat
- End X > Dest_Long
- End Y > Dest_Lat
- Method > Great Circle
- Rinse and repeat for all 6 tables
- Using the ‘Merge’ tool located in the analysis toolbox, merge all 6 route polyline shapefiles
- input files > routes_1 to routes_6
- field map > right-click on ‘shape length’ > merge rule > join
- output file > ‘openflights.gdb/routes_merged’
With all that said and done, you should have a flight route network of 58 thousand polylines! Fiddle around with the symbology and have some fun producing an eye-catching map!
Cheers!


Hey,
Nice post. Thanks for linking to the original data. I’m looking forward to exploring it.
While you link to the data works for the part 1 post, the link above is broken.
Regards,
Cyrille
Thanks Cyrille!
~ fixed the link
[…] out the update to this post here! Related Posts!4 Sep ’11 — Global Connectivity Revisited; mapping out 58,288 flight […]
Hi Michael,
Thank you to post this kind of tutorials.
I’m just starting with the whole GIS thing, so as a newbie I have some questions:
Can this be done using QGis?
if yes, Could you please take the time to make a version of this tutorial using QGis?
It is valid to use just some of the flights? for example, if I’m only interested in the flights from Venezuela to Cuba, how do I can use only that data to make the map?
Thanks in advance and sorry for bother you with this newbie questions.
Many
Hey Manuel,
Unfortunately, I don’t have a method for completing this in QGIS. I am not familiar with an equivalent QGIS tool for ArcGIS ‘xy to line’. But I can however direct you to this discussion which talks about mapping flight routes using a combination of QGIS and a PostGIS database http://underdark.wordpress.com/2011/08/20/visualizing-global-connections/
Cheers,
Michael
Manuel,
A follow up…
A combination of QGIS and GRASS will allow you to convert xy coordinates to line features. Take a look at this wiki page. http://grass.osgeo.org/wiki/Convert_points_to_lines