Determining Proximity in Historic Networks

The mechanics of measuring closeness between historic figures

How might it be possible to quantify the 'closeness' between individuals long since dead? Measuring degrees of friendship, or any type of human relationship, is most likely a dicey project, but it is inherently interesting nonetheless. Understanding who an individual called a friend, a co-worker, a romantic interest or a family member hints at their larger network within society. Understanding where a given individual sits within a social network points to the opportunities and constraints they may have had at the time. It's intriguing to think we might gain some insight into a historic figure's decision-making process as we study history.

Because it's not possible to go back in time and conduct a census of the actors within a given network of interest, an alternative approach to gathering data is needed. Rather than conduct in-person surveys, this site looks at large collections of letters and attempts to draw conclusions about the types of relationships between individuals - or nodes in a social network. Read more about how these types of relationships are determined. Beyond the types of relationships, it's also worth determining the strength of relationships, as not everyone in our social networks is created equal - some will naturally have more influence than others.

If you want to understand how an individual is affected by their social network, then the strength of relationship ties is clearly important. Unfortunately, how to measure how close one person is to another (their proximity) is not so clear. What's needed is a ruleset to determine the strength of relationships and a mechanism to represent that measurement succinctly. For the purposes of this site, an individual's (the ego's) proximity to another person is derived from the letters the ego wrote. Each letter can generate multiple 'votes' by the ego for one or more people in their network (alters) in two distinct ways: first, by the very act of the ego writing a letter to an alter, and second, whenever the ego mentions an alter in the letter. For example, if George Washington writes a letter to his wife Martha Washington and in the letter mentions Thomas Jefferson two times, John Adams three times, and Alexander Hamilton once, then a total of 7 'proximity votes' will be generated: one for writing to Martha and six for the mentions. However, not every vote is considered equal, since a letter to a friend is inherently stronger than a letter to an acquaintance. The table below illustrates the relative strength of proximity votes, where a value of 1 is strongest.

Relative Strength of Proximity Votes

          Relationship Category
Action    Family  Friendship  Romantic  Professional  Geographic  Physical  None
Write     1       1           1         2             3           3         4
Mention   2       2           2         3             4           4         4

Note that the above relationship categories are borrowed from the XFN standard.
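
The weights in the table above can be captured as a simple lookup. The following is a minimal sketch rather than the site's actual code: the function names (`vote_weight`, `letter_votes`) are hypothetical, and the relationship categories assigned to Jefferson, Adams and Hamilton are assumptions made purely for illustration.

```python
# Weights transcribed from the table above (1 = strongest, 4 = weakest).
CATEGORIES = ["family", "friendship", "romantic", "professional",
              "geographic", "physical", "none"]
WRITE_WEIGHTS = dict(zip(CATEGORIES, [1, 1, 1, 2, 3, 3, 4]))
MENTION_WEIGHTS = dict(zip(CATEGORIES, [2, 2, 2, 3, 4, 4, 4]))


def vote_weight(action, category):
    """Look up the weight of one vote for an action ('write' or 'mention')."""
    if action == "write":
        return WRITE_WEIGHTS[category]
    if action == "mention":
        return MENTION_WEIGHTS[category]
    raise ValueError(f"unknown action: {action!r}")


def letter_votes(recipient, mentions, relationships):
    """Return one (alter, weight) vote for the recipient and one per mention.

    `mentions` maps an alter's name to how often the letter mentions them;
    `relationships` maps an alter's name to the ego's relationship category.
    Alters with no known relationship fall back to 'none'.
    """
    votes = [(recipient, vote_weight("write", relationships.get(recipient, "none")))]
    for alter, count in mentions.items():
        weight = vote_weight("mention", relationships.get(alter, "none"))
        votes.extend([(alter, weight)] * count)
    return votes


# The Washington letter from the text; the categories are assumed, not sourced.
relationships = {
    "Martha Washington": "family",
    "Thomas Jefferson": "professional",
    "John Adams": "professional",
    "Alexander Hamilton": "professional",
}
votes = letter_votes(
    "Martha Washington",
    {"Thomas Jefferson": 2, "John Adams": 3, "Alexander Hamilton": 1},
    relationships,
)
print(len(votes))  # 7 proximity votes, matching the example in the text
```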

To determine how close an ego is to an alter, all the proximity votes in all of the ego's letters are tallied per alter and compiled into an array with four indexes, one for each vote weight. Each index holds the number of votes cast with the corresponding weight, starting with the strongest. For example, if across all of George Washington's letters he mentioned his friend the Marquis de Lafayette 5 times and wrote to him 16 times, the proximity array representing their relationship would be [16, 5, 0, 0]. An example of a very weak relationship would be an individual who writes to a third person and just happens to mention the Marquis de Lafayette but doesn't know him personally, work with him or have any other connection whatsoever. This weak relationship would be represented by a proximity array of [0, 0, 0, 1] - assuming the Marquis' name was mentioned just once.
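
As a rough sketch of the tallying step, assuming each vote has already been reduced to an (alter, weight) pair with a weight from 1 to 4, a proximity array can be built by counting votes per weight. The function name `proximity_arrays` is hypothetical.

```python
from collections import defaultdict


def proximity_arrays(votes):
    """Tally (alter, weight) votes into one four-slot array per alter.

    Slot 0 counts weight-1 votes (strongest), slot 3 counts weight-4 votes.
    """
    arrays = defaultdict(lambda: [0, 0, 0, 0])
    for alter, weight in votes:
        arrays[alter][weight - 1] += 1
    return dict(arrays)


# Washington wrote to his friend Lafayette 16 times (weight 1)
# and mentioned him 5 times (weight 2).
washington_votes = [("Lafayette", 1)] * 16 + [("Lafayette", 2)] * 5
print(proximity_arrays(washington_votes))  # {'Lafayette': [16, 5, 0, 0]}

# A stranger mentions Lafayette once with no relationship (weight 4).
stranger_votes = [("Lafayette", 4)]
print(proximity_arrays(stranger_votes))  # {'Lafayette': [0, 0, 0, 1]}
```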

Proximity arrays are the mechanism that makes ranking relationships between egos and their alters achievable. Arrays are a compact data structure common in computer science, so sorting an ego's relationships, once we've quantified them, becomes trivial.
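
One way such a ranking could work is sketched below. It assumes arrays are compared slot by slot from the strongest weight down, so that more strong votes always outrank any number of weaker ones; the site's exact ordering rule is not stated here, and the "Colleague" entry is invented for illustration.

```python
# Proximity arrays keyed by alter; "Colleague" is a made-up example.
relationships = {
    "Stranger": [0, 0, 0, 1],
    "Lafayette": [16, 5, 0, 0],
    "Colleague": [0, 3, 7, 0],
}

# Python compares lists element by element, left to right, so sorting in
# descending order puts the alter with the most strong votes first.
ranked = sorted(relationships, key=relationships.get, reverse=True)
print(ranked)  # ['Lafayette', 'Colleague', 'Stranger']
```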

The extent of what this method can determine about a person and their network depends entirely on how many letters are available for examination. At best, what is presented is an incomplete view of an individual's social network. More letters to a greater breadth of alters result in a more accurate view.

It's possible to export an ego's proximity array for your own use. An excellent, free tool for analyzing social network data is Gephi, which can import several file formats - one of which is the XML-based .gexf format. To export ego nodes and relationship edges for your own use in Gephi, simply use our API. For example:

http://www.familytales.org/gexf/proximity/?people=hbs,jrl

will generate an XML file with relationship edges for Harriet Beecher Stowe and James Russell Lowell. The XML file also contains node entries for all the alters they write to or mention. More information about exporting data from this site.
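
If you would rather work with the export in a script than in Gephi, something along these lines may help. It is only a sketch: the helper names are hypothetical, and `SAMPLE_GEXF` is a hand-written stand-in that follows the general GEXF layout of nodes and edges, not actual output from this site, whose attributes and GEXF version may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://www.familytales.org/gexf/proximity/"


def proximity_url(people):
    """Build the export URL for a list of person identifiers such as 'hbs'."""
    return BASE_URL + "?" + urlencode({"people": ",".join(people)}, safe=",")


def local_name(tag):
    """Drop any '{namespace}' prefix so the GEXF version does not matter."""
    return tag.rsplit("}", 1)[-1]


def parse_gexf(text):
    """Return ({node id: label}, [(source, target), ...]) from GEXF text."""
    nodes, edges = {}, []
    for element in ET.fromstring(text).iter():
        name = local_name(element.tag)
        if name == "node":
            nodes[element.get("id")] = element.get("label")
        elif name == "edge":
            edges.append((element.get("source"), element.get("target")))
    return nodes, edges


# Hypothetical sample for illustration; a real export would be fetched from
# proximity_url([...]), for example with urllib.request.urlopen.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="hbs" label="Harriet Beecher Stowe"/>
      <node id="jrl" label="James Russell Lowell"/>
    </nodes>
    <edges>
      <edge id="0" source="hbs" target="jrl"/>
    </edges>
  </graph>
</gexf>"""

print(proximity_url(["hbs", "jrl"]))
nodes, edges = parse_gexf(SAMPLE_GEXF)
print(nodes)
print(edges)
```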

Export as much information as you want and be sure to tell us about your interesting projects. Nodes and edges change occasionally - so check back for updates.