| <HTML> |
| <!-- |
| Copyright (c) Jeremy Siek 2000 |
| |
| Distributed under the Boost Software License, Version 1.0. |
| (See accompanying file LICENSE_1_0.txt or copy at |
| http://www.boost.org/LICENSE_1_0.txt) |
| --> |
| <Head> |
| <Title>Kevin Bacon Example</Title> |
| <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b" |
| ALINK="#ff0000"> |
| <IMG SRC="../../../boost.png" |
| ALT="C++ Boost" width="277" height="86"> |
| |
| <BR Clear> |
| |
| <H1>Six Degrees of Kevin Bacon</H1> |
| |
| <P> |
| A fun application of graph theory comes up in the popular game ``Six |
| Degrees of Kevin Bacon''. The idea of the game is to connect an actor |
| to Kevin Bacon through a trail of actors who appeared together in |
| movies, and do so in less than six steps. For example, Theodore |
| Hesburgh (President Emeritus of the University of Notre Dame) was in |
| the movie ``Rudy'' with the actor Gerry Becker, who was in the movie |
| ``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the |
| three students who invented the game, Mike Ginelli, Craig Fass, and |
| Brian Turtle decided that Kevin Bacon is the center of the |
| entertainment world. The Kevin Bacon game is quite similar to the <a |
| href="http://www.oakland.edu/~grossman/erdoshp.html">Erdös |
| Number</a> which has ``been a part of the folklore of |
| mathematicians throughout the world for many years''. |
| |
| <P> |
| The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If |
| you assign each actor to a vertex, and add an edge between two actors |
| if they have appeared together in a movie, then you have a graph that |
| represents the data for this game. Then the problem of finding a trail |
| of actors to Kevin Bacon becomes a traditional graph problem: that of |
| finding a <I>path</I> between two vertices. Since we wish to find a |
| path that is shorter than six steps, ideally we would find the |
| <I>shortest path</I> between the vertices. One might think to apply |
| Dijkstra's shortest path algorithm to this problem, but that would be |
| overkill since Dijkstra's algorithm is meant for situations when each |
| edge has an associated length (or weight) and the goal is to find the |
| path with the shortest cumulative length. Since we are only concerned |
| with finding the shortest paths in terms of the number of edges the |
| breadth-first search algorithm will solve the problem (and use less |
| time than Dijkstra's). |
| |
| <P> |
| |
| <H2>Input File and Graph Setup</H2> |
| |
| <P> |
| For this example we will use a much smaller graph than all movies and |
| all actors. The full source code for this example is in <a |
| href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>. |
| We have supplied a file <TT>kevin-bacon.dat</TT> which contains a list |
| of actor pairs who appeared in the same movie. Each line of the file |
| contains an actor's name, a movie, and another actor that appeared in |
| the movie. The ``;'' character is used as a separator. Here is an |
| excerpt. |
| |
| <P> |
| <PRE> |
| Patrick Stewart;Prince of Egypt, The (1998);Steve Martin |
| </PRE> |
| |
| <P> |
| Our first task will be to read in the file and create a graph from |
| it. We use a <TT>std::ifstream</TT> to input the file. |
| |
| <P> |
| <PRE> |
| std::ifstream datafile("./kevin-bacon.dat"); |
| if (!datafile) { |
| std::cerr << "No ./kevin-bacon.dat file" << std::endl; |
| return EXIT_FAILURE; |
| } |
| </PRE> |
| |
| <P> |
| We will want to attach the actor's names to the vertices in the graph, |
| and the movie names to the edges. We use the <TT>property</TT> class to |
| specify the addition of these vertex and edge properties to the graph. |
| We choose to use an undirected graph, because the relationship of |
| actors appearing together in a movie is symmetric. |
| |
| <P> |
| <PRE> |
| using namespace boost; |
| typedef adjacency_list<vecS, vecS, undirectedS, |
| property<vertex_name_t, string>, |
| property<edge_name_t, string > > |
| > Graph; |
| </PRE> |
| |
| <P> |
| To access the properties, we will need to obtain property map |
| objects from the graph. The following code shows how this is done. |
| |
| <P> |
| <PRE> |
| property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g); |
| property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g); |
| </PRE> |
| |
| <P> |
| Next we can start to loop through the file, parsing each line into a |
| sequence of tokens using the <a href="../../tokenizer/index.html">Boost |
| Tokenizer Library</a>. |
| |
| <P> |
| <PRE> |
| for (std::string line; std::getline(datafile, line);) { |
| char_delimiters_separator<char> sep(false, "", ";"); |
| tokenizer<> line_toks(line, sep); |
| ... |
| } |
| </PRE> |
| |
| <P> |
| Each line of the input corresponds to an edge in the graph, with the |
| two actors as the vertices incident to the edge, and the movie name |
| will be a property attached to the edge. One challenge in creating the |
| graph from this format of input is that the edges are specified by a |
| pair of actor names. As we add vertices to the graph, we'll need to |
| create a map from actor names to their vertices so that later |
| appearances of the same actor (on a different edge) can be linked with |
| the correct vertex in the graph. The <a |
| href="http://www.sgi.com/tech/stl/Map.html"><TT>std::map</TT></a> |
| can be used to implement this mapping. |
| |
| <P> |
| <PRE> |
| typedef graph_traits<Graph>::vertex_descriptor Vertex; |
| typedef std::map<string, Vertex> NameVertexMap; |
| NameVertexMap actors; |
| </PRE> |
| |
| <P> |
| The first token will be an actor's name. If the actor is not already |
| in our actor's map we will add a vertex to the graph, set the name |
| property of the vertex, and record the vertex descriptor in the map. |
| If the actor is already in the graph, we will retrieve the vertex |
| descriptor from the map's <TT>pos</TT> iterator. |
| |
| <P> |
| <PRE> |
| NameVertexMap::iterator pos; |
| bool inserted; |
| Vertex u, v; |
| |
| tokenizer<>::iterator i = line_toks.begin(); |
| std::string actors_name = *i++; |
| |
| tie(pos, inserted) = actors.insert(std::make_pair(actors_name, Vertex())); |
| if (inserted) { |
| u = add_vertex(g); |
| actor_name[u] = actors_name; |
| pos->second = u; |
| } else |
| u = pos->second; |
| </PRE> |
| |
| <P> |
| The second token is the name of the movie, and the third token is the |
| other actor. We use the same technique as above to insert the second |
| actor into the graph. |
| |
| <P> |
| <PRE> |
| std::string movie_name = *i++; |
| |
| tie(pos, inserted) = actors.insert(std::make_pair(*i, Vertex())); |
| if (inserted) { |
| v = add_vertex(g); |
| actor_name[v] = *i; |
| pos->second = v; |
| } else |
| v = pos->second; |
| </PRE> |
| |
| <P> |
| The final step is to add an edge connecting the two actors, and record |
| the name of the connecting movie. |
| |
| <P> |
| <PRE> |
| graph_traits<Graph>::edge_descriptor e; |
| tie(e, inserted) = add_edge(u, v, g); |
| if (inserted) |
| connecting_movie[e] = movie_name; |
| </PRE> |
| |
| <P> |
| |
| <H2>A Custom Visitor Class</H2> |
| |
| <P> |
| We create a custom visitor class to record the Kevin Bacon |
| numbers. The Kevin Bacon number will be the shortest distances from |
| each actor to Kevin Bacon. This distance has also been referred to as |
| the ``Bacon Number'' in memory of the ``Erdos Number'' used to rank |
| mathematicians according to how many publications separate them from |
| Paul Erdos. The shortest distance to from Kevin Bacon to each actor |
| will follow the breadth-first tree. The BFS algorithm identifies edges |
| that belong to the breadth-first tree and calls our visitor's |
| <tt>tree_edge</tt> function. We record the distance to the target |
| vertex as one plus the distance to the source vertex. |
| |
| |
| <PRE> |
| template <typename DistanceMap> |
| class bacon_number_recorder : public default_bfs_visitor |
| { |
| public: |
| bacon_number_recorder(DistanceMap dist) : d(dist) { } |
| |
| template <typename Edge, typename Graph> |
| void tree_edge(Edge e, const Graph& g) const |
| { |
| typename graph_traits<Graph>::vertex_descriptor |
| u = source(e, g), v = target(e, g); |
| d[v] = d[u] + 1; |
| } |
| private: |
| DistanceMap d; |
| }; |
| // Convenience function |
| template <typename DistanceMap> |
| bacon_number_recorder<DistanceMap> |
| record_bacon_number(DistanceMap d) |
| { |
| return bacon_number_recorder<DistanceMap>(d); |
| } |
| </PRE> |
| |
| <P> |
| We will use a vector to store the bacon numbers. |
| |
| <P> |
| <PRE> |
| std::vector<int> bacon_number(num_vertices(g)); |
| </PRE> |
| |
| <H2>Apply the Breadth-First Search</H2> |
| |
| <P> |
| The BFS algorithm requires a source vertex as input; of course this |
| will be Kevin Bacon. We now call the BGL BFS algorithm, passing in the |
| graph, source vertex, and the visitor. |
| |
| <P> |
| <PRE> |
| Vertex src = actors["Kevin Bacon"]; |
| bacon_number[src] = 0; |
| |
| breadth_first_search(g, src, visitor(record_bacon_number(&bacon_number[0]))); |
| </PRE> |
| |
| <P> |
| We can output the Bacon number for each actor simply by looping |
| through all the vertices in the graph and use them to index into the |
| <TT>bacon_number</TT> vector. |
| |
| <P> |
| <PRE> |
| graph_traits<Graph>::vertex_iterator i, end; |
| for (boost::tie(i, end) = vertices(g); i != end; ++i) |
| std::cout << actor_name[*i] << "'s bacon number is " |
| << bacon_number[*i] << std::endl; |
| </PRE> |
| |
| <P> |
| Note that vertex descriptor objects can not always be used as indices |
| into vectors or arrays such as <TT>bacon_number</TT>. This is valid |
| with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>, |
| but not with other variations of <TT>adjacency_list</TT>. A more |
| generic way to index based on vertices is to use the ID property |
| map (<TT>vertex_index_t</TT>) in coordination with the <A |
| HREF="../../property_map/doc/iterator_property_map.html"><TT>iterator_property_map</TT></a>. |
| |
| <P> |
| Here are some excepts from the output of the program. |
| <PRE> |
| William Shatner's bacon number is 2 |
| Denise Richards's bacon number is 1 |
| Kevin Bacon's bacon number is 0 |
| Patrick Stewart's bacon number is 2 |
| Steve Martin's bacon number is 1 |
| ... |
| </PRE> |
| |
| |
| |
| |
| <br> |
| <HR> |
| <TABLE> |
| <TR valign=top> |
| <TD nowrap>Copyright © 2000-2001</TD><TD> |
| <A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>) |
| </TD></TR></TABLE> |
| |
| </BODY> |
| </HTML> |
| <!-- LocalWords: gif Hesburgh Ginelli Fass Erd vertices cerr namespace vecS |
| --> |
| <!-- LocalWords: undirectedS Tokenizer sep tokenizer toks NameVertexMap pos |
| --> |
| <!-- LocalWords: bool Erdos BFS typename DistanceMap bfs const num BGL src |
| --> |
| <!-- LocalWords: indices Shatner's Richards's siek htm |
| --> |