Thursday, October 20, 2016

GHC16: How the World Bank uses Social Network Analysis to Support Entrepreneurship

Kathy Qian, Data Scientist and Innovations Lab Consultant, World Bank

Kathy's session centered on the work she does for the World Bank in gathering and analyzing data to support entrepreneurship, particularly in developing countries. The World Bank is a consortium of about 180 countries that have pooled money and resources together with the express intention of eradicating poverty - so while the World Bank is a bank, it doesn't really act as one :-). Given that goal, it's a little hard to see why the World Bank cares about tech start-ups. There's a simple explanation - policymakers around the world are very interested in creating jobs, and one sector that's been churning out more jobs than any other is the tech sector. In fact, only about 10% of those jobs actually involve working directly on technology - often, the jobs are created as a by-product of something that happens in the tech sector. For example, in the recent past, tons of people have become Uber drivers and Airbnb hosts.
Even accounting for population, Kathy's data set shows that growth rates in entrepreneurship differ across cities - for example, Bogota's growth rate is much lower than NYC's.
While policymakers tend to look at the discrete pieces of infrastructure that feed into a city's growth from the entrepreneurship perspective, the World Bank looks at the community as a whole. For instance, they take into account factors like the impact of community-building events (conferences), skill-building events and so on. Kathy also explained the difference between social media analysis, which involves looking at data from social media like Facebook or Twitter, and social network analysis, which looks at nodes (i.e., organizations or start-ups) and edges (the relationships between those organizations).

The World Bank started off their research by looking at different types of centrality in social network analysis, such as degree centrality (how many people one is directly connected to), closeness centrality (how few hops away one is from everyone else in the network), and eigenvector centrality (is the person that you're connected to very well-connected?).
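To make the three measures concrete, here is a minimal sketch in plain Python. The network below is a made-up toy example (the node names and edges are assumptions for illustration, not the World Bank's data), and the eigenvector score is approximated with a simple power iteration rather than whatever tooling the team actually used.

```python
from collections import deque

# Hypothetical toy network: nodes are organizations, edges are relationships.
EDGES = [
    ("Incubator", "StartupA"),
    ("Incubator", "StartupB"),
    ("Incubator", "StartupC"),
    ("StartupA", "StartupB"),
    ("StartupC", "Investor"),
]


def build_graph(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for source in graph:
        # Breadth-first search gives shortest hop counts in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nbr in graph[current]:
                if nbr not in dist:
                    dist[nbr] = dist[current] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def eigenvector_centrality(graph, iterations=100):
    """Power iteration: a node scores highly if its neighbours score highly.

    Each step adds the node's own score to its neighbours' scores (i.e. it
    iterates on A + I), which keeps the same eigenvectors as the adjacency
    matrix A but avoids oscillation on bipartite graphs.
    """
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new = {
            node: score[node] + sum(score[nbr] for nbr in graph[node])
            for node in graph
        }
        norm = sum(v * v for v in new.values()) ** 0.5
        score = {node: v / norm for node, v in new.items()}
    return score


graph = build_graph(EDGES)
for name, measure in [
    ("degree", degree_centrality(graph)),
    ("closeness", closeness_centrality(graph)),
    ("eigenvector", eigenvector_centrality(graph)),
]:
    top = max(measure, key=measure.get)
    print(f"{name:12s} highest: {top} ({measure[top]:.3f})")
```

In this toy network the incubator happens to rank first on all three measures. That is a property of the made-up edges only and says nothing about the real findings described in the talk.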

One of the major problems that they face is that there is no good data source containing information on entrepreneurship. Developing countries in particular have a dearth of data. To overcome this, the World Bank first looks at cities that they are familiar with and that have a rich corpus of data (for example, New York City). (While Silicon Valley would be an obvious choice when it comes to entrepreneurship statistics, it is also unique in that it grew without much help or intervention from policymakers. So data from Silicon Valley isn't used in this study. New York, on the other hand, does have policymakers playing a huge role in its success, and was therefore a good choice.) Some stats that came to light - each founder starts about 1.12 companies, incubators have the highest eigenvector centrality and start-ups the highest closeness centrality. The World Bank even ran regressions to estimate things such as where start-ups are most likely to be founded. They were also able to show that you have a better chance of being funded if you already know someone who has received funding themselves (this way, you'll know the right people to get in touch with). This data is used in conjunction with data gathered from 12 cities around the world (Cairo, Bogota, Dar es Salaam, etc).

Other factors that the World Bank looks into include the skills pipeline (are university students graduating with the right skill sets for the job market, do they tend to emigrate after graduation, and so on) and supporting infrastructure, such as the kind of support the government gives entrepreneurs (in Colombia, for instance, start-ups receive a ton of public funding, so they often go from one program to another without actually having to make any money).

There are also several challenges that the team faces in using data for social good. For example, there's no rule of thumb for analyzing data from different countries, even for something as simple as an address. Also, people tend to deeply distrust the government (in one instance, entrepreneurs refused to answer any World Bank surveys in the mistaken belief that they were going to be audited by the government!). Privacy laws also vary widely between countries, which makes anonymizing data that much more difficult. Plus, different cultural norms mean that what is learned in one place often can't be applied directly in another. Agile testing and iteration are also next to impossible. Finally, there is often a data literacy gap between the various stakeholders.

Kathy believes that using data analytics to bring about social change is of prime importance, and invited those who share that belief to get in touch with her.

In summary, Kathy's presentation was both insightful and inspirational, and served to show how data analysis can be used for the greater good.

(This post was syndicated from
