UNC MSA Alumni Database

Pandas
Folium
spaCy
Regex

Project Overview

The UNC MSA Alumni Database project involved creating a comprehensive database of Muslim Students Association alumni dating back to 1952. The goal was to build a network visualization that would help current students connect with alumni in their fields of interest.

Data Collection & Processing

This project involved several data science challenges:

  • Collecting and normalizing historical records from various sources
  • Web scraping LinkedIn profiles to gather current professional information
  • Using natural language processing to categorize alumni by industry and expertise
  • Geocoding alumni locations for spatial visualization
  • Creating a data structure that could be easily maintained and updated
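
As a rough illustration of the normalization step, the sketch below parses a free-text roster line into a name and graduation year using only the standard library. The input formats ("Last, First (Class of 1987)", "First Last 2004"), the `AlumniRecord` class, and the helper names are assumptions made for this example; the actual source records may look quite different.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Four-digit years from 1950 through 2099. The range is an assumption
# chosen to cover records dating back to 1952.
YEAR_RE = re.compile(r"\b(19[5-9]\d|20\d{2})\b")


@dataclass
class AlumniRecord:
    """Hypothetical minimal record: a cleaned name and an optional year."""
    name: str
    grad_year: Optional[int]


def normalize_name(raw: str) -> str:
    """Collapse whitespace, flip 'Last, First' to 'First Last', title-case.

    str.title() is a crude choice: it would turn 'McDonald' into 'Mcdonald',
    so real data would likely need a more careful rule.
    """
    cleaned = " ".join(raw.split())
    if "," in cleaned:
        last, first = [part.strip() for part in cleaned.split(",", 1)]
        cleaned = f"{first} {last}"
    return cleaned.title()


def parse_record(line: str) -> AlumniRecord:
    """Extract the first year found (if any) and treat the rest as the name."""
    match = YEAR_RE.search(line)
    year = int(match.group(1)) if match else None
    name_part = YEAR_RE.sub("", line)
    # Drop parentheses, pipes, and the literal phrase "class of".
    name_part = re.sub(r"[()|]|class of", " ", name_part, flags=re.IGNORECASE)
    return AlumniRecord(normalize_name(name_part), year)


samples = ["  khan,  amina (Class of 1987)", "Omar Siddiqui 2004", "Layla Haddad"]
records = [parse_record(s) for s in samples]
```

Keeping the year optional lets records with no recoverable graduation year stay in the dataset for later manual review instead of being dropped.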

Technical Implementation

I used several Python libraries and tools to process and visualize the data:

  • Pandas: For data cleaning, transformation, and analysis
  • Regex: For pattern matching and text extraction from unstructured sources
  • spaCy: For named entity recognition and text classification
  • Folium: For creating interactive maps showing alumni distribution
  • Proxi: For building the final interactive network visualization
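
To make the Pandas step concrete, here is a minimal sketch of deduplicating records and counting alumni per city, producing the kind of table a Folium map layer could consume. The column names (`name`, `grad_year`, `city`, `industry`), the sample rows, and both helper functions are hypothetical and are not taken from the project's actual schema.

```python
import pandas as pd

# Made-up sample data; the second row is a near-duplicate of the first.
records = pd.DataFrame(
    {
        "name": ["Amina Khan", "amina khan ", "Omar Siddiqui", "Layla Haddad"],
        "grad_year": [1987, 1987, 2004, 2015],
        "city": ["Raleigh, NC", "Raleigh, NC", "Chicago, IL", "Raleigh, NC"],
        "industry": ["Medicine", "Medicine", "Software", "Law"],
    }
)


def clean_records(df: pd.DataFrame) -> pd.DataFrame:
    """Trim and title-case names, then drop rows that repeat name + year."""
    out = df.copy()
    out["name"] = out["name"].str.strip().str.title()
    return out.drop_duplicates(subset=["name", "grad_year"]).reset_index(drop=True)


def alumni_per_city(df: pd.DataFrame) -> pd.DataFrame:
    """Count alumni per city, largest first."""
    return (
        df.groupby("city", as_index=False)
        .size()
        .rename(columns={"size": "alumni_count"})
        .sort_values("alumni_count", ascending=False)
        .reset_index(drop=True)
    )


cleaned = clean_records(records)
by_city = alumni_per_city(cleaned)
```

Each row of `by_city` could then be geocoded once per city rather than once per person, which keeps the number of geocoding requests small.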

Results & Impact

The completed database includes:

  • Over 1,200 alumni records spanning seven decades
  • Comprehensive professional information for 80% of recent graduates
  • Interactive network visualization showing connections between alumni
  • Geographic distribution map highlighting global alumni presence
  • Searchable database allowing current students to find mentors in their field
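
The mentor search and the network connections can be sketched in a few lines of standard-library Python. This is only an illustration: the record fields, the sample alumni, and in particular the rule that two alumni are "connected" when they share an industry are assumptions, since the write-up does not say how connections are defined.

```python
from collections import defaultdict
from itertools import combinations

# Made-up sample records with hypothetical field names.
alumni = [
    {"name": "Amina Khan", "industry": "Medicine", "grad_year": 1987},
    {"name": "Omar Siddiqui", "industry": "Software", "grad_year": 2004},
    {"name": "Layla Haddad", "industry": "Medicine", "grad_year": 2015},
]


def find_mentors(records, industry):
    """Return alumni in the given industry, earliest graduates first."""
    wanted = industry.strip().lower()
    matches = [r for r in records if r["industry"].lower() == wanted]
    return sorted(matches, key=lambda r: r["grad_year"])


def shared_industry_edges(records):
    """Build undirected edges between alumni who share an industry (assumed rule)."""
    by_industry = defaultdict(list)
    for record in records:
        by_industry[record["industry"]].append(record["name"])
    edges = []
    for names in by_industry.values():
        edges.extend(combinations(sorted(names), 2))
    return edges


mentors = find_mentors(alumni, " medicine ")
edges = shared_industry_edges(alumni)
```

An edge list in this form is a common input format for network visualization tools, so the same records can drive both the search and the graph.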

The project has made it easier for the MSA to stay in touch with its alumni and has already led to several mentorship relationships between current students and alumni.