UNC MSA Alumni Database
Pandas
Folium
spaCy
Regex

Project Overview
The UNC MSA Alumni Database is a comprehensive database of Muslim Students Association alumni dating back to 1952. The goal was to build a network visualization that helps current students connect with alumni in their fields of interest.
Data Collection & Processing
This project involved several data science challenges:
- Collecting and normalizing historical records from various sources
- Web scraping LinkedIn profiles to gather current professional information
- Using natural language processing to categorize alumni by industry and expertise
- Geocoding alumni locations for spatial visualization
- Creating a data structure that could be easily maintained and updated
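As an illustration of the normalization step listed above, here is a minimal sketch that uses only the standard-library `re` module. The sample records, the field names, and the two-digit-year rule (treating '52 and later as 19xx, since the records start in 1952) are assumptions made for the example, not the project's actual data or code.

```python
import re

# Hypothetical raw entries, as they might appear in old rosters or newsletters.
RAW_RECORDS = [
    "  KHAN, Ayesha  (Class of 1987) ",
    "Ayesha Khan '87",
    "omar  farouk, class of 2004",
]

YEAR_4 = re.compile(r"\b(19[5-9]\d|20\d{2})\b")  # four-digit years from 1950 on
YEAR_2 = re.compile(r"'(\d{2})\b")               # abbreviated years such as '87


def parse_record(raw: str) -> dict:
    """Extract a cleaned name and class year from one free-text entry."""
    text = raw.strip()
    year = None
    match = YEAR_4.search(text)
    if match:
        year = int(match.group(1))
    else:
        match = YEAR_2.search(text)
        if match:
            two = int(match.group(1))
            # Assumption: '52 and later means 19xx, anything lower means 20xx.
            year = 1900 + two if two >= 52 else 2000 + two

    # Drop the year decorations so that only the name remains.
    name = re.sub(r"\(?\bclass of\b\s*\d{4}\)?", "", text, flags=re.IGNORECASE)
    name = re.sub(r"'\d{2}\b", "", name).strip(" ,")

    # Convert "LAST, First" into "First Last".
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        name = f"{first} {last}"

    # Collapse repeated spaces; title() is a simplification that would
    # mishandle names such as "McDonald".
    name = " ".join(name.split()).title()
    return {"name": name, "class_year": year}


def dedupe(raw_records: list) -> list:
    """Keep the first record seen for each (name, class_year) pair."""
    seen = {}
    for raw in raw_records:
        rec = parse_record(raw)
        seen.setdefault((rec["name"], rec["class_year"]), rec)
    return list(seen.values())


clean_records = dedupe(RAW_RECORDS)
```

In this sketch the first two sample entries collapse into a single record, because they normalize to the same name and class year.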
Technical Implementation
I used several Python libraries and tools to process and visualize the data:
- Pandas: For data cleaning, transformation, and analysis
- Regex: For pattern matching and text extraction from unstructured sources
- spaCy: For named entity recognition and text classification
- Folium: For creating interactive maps showing alumni distribution
- Proxi: For building the final interactive network visualization
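To show how Pandas and Folium can fit together for the distribution map, here is a rough sketch. The column names, sample rows, coordinates, marker sizing, and output filename are all assumptions made for illustration; the real pipeline may be organized differently.

```python
import pandas as pd

# Hypothetical geocoded alumni rows; the column names are assumptions.
alumni = pd.DataFrame(
    {
        "name": ["A. Khan", "O. Farouk", "S. Malik"],
        "city": ["Chapel Hill", "Chapel Hill", "London"],
        "lat": [35.9132, 35.9132, 51.5074],
        "lon": [-79.0558, -79.0558, -0.1278],
    }
)


def count_by_city(df: pd.DataFrame) -> pd.DataFrame:
    """Count alumni per location so each city gets a single marker."""
    return df.groupby(["city", "lat", "lon"], as_index=False).agg(
        alumni_count=("name", "count")
    )


def build_map(city_counts: pd.DataFrame, out_path: str = "alumni_map.html"):
    """Draw one circle per city, sized by alumni count, and save it as HTML."""
    # Imported here so the aggregation above can run without Folium installed.
    import folium

    fmap = folium.Map(location=[20, 0], zoom_start=2)
    for row in city_counts.itertuples(index=False):
        count = int(row.alumni_count)
        folium.CircleMarker(
            location=[float(row.lat), float(row.lon)],
            radius=4 + 2 * count,
            popup=f"{row.city}: {count} alumni",
            fill=True,
        ).add_to(fmap)
    fmap.save(out_path)
    return fmap


city_counts = count_by_city(alumni)
```

Calling `build_map(city_counts)` would write an interactive HTML map to disk. Aggregating before plotting keeps the marker count low when many alumni share a city.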
Results & Impact
The completed database includes:
- Over 1,200 alumni records spanning seven decades
- Comprehensive professional information for 80% of recent graduates
- Interactive network visualization showing connections between alumni
- Geographic distribution map highlighting global alumni presence
- Searchable database allowing current students to find mentors in their field
This project has significantly enhanced the MSA's ability to maintain alumni connections and has already facilitated several mentorship relationships between current students and alumni.
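The mentor lookup described above could be approximated with a simple Pandas filter. This is only a sketch: the `find_mentors` helper, the column names, and the sample rows are hypothetical, and the actual search interface is not described here.

```python
import pandas as pd

# Hypothetical alumni table; names, industries, and years are made up.
alumni_directory = pd.DataFrame(
    {
        "name": ["A. Khan", "O. Farouk", "S. Malik", "N. Rahman"],
        "industry": ["Healthcare", "Software", "Healthcare", None],
        "class_year": [1987, 2004, 2015, 1999],
    }
)


def find_mentors(df: pd.DataFrame, industry: str, since_year=None) -> pd.DataFrame:
    """Return alumni whose industry contains the search term, newest first."""
    # Case-insensitive literal match; rows with no industry never match.
    mask = df["industry"].str.contains(industry, case=False, na=False, regex=False)
    if since_year is not None:
        mask &= df["class_year"] >= since_year
    return (
        df[mask]
        .sort_values("class_year", ascending=False)
        .reset_index(drop=True)
    )


healthcare_mentors = find_mentors(alumni_directory, "health")
recent_healthcare = find_mentors(alumni_directory, "health", since_year=2000)
```

Matching on a literal substring rather than a regular expression keeps user-entered search terms from being interpreted as patterns.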