Shahan Ali Memon شاہان علی میمن

Present: I currently hold the position of Research Associate at New York University Abu Dhabi, under the guidance of Professor Bedoor AlShebli. My research focuses on the intricate dynamics of human interaction and networks, seeking to unravel and address complex societal issues through the analysis of large-scale datasets. My work is deeply rooted in network science, machine learning, and data science, and I am actively engaged in projects spanning the domains of Science of Science, Science of Collaboration, and Inequality. These endeavors are collaborative efforts with esteemed colleagues, including Professor Kinga R. Makovi, Professor Talal Rahwan, and Professor Wifag Adnan.

Past: I did my Master's in Language Technologies at the Language Technologies Institute (LTI) at Carnegie Mellon University in Pittsburgh, where I was mentored by Professor Kathleen M. Carley and Professor David R. Mortensen on my master's thesis, "Characterizing Misinformed Online Health Communities." I also worked as a research scholar at LTI for two years under the supervision of Professor Rita Singh and Professor Bhiksha Raj, on projects in speech processing and voice forensics.
I did my Bachelor's in Computer Science at Carnegie Mellon University in Qatar, where I was mentored by Professor Ingmar Weber and Professor Saquib Razak for my undergraduate thesis, "Lifestyle Disease Surveillance Using Population Search Behaviour."

Future: In my career, I have been fortunate enough to find amazing and caring advisors and collaborators who have collectively shaped my interests, and given me opportunities to tackle interesting questions. I try to optimize for long-term collaborations and mentorship. My academic tree can be found below. In terms of work, I often think about the implications of my research, and I hope to eventually be able to do something that makes some difference in the world -- the world where impact is not just measured by citations.

Home for me is where my mom (ami) is, and that's Pakistan's Hyderabad in Sindh, uncommonly known as "The Lion City", named in honour of Ali (aka Hayder), the fourth caliph.

I love collaborating with people. So don't hesitate to reach out if you would like to collaborate with me, or just talk about life.

Ongoing Research

China and the U.S. produce more impactful AI research when collaborating together

Bedoor AlShebli*, Shahan Ali Memon, James A. Evans, Talal Rahwan*
Accepted for presentation at ICSSI (2023)
Accepted as a poster at IC2S2 (2023)
Under Review 
Slides (ICSSI)
Preprint Poster (IC2S2)

Fierce geopolitical tensions between China and the U.S. have led to policies that discourage cross-border collaboration and migration in the field of Artificial Intelligence. Despite this, we analyze a dataset of 363,000 AI scientists and 5,400,000 papers showing that China and the U.S. have been leading the field since 2000 in terms of impact, novelty, productivity, and workforce. Significant bidirectional migration is observed, with both countries being primary destinations for one another. Collaborations between the two countries, while increasing, still represent a small fraction of their total productivity. Yet, we show that the two countries produce more impactful research when collaborating together, suggesting that promoting cross-border collaboration and migration could benefit the field of AI.

Characterizing the effect of retractions on careers of scientists

Shahan Ali Memon, Kinga Makovi*, Bedoor AlShebli*
Accepted for presentation at ICSSI (2023) **Best Paper Award**
Accepted for presentation at Frontiers of Network Science Workshop (2022 & 2023)
Accepted for presentation at IC2S2 (2022)
Under Review 
Recording (IC2S2) Slides (IC2S2) Slides (ICSSI)

Retracting academic papers is a fundamental tool for social control in the academy, and in the vast majority of cases happens only under the most extreme circumstances: when the science behind papers, or the integrity of authors, comes into question. While retractions do not completely erase papers from the academic record, they can have important implications for retracted scientists and their careers. In this project, we aim to uncover whether retracted authors (RQ1) retain fewer collaborators, (RQ2) gain fewer new collaborators, (RQ3) close fewer triads, and (RQ4) get penalized for public retractions, compared with matched non-retracted scientists.


Perceptions of FIFA men’s world cup 2022 host nation Qatar in the Twittersphere

Susan Dun, Hatim Rachdi, Shahan Ali Memon, Yelena Mejova, Ingmar Weber
International Journal of Sport Communication (2022; IF:1.59; Q-Index:Q2) 
Accepted for presentation at the NCA 105th Annual Convention (2019)

We assessed the discussion around FIFA World Cup 2022 in the Twittersphere to shed some light on whether Qatar’s nation-branding and soft power attempts are reflected in public perceptions.

COVID-19 vaccine perceptions in the initial phases of US vaccine roll-out: an observational study on Reddit

Navin Kumar, Isabel Corpus, Meher Hans, Nikhil Harle, Nan Yang, Curtis McDonald, Shinpei Nakamura Sakai, Kamila A Janmohamed, Weiming Tang, Jason L Schwartz, S Mo Jones-Jang, Koustuv Saha, Shahan Ali Memon, Chris Bauch, Munmun De Choudhury, Orestis Papakyriakopoulos, Joseph D Tucker, Abhay Goyal, Aman Tyagi, Kaveh Khoshnood, Saad Omer
BMC Public Health (2022; IF:3.98) 

The purpose of this analysis was to detail the behavior of top Reddit users, posts’ relationship with events early in the vaccine timeline, and the relationship between subreddits that shared COVID-19 vaccine posts. Research questions are as follows: What is the behavior of top Reddit users in regards to COVID-19 vaccines (RQ1)? What are Reddit posts’ relationship with events early in the vaccine timeline (RQ2)? What is the relationship between subreddits that shared COVID-19 vaccine posts (RQ3)?

Hierarchical routing mixture of experts

Wenbo Zhao, Yang Gao, Shahan Ali Memon, Bhiksha Raj, Rita Singh
25th International Conference on Pattern Recognition (ICPR 2020) 
Paper Slides

In regression tasks, the data distribution is often too complex to be fitted by a single model. In contrast, partition-based models are developed where data is divided and fitted by local models. These models partition the input space and do not leverage the input-output dependency of multimodal-distributed data, and strong local models are needed to make good predictions. Addressing these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts.

Characterizing COVID-19 misinformation communities using a novel Twitter dataset

Shahan Ali Memon, Kathleen M. Carley
International Workshop on Mining Actionable Insights from Social Networks (MAISoN) (in conj. with CIKM 2020) 
Funded by Center for Machine Learning and Health (CMLH)
Paper Slides Data Codebook Recording

From conspiracy theories to fake cures and fake treatments, COVID-19 has become a hot-bed for the spread of misinformation online. It is more important than ever to identify methods to debunk and correct false information online. In this paper, we present a methodology and analyses to characterize the two competing COVID-19 misinformation communities online: (i) misinformed users or users who are actively posting misinformation, and (ii) informed users or users who are actively spreading true information, or calling out misinformation. The goals of this study are two-fold: (i) collecting a diverse, annotated COVID-19 Twitter dataset that can be used by the research community to conduct meaningful analysis; and (ii) characterizing the two target communities in terms of their network structure, linguistic patterns, and their membership in other communities.

Characterizing sociolinguistic variation in the competing vaccination communities

Shahan Ali Memon, Aman Tyagi, David R. Mortensen, Kathleen M. Carley
International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2020) 
Funded by Center for Machine Learning and Health (CMLH)
Paper Slides

Public health practitioners and policy makers grapple with the challenge of devising effective message-based interventions for debunking public health misinformation in cyber communities. Framing and personalization of the message is one of the key features for devising a persuasive messaging strategy. For effective health communication, it is imperative to focus on preference-based framing where the preferences of the target sub-community are taken into consideration. To achieve that, it is important to understand and hence characterize the target sub-communities in terms of their social interactions. In the context of health-related misinformation, vaccination remains the most prevalent topic of discord. Hence, in this paper, we conduct a sociolinguistic analysis of the two competing vaccination communities on Twitter: pro-vaxxers or individuals who believe in the effectiveness of vaccinations, and anti-vaxxers or individuals who are opposed to vaccinations.

The phonetic bases of vocal expressed emotion: natural versus acted

Hira Dhamyal, Shahan Ali Memon, Bhiksha Raj, Rita Singh
Annual Conference of the International Speech Communication Association (INTERSPEECH 2020) 

Can vocal emotions be emulated? This question has been a recurrent concern of the speech community, and has also been vigorously investigated. It has been fueled further by its link to the issue of validity of acted emotion databases. Much of the speech and vocal emotion research has relied on acted emotion databases as valid proxies for studying natural emotions. To create models that generalize to natural settings, it is crucial to work with valid prototypes -- ones that can be assumed to reliably represent natural emotions. More concretely, it is important to study emulated emotions against natural emotions in terms of their physiological, and psychological concomitants. In this paper, we present an on-scale systematic study of the differences between natural and acted vocal emotions.

Lifestyle disease surveillance using population search behavior: feasibility study

Shahan Ali Memon, Saquib Razak, Ingmar Weber
Journal of Medical Internet Research (2020; IF:7.08) 
Accepted for presentation at the Population Association of America (PAA 2021)
Accepted for presentation at CMU Qatar Meeting of the Minds (MoM 2017)
Paper Slides (PAA) Poster (MoM) Code Recording (PAA)
Slides for Google Trends Denormalization

As the process of producing official health statistics for lifestyle diseases is slow, researchers have explored using Web search data as a proxy for lifestyle disease surveillance. Existing studies, however, are prone to at least one of the following issues: ad-hoc keyword selection, overfitting, insufficient predictive evaluation, lack of generalization, and failure to compare against trivial baselines. The aims of this study were to (1) employ a corrective approach improving previous methods; (2) study the key limitations in using Google Trends for lifestyle disease surveillance; and (3) test the generalizability of our methodology to other countries beyond the United States.

Neural regression trees

Shahan Ali Memon*, Wenbo Zhao*, Bhiksha Raj, Rita Singh
International Joint Conference on Neural Networks (IJCNN 2019) 
Paper Slides

Regression-via-Classification (RvC) is the process of converting a regression problem to a classification one. Current approaches for RvC use ad-hoc discretization strategies and are suboptimal. We propose a neural regression tree model for RvC. In this model, we employ a joint optimization framework where we learn optimal discretization thresholds while simultaneously optimizing the features for each node in the tree.

Detecting gender differences in perception of emotion in crowdsourced data

Shahan Ali Memon, Hira Dhamyal, Oren Wright, Daniel Justice, Vijaykumar Palat, William Boler, Bhiksha Raj, Rita Singh
Paper Slides Recording

Do men and women perceive emotions differently? Popular convictions place women as more emotionally perceptive than men. Empirical findings, however, remain inconclusive. Most prior studies focus on visual modalities. In addition, almost all of the studies are limited to experiments within controlled environments. Generalizability and scalability of these studies have not been sufficiently established. In this paper, we study the differences in perception of emotion between genders from speech data in the wild, annotated through crowdsourcing. While we limit ourselves to a single modality (i.e. speech), our framework is applicable to studies of emotion perception from all such loosely annotated data in general. Our paper addresses multiple serious challenges related to making statistically viable conclusions from crowdsourced data. Overall, the contributions of this paper are twofold: a reliable novel framework for perceptual studies from crowdsourced data; and the demonstration of statistically significant differences in speech-based emotion perception between genders.

Public perception of a country: exploring tweets about Qatar

Shahan Ali Memon, Rohith Krishnan Pillai, Susan Dun, Yelena Mejova, Ingmar Weber
International ACM Conference on Web Science (WebSci 2017) 
Accepted for presentation at 2016 Qatar Foundation Annual Research Conference (QFARC)
Accepted for presentation at CMU Qatar Meeting of the Minds (MoM 2017)
Paper Poster Abstract

Is it possible to "hack" an image of an international entity by driving international and domestic media? Here, we present an image/brand monitoring tool for a country, Qatar, which presents an overview of the contexts and references to media in which it is mentioned on social media. Tracking dozens of languages, this tool allows a global understanding of the perceptions and concerns Twitter users associate with Qatar, and which mainstream media may be driving these sentiments.

Latest News

  • [Jul 18-20 2023] Attended and presented our work on "U.S. and China produce more impactful AI research when collaborating together" as a poster at IC2S2 2023 in Copenhagen.
  • [Jun 28 2023] Our paper on "Characterizing the effect of retractions on scientific careers" won "Best Student Paper Award" at ICSSI 2023.
  • [Jun 26-28 2023] Attended and presented our work on "Characterizing the effect of retractions on scientific careers" at the Networks Workshop at ICSSI 2023 at Northwestern University in Evanston.
  • [May 18-19 2023] Attended and presented our work on "Exploring the impact of retractions on academic reputation" at the Networks Workshop at NYU main campus.
  • [Jul 19-22 2022] Attended and presented our work on scientific retractions at IC2S2 2022 in Chicago
  • [May 18 2022] Attended and presented our work on scientific retractions at the Networks Workshop at NYU Abu Dhabi
  • [Apr 25 2022] Our submission on "Characterizing the effect of scientific retractions on collaboration networks" got accepted at IC2S2 2022.
  • [May 6 2021] Attended and presented at PAA 2021 (Remote)
  • [Feb 2 2021] Our JMIR paper on lifestyle disease surveillance got accepted for presentation at the Population Association of America (PAA) 2021.
  • [Oct 1 2020] Joined NYU Abu Dhabi as a Research Associate
  • [Aug 6 2020] Defended my Master's Thesis on Characterizing Misinformed Online Health Communities.
  • [May 17 2020] Graduated from CMU LTI with MSc. in Language Technologies
  • [Oct 21 2019] Presented our work on Speech Emotion Recognition from Voice in the Wild at SEI Research Review 2019
  • [Aug 26 2019] Started master's in language technologies at CMU LTI
  • [Apr 9 2019] Won the Center for Machine Learning and Health (CMLH) Fellowship in Digital Health
  • [Jul 29 2017] Joined CMU LTI as a Research Scholar
  • [May 1 2017] Graduated from CMUQ with BSc. in Computer Science
  • [Aug 15 2013] Arrived @CMUQ to study Computer Science


  • 2021 Spring, Course Design Assistant, Applied Data Science for Social Scientists (in Python) with Professor Bedoor AlShebli
  • 2021 Spring, Teaching and Course Design Assistant, Computational Forensics & AI with Professor Rita Singh
  • 2020 Spring, Teaching and Course Design Assistant, Computational Forensics & AI with Professor Rita Singh
  • 2015 Spring, Teaching Assistant, Interpretation & Argument with Professor Silvia Pessoa
  • 2014 Spring, Programming Peer Tutor, Academic Resource Center (ARC), CMU Qatar
  • 2014 Fall, English Language Instructor, Language Bridges Program, CMU Qatar
  • 2009 Summer, Instructor, Taleem-e-Balighan (lit: Education for Adults) program, Ladies Club School Hyderabad


  • 2023 Best Paper Award at ICSSI
  • 2023 ICSSI Travel Grant
  • 2020 SBP-BRiMS Graduate Student Scholarship
  • 2019 Center for Machine Learning and Health Fellowship Winner
  • 2018 Finalist for Best Overall at HackPrinceton
  • 2018 Finalist for Best Design at HackPrinceton
  • 2017 College Honors for Undergraduate Research Thesis
  • 2017 Outstanding Service to the Computer Science Community
  • 2017 Audience Choice Award at NYUAD Hackathon for Social Good
  • 2017 Senior Student Leadership Awards
  • 2015 IMPAQT Cultural Ambassador
  • 2013 Dean's List at National University of Computer & Emerging Sciences
  • 2012 Second Position at Speed Programming Competition
  • 2012 Dean's List at National University of Computer & Emerging Sciences
  • 2009 Sixth Position among 30k+ students in District Hyderabad in 10th Grade


Journal Reviewer: eLife, 2023
Journal Reviewer: SAGE Communication & Sport, 2023
Journal Reviewer: Elsevier's Information Processing and Management Journal, 2021
Journal Reviewer: Journal of Medical Internet Research, 2020
Conference Reviewer: IEEE International Conference on Machine Learning and Applications, 2020
Moderator: SBP-BRiMS, 2020
Peer Health Advocate: Mental Health Advocate, Student Health Services, Carnegie Mellon University, 2016-2017
Co-founder/Co-designer: CMU Qatar Mindfulness Room, 2016-2017
Board Member: Academic Review Board and University Disciplinary Committee (ARB-UDC), Carnegie Mellon University, 2015-2017