Present: I'm a research associate in the Computational Social Science Lab at New York University Abu Dhabi, advised by Prof. Bedoor AlShebli. I like to study people, their communication, and their networks to understand, uncover, and solve complex and inconspicuous societal problems using large-scale datasets. I use concepts and tools from network science, machine learning, and data science in my research. I'm currently working on projects in Science of Science, Science of Collaboration, and Inequality, in collaboration with Prof. Kinga R. Makovi, Prof. Talal Rahwan, and Prof. Wifag Adnan.
Past: I did my Master's in Language Technologies at the Language Technologies Institute (LTI) at Carnegie Mellon University in Pittsburgh, where I was mentored by
Prof. Kathleen M. Carley and Prof. David R. Mortensen on my master's thesis,
"Characterizing Misinformed Online Health Communities." I also worked as a
research scholar at LTI for two years under the supervision of Prof. Rita Singh and
Prof. Bhiksha Raj, on projects in speech processing and
voice forensics.
I did my Bachelor's in Computer Science at Carnegie Mellon University in Qatar, where I was mentored by
Dr. Ingmar Weber and Prof. Saquib Razak for
my undergraduate thesis on "Lifestyle Disease Surveillance Using Population Search Behaviour."
Future: In my career, I have been fortunate to find amazing and caring advisors and collaborators who have collectively shaped my interests and given me opportunities to tackle interesting questions. I try to optimize for long-term collaborations and mentorship. My academic tree can be found below. In terms of work, I often think about the implications of my research, and I hope to eventually do something that makes a difference in the world -- a world where impact is not measured by citations alone.
Home for me is where my mom (ami) is. For now, that's Hyderabad, Pakistan, less commonly known as "The Lion City" and named in honour of Ali (also known as Hayder), the fourth caliph.
I love collaborating with people. So don't hesitate to reach out if you would like to collaborate with me, or just talk about life.
Bedoor AlShebli*,
Shahan Ali Memon,
James A. Evans,
Talal Rahwan*
Accepted for presentation at ICSSI (2023)
Accepted as a poster at IC2S2 (2023)
Under Review
Preprint
Fierce geopolitical tensions between China and the U.S. have led to policies that discourage cross-border collaboration and migration in the field of Artificial Intelligence. Despite this, our analysis of a dataset of 363,000 AI scientists and 5,400,000 papers shows that China and the U.S. have been leading the field since 2000 in terms of impact, novelty, productivity, and workforce. Significant bidirectional migration is observed, with both countries being primary destinations for one another. Collaborations between the two countries, while increasing, still represent a small fraction of their total productivity. Yet, we show that the two countries produce more impactful research when collaborating, suggesting that promoting cross-border collaboration and migration could benefit the field of AI.
Shahan Ali Memon,
Kinga Makovi*,
Bedoor AlShebli*
Accepted for presentation at ICSSI (2023)
Accepted for presentation at Frontiers of Network Science Workshop (2022 & 2023)
Accepted for presentation at IC2S2 (2022)
In Preparation
Recording (IC2S2)
Slides (IC2S2)
Retracting academic papers is a fundamental tool for social control in the academy, and in the vast majority of cases happens only under the most extreme circumstances: when the science behind papers, or the integrity of authors, comes into question. While retractions do not completely erase papers from the academic record, they can have important implications for retracted scientists and their careers. In this project, we aim to uncover whether retracted authors (RQ1) retain fewer collaborators, (RQ2) gain fewer new collaborators, (RQ3) close fewer triads, and (RQ4) get penalized for public retractions, compared with matched non-retracted scientists.
Susan Dun,
Hatim Rachdi,
Shahan Ali Memon,
Yelena Mejova,
Ingmar Weber
International Journal of Sport Communication (2022; IF:1.59; Q-Index:Q2)
Accepted for presentation at the NCA 105th Annual Convention (2019)
Paper
We assessed the discussion around FIFA World Cup 2022 in the Twittersphere to shed some light on whether Qatar’s nation-branding and soft power attempts are reflected in public perceptions.
Navin Kumar, Isabel Corpus, Meher Hans, Nikhil Harle, Nan Yang,
Curtis McDonald, Shinpei Nakamura Sakai, Kamila A Janmohamed, Weiming Tang,
Jason L Schwartz, S Mo Jones-Jang, Koustuv Saha,
Shahan Ali Memon,
Chris Bauch,
Munmun De Choudhury,
Orestis Papakyriakopoulos,
Joseph D Tucker, Abhay Goyal,
Aman Tyagi,
Kaveh Khoshnood,
Saad Omer
BMC Public Health (2022; IF:3.98)
Paper
The purpose of this analysis was to detail the behavior of top Reddit users, the relationship between posts and events early in the vaccine timeline, and the relationship between subreddits that shared COVID-19 vaccine posts. The research questions are as follows: What is the behavior of top Reddit users with regard to COVID-19 vaccines (RQ1)? What is the relationship between Reddit posts and events early in the vaccine timeline (RQ2)? What is the relationship between subreddits that shared COVID-19 vaccine posts (RQ3)?
Wenbo Zhao,
Yang Gao,
Shahan Ali Memon,
Bhiksha Raj,
Rita Singh
25th International Conference on Pattern Recognition (ICPR 2020)
Paper
Slides
In regression tasks, the data distribution is often too complex to be fitted by a single model. In contrast, partition-based models are developed where data is divided and fitted by local models. These models partition the input space and do not leverage the input-output dependency of multimodal-distributed data, and strong local models are needed to make good predictions. Addressing these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts.
Shahan Ali Memon,
Kathleen M. Carley
International Workshop on Mining Actionable Insights from Social Networks (MAISoN) (in conj. with CIKM 2020)
Funded by Center for Machine Learning and Health (CMLH)
Paper
Slides
Data
Codebook
Recording
From conspiracy theories to fake cures and fake treatments, COVID-19 has become a hotbed for the spread of misinformation online. It is more important than ever to identify methods to debunk and correct false information online. In this paper, we present a methodology and analyses to characterize the two competing COVID-19 misinformation communities online: (i) misinformed users, or users who are actively posting misinformation, and (ii) informed users, or users who are actively spreading true information or calling out misinformation. The goals of this study are twofold: (i) collecting a diverse, annotated COVID-19 Twitter dataset that can be used by the research community to conduct meaningful analysis; and (ii) characterizing the two target communities in terms of their network structure, linguistic patterns, and their membership in other communities.
Shahan Ali Memon,
Aman Tyagi,
David R. Mortensen,
Kathleen M. Carley
International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2020)
Funded by Center for Machine Learning and Health (CMLH)
Paper
Slides
Public health practitioners and policy makers grapple with the challenge of devising effective message-based interventions for debunking public health misinformation in cyber communities. Framing and personalization of the message are key features of a persuasive messaging strategy. For effective health communication, it is imperative to focus on preference-based framing, where the preferences of the target sub-community are taken into consideration. To achieve that, it is important to understand and hence characterize the target sub-communities in terms of their social interactions. In the context of health-related misinformation, vaccination remains the most prevalent topic of discord. Hence, in this paper, we conduct a sociolinguistic analysis of the two competing vaccination communities on Twitter: pro-vaxxers, or individuals who believe in the effectiveness of vaccinations, and anti-vaxxers, or individuals who are opposed to vaccinations.
Hira Dhamyal,
Shahan Ali Memon,
Bhiksha Raj,
Rita Singh
Annual Conference of the International Speech Communication Association (INTERSPEECH 2020)
Paper
Can vocal emotions be emulated? This question has been a recurrent concern of the speech community, and has also been vigorously investigated. It has been fueled further by its link to the issue of validity of acted emotion databases. Much of the speech and vocal emotion research has relied on acted emotion databases as valid proxies for studying natural emotions. To create models that generalize to natural settings, it is crucial to work with valid prototypes -- ones that can be assumed to reliably represent natural emotions. More concretely, it is important to study emulated emotions against natural emotions in terms of their physiological and psychological concomitants. In this paper, we present an on-scale systematic study of the differences between natural and acted vocal emotions.
Shahan Ali Memon,
Saquib Razak,
Ingmar Weber
Journal of Medical Internet Research (2020; IF:7.08)
Accepted for presentation at the Population Association of America (PAA 2021)
Accepted for presentation at CMU Qatar Meeting of the Minds (MoM 2017)
Paper
Slides (PAA)
Poster (MoM)
Code
Recording (PAA)
Slides for Google Trends Denormalization
As the process of producing official health statistics for lifestyle diseases is slow, researchers have explored using Web search data as a proxy for lifestyle disease surveillance. Existing studies, however, are prone to at least one of the following issues: ad-hoc keyword selection, overfitting, insufficient predictive evaluation, lack of generalization, and failure to compare against trivial baselines. The aims of this study were to (1) employ a corrective approach improving previous methods; (2) study the key limitations in using Google Trends for lifestyle disease surveillance; and (3) test the generalizability of our methodology to other countries beyond the United States.
Shahan Ali Memon*,
Wenbo Zhao*,
Bhiksha Raj,
Rita Singh
International Joint Conference on Neural Networks (IJCNN 2019)
Paper
Slides
Regression-via-Classification (RvC) is the process of converting a regression problem to a classification one. Current approaches for RvC use ad-hoc discretization strategies and are suboptimal. We propose a neural regression tree model for RvC. In this model, we employ a joint optimization framework where we learn optimal discretization thresholds while simultaneously optimizing the features for each node in the tree.
Shahan Ali Memon,
Hira Dhamyal,
Oren Wright,
Daniel Justice,
Vijaykumar Palat,
William Boler,
Bhiksha Raj,
Rita Singh
arXiv
Paper
Slides
Recording
Do men and women perceive emotions differently? Popular convictions place women as more emotionally perceptive than men. Empirical findings, however, remain inconclusive. Most prior studies focus on visual modalities. In addition, almost all of the studies are limited to experiments within controlled environments. The generalizability and scalability of these studies have not been sufficiently established. In this paper, we study the differences in perception of emotion between genders from speech data in the wild, annotated through crowdsourcing. While we limit ourselves to a single modality (i.e., speech), our framework is applicable to studies of emotion perception from all such loosely annotated data in general. Our paper addresses multiple serious challenges related to drawing statistically viable conclusions from crowdsourced data. Overall, the contributions of this paper are twofold: a reliable, novel framework for perceptual studies from crowdsourced data; and the demonstration of statistically significant differences in speech-based emotion perception between genders.
Shahan Ali Memon,
Rohith Krishnan Pillai,
Susan Dun,
Yelena Mejova,
Ingmar Weber
International ACM Conference on Web Science (WebSci 2017)
Accepted for presentation at 2016 Qatar Foundation Annual Research Conference (QFARC)
Accepted for presentation at CMU Qatar Meeting of the Minds (MoM 2017)
Paper
Poster
Abstract
Is it possible to "hack" an image of an international entity by driving international and domestic media? Here, we present an image/brand monitoring tool for a country, Qatar, which presents an overview of the contexts and references to media in which it is mentioned on social media. Tracking dozens of languages, this tool allows a global understanding of the perceptions and concerns Twitter users associate with Qatar, and which mainstream media may be driving these sentiments.
Journal Reviewer: Elsevier's Information Processing and Management Journal, 2021
Journal Reviewer: Journal of Medical Internet Research, 2020
Conference Reviewer: IEEE International Conference on Machine Learning and Applications, 2020
Moderator: SBP-BRiMS, 2020
Peer Health Advocate: Mental Health Advocate, Student Health Services, Carnegie Mellon University, 2016-2017
Co-founder/Co-designer: CMU Qatar Mindfulness Room, 2016-2017
Board Member: Academic Review Board and University Disciplinary Committee (ARB-UDC), Carnegie Mellon University, 2015-2017