1. Questions
This bibliometric analysis addresses two questions: 1) Is the use of Twitter data declining in urban research? 2) What might be lost for urban research if Twitter data remains less accessible?
The focus on Twitter (now ‘X’, but referred to here as Twitter) within urban research follows recent changes to data access. In 2023, Twitter introduced monetized access tiers and a limited ‘free’ tier. The number of Tweets that can be collected for free fell from 500,000 to 1,500 per month (Mehta 2023). Of the monetized tiers, ‘Basic’ costs $100 per month for 10,000 Tweets and ‘Pro’ costs $5,000 per month for 1m Tweets; above these, costs are negotiable for projects and organizations (X 2024). Urban research that has previously used Twitter data would generally fall into the ‘Pro’ tier or above, as it is typical to use more than 1m Tweets, e.g. 11m Tweets (Claramunt et al. 2011) or 275m Tweets (Murty, Gross, and Pensavalle 2016).
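To give a sense of scale, the tier limits quoted above can be turned into a rough collection estimate. The sketch below is illustrative only: it assumes that collection is limited solely by the monthly cap, that every started month is billed in full, and that no other restrictions apply. The function name `months_and_cost` is hypothetical.

```python
# Monthly caps and prices as quoted in the text (Mehta 2023; X 2024).
TIERS = {
    "Free": {"tweets_per_month": 1_500, "usd_per_month": 0},
    "Basic": {"tweets_per_month": 10_000, "usd_per_month": 100},
    "Pro": {"tweets_per_month": 1_000_000, "usd_per_month": 5_000},
}


def months_and_cost(n_tweets, tier):
    """Return (months needed, total cost in USD) to collect n_tweets.

    Assumption: the monthly cap is the only constraint and each
    started month is paid for in full.
    """
    cap = TIERS[tier]["tweets_per_month"]
    months = -(-n_tweets // cap)  # ceiling division on integers
    return months, months * TIERS[tier]["usd_per_month"]


# Dataset sizes cited in the text: 11m and 275m Tweets.
for size in (11_000_000, 275_000_000):
    for tier in TIERS:
        months, cost = months_and_cost(size, tier)
        print(f"{size:>11,} Tweets on {tier:<5}: {months:>7,} months, ${cost:,}")
```

Under these assumptions, an 11m-Tweet dataset would take 11 months and $55,000 on the ‘Pro’ tier, while a 275m-Tweet dataset would take 275 months and $1,375,000, which illustrates why datasets of this size are likely to require negotiated access.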
As a result of these changes, Twitter’s attractiveness and accessibility as a data source for urban research have potentially been undermined (Davidson et al. 2023). To understand the implications of these changes, it is useful to investigate trends in the use of social media data within urban research.
2. Methods
Papers reporting urban research that use social media (SM) data were extracted from Web of Science and Scopus, covering publications up to 1st January 2025. To search for and analyze the papers, a bibliometric analysis procedure was adopted based on the methodology of Donthu et al. (2021). First, search terms were used to identify journal articles and conference papers, employing terms related to ‘urban’, ‘social media’ and data analysis or data types (e.g. ‘mining’, ‘geotag’). Searches were made within the title, abstract and keywords, and filtered using urban-related topics, retrieving 2,812 papers. A qualitative review was then performed on all abstracts to identify SM data sources, during which irrelevant papers were removed, leaving a total of 1,409 papers. A further search, with the search terms reduced to ‘urban’ keywords and filtered for urban-related topics, identified 122,398 urban research papers.
Topic modeling was performed on the 1,409 urban SM abstracts using the BERTopic model, which builds on BERT (Bidirectional Encoder Representations from Transformers) (Grootendorst 2022). BERTopic converts all documents to embedding representations using a pre-trained language model (Grootendorst 2022). The embeddings are then clustered, and topic representations are extracted using a class-based term frequency-inverse document frequency (c-TF-IDF) measure (Grootendorst 2022). Each abstract was assigned a topic, whilst outliers (abstracts not assigned a topic) were redistributed using BERTopic’s outlier reduction function. Similar topics were merged, and topic names were derived from keywords and representative documents.
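The topic-representation step can be illustrated with a small class-based TF-IDF calculation. The sketch below is not the BERTopic implementation and was not used in this analysis. It assumes the weighting described by Grootendorst (2022), W(t, c) = tf(t, c) · log(1 + A / f(t)), where tf(t, c) is the frequency of term t in cluster c, f(t) is its frequency across all clusters, and A is the average number of words per cluster. Tokenization by whitespace, the function names and the toy clusters are all hypothetical simplifications.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """Weight terms per cluster with a class-based TF-IDF.

    clusters: dict mapping a topic label to a list of documents (str).
    Returns {label: {term: weight}}.
    """
    # Treat each cluster as one concatenated document (naive tokenization).
    tf = {
        label: Counter(" ".join(docs).lower().split())
        for label, docs in clusters.items()
    }
    # Frequency of each term across all clusters.
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # A: average number of words per cluster.
    avg_words = sum(total.values()) / len(tf)
    return {
        label: {t: f * math.log(1 + avg_words / total[t]) for t, f in counts.items()}
        for label, counts in tf.items()
    }


def top_terms(weights, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]


# Toy clusters, made up for illustration (not the study's abstracts).
clusters = {
    "mobility": [
        "urban mobility flows from geotagged check-ins",
        "commuting flows and mobility patterns",
    ],
    "perception": [
        "urban park perception from photos",
        "sentiment and perception of urban parks",
    ],
}
weights = c_tf_idf(clusters)
for label in clusters:
    print(label, top_terms(weights[label]))
```

In this toy example, a term shared by both clusters, such as ‘urban’, receives a lower weight per occurrence than a term confined to one cluster, which is the property that makes the measure useful for deriving distinctive topic keywords.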
3. Findings
Across the 1,409 urban SM research papers, 76 different SM platforms were identified as data sources. Twitter was the most frequently used, accounting for 35.7% of all papers, followed by Weibo (15.5%) (Table 1). In comparison, an earlier review of SM data in geospatial research by Stock (2018) reported a larger share for Twitter (54.2%) and limited use of Weibo (2.9%). The popularity of Chinese social media platforms in urban research (Weibo, WeChat, Dianping, Tencent QQ) was also not identified in that review (Stock 2018).
Turning to temporal trends in publications, Figure 1 shows steady growth in the use of SM data in urban research from 2011. Growth peaks between 2018 and 2021, when the publication rate exceeds that of urban research overall, highlighting widespread use of SM data. The publication rate then starts to decline from 2021. Despite this, it remains above earlier levels, suggesting that SM remains a relevant data source, although its growth is shifting.
Urban research using Twitter data shows a similar trend but with some key differences. Its growth starts earlier than that of SM data overall and rises more sharply between 2013 and 2018, indicating its popularity over other SM data sources (Figure 1). The publication rate also declines earlier and shows a more appreciable drop from 2022, suggesting that the use of Twitter data is declining more sharply.
When Twitter is compared with other SM data sources in urban research, Figure 2 shows in more detail the declining share of Twitter from 2022. Most other SM platforms fluctuate in publications across the period, with some (Facebook, Foursquare and Flickr) having shown earlier drops in usage. Only Weibo and ‘Other’ platforms are substantially increasing their share.
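A yearly platform share of this kind can be computed from a list of paper records. The following is a minimal sketch, assuming one record per paper and platform and that the share is taken over all records in a given year; the function name `yearly_share` is hypothetical and the records are made up for illustration rather than drawn from the study’s data.

```python
from collections import Counter, defaultdict


def yearly_share(records):
    """Compute each platform's share of records per year.

    records: iterable of (year, platform) pairs.
    Returns {year: {platform: share in percent}}, ordered by year.
    """
    by_year = defaultdict(Counter)
    for year, platform in records:
        by_year[year][platform] += 1
    return {
        year: {p: 100 * n / sum(counts.values()) for p, n in counts.items()}
        for year, counts in sorted(by_year.items())
    }


# Hypothetical records, not the study's data.
records = [
    (2021, "Twitter"), (2021, "Twitter"), (2021, "Weibo"), (2021, "Flickr"),
    (2023, "Twitter"), (2023, "Weibo"), (2023, "Weibo"), (2023, "Other"),
]
shares = yearly_share(records)
for year, platforms in shares.items():
    print(year, platforms)
```

Expressing each platform as a share of the yearly total, rather than as a raw count, separates a platform-specific decline from an overall fall in SM publications.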
To explore what might be lost for urban research from declining use of SM data, topics were identified in the urban SM research papers (Table 2). From the topics, broad application areas can be defined that align with previously reported urban SM research areas (Niu and Silva 2020). Mobility/Activity flows (30.3% of papers) are the most common focus, typically leveraging spatio-temporal data to analyze urban movement and activity. Urban perception is also frequently studied (26% of papers), often using text and image data to capture experiences and emotions about urban spaces.
SM platforms contribute differently to urban topics, as shown in Figure 3. Twitter is frequently used across most topics. Previous findings have shown the overwhelming use of geotags in geospatial research, which are commonly collected from Twitter (Stock 2018). Foursquare, Flickr and Instagram are often applied to urban perception topics (3, 7) because they provide geo-located image data. Chinese SM platforms (Weibo, WeChat and Tencent QQ) are present across many topics, but contribute frequently to only a few (5, 9, 12). The variation in SM data source by urban topic suggests that changes in data accessibility could reshape not only the volume of research but also the types of urban phenomena that can be studied.
This analysis highlights a broad decline in the use of SM data in urban research, with a notable drop in the use of Twitter data. The reasons for this shift remain unclear, though reduced accessibility and declining interest in the platform may be contributing factors. This trend could continue, with active U.S. users of X having decreased by 8.4% since October 2024, while its competitor Bluesky has grown its user base by 1,064% (Boyd 2024). It remains to be seen whether the decline in Twitter data access will prompt researchers to look elsewhere for urban analytics, and whether new forms of data can fill the gap that is left.