
Unintended consequences: data practice in the backstage of social media


Through an ethnographic study of Chinese IT professionals who integrate a form of data culture into the digital platforms they design, maintain, and operate daily within one of China’s tech giants, this paper reveals numerous overlaps and interrelations between the data practices of Chinese IT professionals and the broader social implications that arise from them. The aim is to foster a more productive dialogue between the social studies of quantification and platform studies. This original research proposes the backstage as a potent methodology for inquiring into the role of Chinese IT professionals and domestic tech giants in advancing measuring systems and audit culture. This paper concludes by suggesting that such an approach can also be applied to wider studies of the paradox in quantification between its general claims and specific effects.


The precarious working experience of gig workers in platform-based labor markets has frequently hit the headlines in China, as has also been observed elsewhere, reflecting domestic public anxiety about the social consequences of “platformization” (Mayer-Schönberger and Cukier 2013; Poell et al. 2019; Vallas and Schor 2020). Algorithmic management, which refers to the use of numbers to coordinate and control various aspects of work in platform-based labor markets, has emerged as a prominent concept in platform studies. This approach has been used to elucidate the way in which gig workers’ working experience is governed by algorithms, with research focusing on user perspectives (Rosenblat 2018; Christin 2020; Duffy 2020). An additional strand of research has expanded our understanding of algorithms not only as technical tools but also as culture (Coleman 2013; Dourish 2016; Gillespie 2016; Seaver 2017, 2018, 2022; Boellstorff 2015). These researchers examine how algorithms have become cultural artifacts, influenced by and subsequently expressing the beliefs, norms, and practices of their developers and data scientist communities, and embedded in broader social and normative values.

Following such a cultural approach, this paper investigates the internal data practices of Chinese IT professionals who design, operate, and maintain a digital platform in everyday life. By actively engaging with platform studies and cross-disciplinary social studies of quantification, this research demonstrates the value of conceptualizing data as having a social and cultural life, shifting attention to the context of the sociotechnical practice through which data are generated, presented, and acted upon in the world rather than taking the technology of data as an abstract black box. One of the major advantages of such an approach is that it reveals how the social consequences of platformization—such as the ubiquity of clickbait mechanisms (Miller 2000) and captivating algorithms (Seaver 2018) across online infrastructures and the metrification of users’ labor (Sun 2019)—are intertwined with the internal data practices of IT professionals. This paper also contributes to the social studies of quantification by conceptualizing digital platforms as nascent social entities facilitating the unrestricted penetration of quantification into various spheres of society, including the private sphere, complementing previous studies that largely focused on the public sector.

Big data and algorithms are merely the latest waves in the long history of quantification. Sociological studies of quantification, including subfields such as “governance by numbers”, have shown how data are transformed into authoritative knowledge, imbued with power, and employed as a mode of governance that shapes social relationships and subjectivity (Rose 1990; Miller 2001; Shore and Wright 2015a). Historical examinations have traced the emergence of measurement systems in eighteenth- and nineteenth-century Europe, where statistics played a pivotal role in nation-state formation and colonial governance (Desrosières 1998; Hacking 1990; Porter 1986). The early twentieth century witnessed the introduction of scientific management principles by Frederick Taylor, leading to standardized performance metrics for productivity. Performance measurement systems later gained traction in the public sector as governments aimed to improve accountability and effectiveness (Power 1994). In the twenty-first century, real-time digital analytics, such as dashboards and data management tools, revolutionized performance measurement, particularly in technology companies. Quantification has expanded beyond economic domains to encompass a shift from the nation-state to global governance (Merry 2016, 2011), an extension from public affairs to the private realm (Lupton 2015, 2016; Neff and Nafus 2016; Schüll 2018), and even a reach into the realms of nature (Verran 2010). Others have shown how quantification serves as a cognitive infrastructure (Hirschman and Berman 2014). The understanding of numbers as an objective description of reality outside of interpretation was a project of modernity (Poovey 1998).

The increasing dominance of quantification in various domains of society is fueled by a blend of naturalism and capitalist interests in maximizing profits. The naturalist view assumes that all knowledge should follow the epistemology and practices of the natural sciences; thus, all phenomena should be reducible to numbers. This perspective is grounded in the Enlightenment belief in the power of rationality, as noted by MacKenzie (2008), and is an expression of epistemological positivism. The logic of the market demands that everything be measured and evaluated in terms of its monetary value, which has led to the commodification of many areas of social life (Chong 2018). Moreover, public institutions and the private sector have increasingly relied on data as a tool for management since data promises to bring transparent information, evidence-based decision-making, fair labor evaluation and accountability, and a democratic environment (Power 1997; Shore and Wright 2015b; Strathern 2000).

However, it is crucial to address the paradoxes between the claims made on behalf of data and the evidence for the consequences of data (Douglas‐Jones et al. 2021; Hoeyer 2023). Critical data studies challenge the ability of data to accurately represent the world by examining the cultural habits of data scientists (Lowrie 2018); the expertise and relationships involved in data collection, processing, and analysis; and how these factors can compromise the apparent objectivity of data (Walford 2021). The difference between the intended and actual effects of quantification is noteworthy. For example, Klaus Hoeyer’s research revealed that intensified data collection in Denmark’s health care system did not produce the intended benefits but had the opposite effect: despite the promises of reduced workload and increased automation, many data initiatives burdened medical practitioners with increased data-related responsibilities (2023). James Scott likewise argues that quantification introduces simplification and standardization into state measurement systems, potentially leading to the failure of large-scale social projects. An example is found in the history of scientific forestry, where the radical simplification of forests into a single commodity eliminated an understanding of the consequences of cohabitation among various natural species (1998). The power of metrics is ingrained in the discourse, knowledge, and Foucauldian “regimes of truth”Footnote 1 surrounding quantification. These contradictions partly arise from the fact that people who are subject to measurement often consciously alter their behaviors to conform to, appease, or manipulate the metrics (Espeland and Stevens 2007; Sauder and Espeland 2009). The prevalence of quantification in governing social life raises significant questions about the key social actors and institutions driving its widespread adoption and the mechanisms through which this is achieved.
Furthermore, this approach prompts an exploration of why data sometimes fail to fulfill their promises, resulting in unintended consequences.

Here, I reference the backstage of a digital platform, which offers a privileged site for the conduct of such an inquiry by bringing the social studies of quantification and platform studies into more productive dialogue. Such a dialogue builds on other new work in showing how ethnography can provide key evidence and new perspectives in understanding data as having an inherently paradoxical social life, underpinning the discussion of digital platforms as key participants in “governance by numbers”. Since the 1980s, with the state’s retreat from the provision of public services, IT firms have quietly taken up digital systems as tools of bureaucratic governance and become leading players in digital politics (Knox, forthcoming). Platforms have reconstructed virtually every aspect of our social activities and mediated our relationships with the world (Dijck et al. 2018). This relationship between platforms and governance is particularly complex in China, where the platform economy signals a transition within domestic technology companies from mainly capitalizing on providing IT solutions to increasingly providing and creating digital platforms that enable the flow of information, labor, and resources. It is notable that the IT professionals examined in this paper are “not only objects of data (about whom data is produced) but that they are also subjects of data” (Ruppert 2017).

The following section outlines the methods used to undertake my analysis. The ethnographic material then examines the data practices of IT professionals that occur backstage of the platform. The second half shifts attention to how the leakage of this data culture has brought more facets of social life into quantitative relationships. The paper concludes by discussing its wider empirical and theoretical implications.


This research is based on 11 months of intensive ethnographic field research conducted within a project team of a prominent Chinese internet company. The research involved attending numerous meetings, industrial workshops, and data collection and processing activities with data experts. Unstructured interviews were conducted with various data practitioners, including data scientists, operational specialists, project managers, and developers of different seniority and position. Additionally, attention was given to nonhuman agents such as cloud documents. Projects such as “hacking growth”Footnote 2 and technologies such as event tracking are also under ethnographic scrutiny because they not only materialize but also constitute the expert mode of organizational processes and subjectivities (Riles 2000, 2006; Hull 2012; Latour 2007).

Most of the materials discussed in this paper were drawn from direct observation of the actual process of data collection, the metrification of labor, the implantation of data consciousness into design and the penetration of quantification as the predominant ideology within and beyond the backstage of social media. This approach has been inspired by two threads of academic debate. One derives from STS scholars who have underscored the significance of the backstage “construction site” of knowledge and technology, such as labs and the expert communities involved, because they reveal not only “how the political, economic, and social effects and possibilities of data are determined by the plethora of decisions and transformations involved in the design of its platform” (Ruckenstein and Schüll 2017; Suchman 2011; Gregory and Bowker 2016) but also how such possibilities are constrained by wider sociopolitical conditions. The other thread with which this paper engages is the recent interest in “the social life of methods”, which concerns the way in which methods are invented, travel and have effects in the world (Ruppert et al. 2013; Savage 2013; Nafus and Knox 2018:6–7).

The primary ethical issue raised by this research project was clearly the involvement of what might be regarded as the trade secrets of the examined company alongside the privacy and interests of the individuals involved. In response to these issues, all the evidence presented in this paper is given in more generic quantitative form rather than as actual specific numbers, and only mainstream indicators and techniques in the industry are discussed. The term "Stacker" was used to refer to multiple platforms that were involved in this fieldwork. Finally, all of the interviewees' names were anonymized, and the details were changed so that they could not be recognized.

Data practice in the backstage

This section aims to outline the internal data practices of IT professionals on the Stacker platform. Specifically, the data collection methodology prioritizes clicks over other aspects of users' experience, and the work performance evaluation system, which seeks to align individual employees' work with institutional goals, produces outcomes that are counterproductive to its intended objectives.

Event-tracking: a technology of mining data

For those promoting the rise of big data, the typical claim is that it provides a more accurate reflection of the world and an enhanced source of scientific knowledge. However, the evidence presented in this section questions these assumptions by showing how data collection is shaped by subjective judgments, technological reductionism, divergent cultural perspectives, and professional expertise.

In data collection, subjective judgment about the data has been recognized as a source of authority for challenging the accuracy of machinery-generated data. IT professionals have used the term "data sense" to refer to the subjective interpretation of data, which is now explicitly recognized as a required soft skill in many IT job descriptions. For the data scientist Kang and other IT professionals, there are many instances where the objectivity of machine-generated data is questioned by subjective judgment (O'Reilly 2016). On Monday, Kang started his typical workday by surveilling platform traffic through large monitors while sipping coffee to prepare his body for the fast-paced working tempo. The line representing page views showed an unexplainable downward trend, suggesting that homepage visitors plummeted drastically within one week. With users both joining and quitting the platform, a certain fluctuation is expected and acceptable, but this more drastic change appeared quite unnatural to Kang. Instinctively, he felt that something must have gone wrong with the computational way of calculating the website’s traffic, more specifically within the event tracking technology.

His subjective feeling about the way data were being managed is valued within the IT workplace because it is well acknowledged among the community of IT professionals that the datafication of online activities is less objective and value neutral than commonly imagined, with both mechanical and cultural factors involved. As such, there is room for subjective judgment to intervene and for IT professionals to exercise their "data sense" in interpreting the data.

The starting point has to be the reductive role that machines play in capturing people’s online experience. Event trackingFootnote 3 is the standard methodology adopted by internet companies. This method enables the transformation of users' online activities into various datasets, which can be subsequently used by IT professionals to analyze users' trends and patterns. The problem is that, in practice, such transformations privilege certain perspectives over others because this computation of users' online experience through human–machine interactions inevitably reduces the multifaceted experience of users into a single decontextualized recording of how users click elements of the webpage, thus depriving users of their social context.

Various cultural categories come to play a role in defining the boundaries of these online activities. For example, visiting a platform is a common category among monitored events, where the event is the technical unit of human‒machine interaction. However, this requires a predetermined classification and standard as to whether a login or the browsing of an unsigned-in user counts as a single event, or whether a user employing both the PC and mobile versions should be counted as one or two events. Diverging understandings of online behavior between individuals and institutions can often result in data incongruence. In this case, owing to Kang’s data sense, the programmers identified how the intern Wu had set an improper reporting time, which caused the data loss they had observed. The reporting time refers to the moment when user behavior triggers a condition, at which the computer automatically counts the event. Wu set the reporting time as the moment when the user received data from the server; in computational terms, this means that a user has opened the website and kept scrolling down. However, such a setting excludes users who might have closed the webpage before the required data were sent back from the server; they were no longer counted as page views. The bug was resolved when the front-end programmer reset the reporting time to the moment when users opened their webpages. However, this was not simply a case of mistaken coding causing an inaccurate estimation of the actual flow of web usage. The intern had not been wrong but had simply chosen a different definition of what constitutes visiting the website. He had interpreted visiting the website as a user fully reviewing the information on the webpage rather than just opening the window. Wu's programming approach might well have been considered acceptable by other technical teams, as there are differing perspectives on this same problem of definition.
Often, such incidents occur with interns simply because of their limited familiarity with the company's own data culture. As the front-end programmer explained, event tracking is a process that translates human actions into a language that computers can understand, bridging the gap between the complexity of human behavior and an arbitrary quality in the programming language. This gap creates the space where programmers' expertise becomes crucial. This is why the term “visiting the website” can actually refer to various different possibilities. The technical aspect is always intertwined with culture because there is inherent interpretation and judgment involved in systems of classification, the selection of metrics, the weighting of elements, and decisions regarding which baseline to use for comparisons (Merry 2016).
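The definitional stakes of the reporting time can be made concrete with a minimal sketch. This is a hypothetical illustration, not Stacker's actual instrumentation: the session fields and counting functions are assumptions made for the example. The same three sessions yield different page-view totals depending on whether a view is reported when the page opens or only once the server response arrives:

```python
from dataclasses import dataclass

@dataclass
class Session:
    opened_page: bool          # the user opened the webpage
    got_server_response: bool  # required data arrived before the user left

sessions = [
    Session(opened_page=True, got_server_response=True),   # stayed and read
    Session(opened_page=True, got_server_response=False),  # closed before data arrived
    Session(opened_page=True, got_server_response=True),
]

def views_reported_on_open(sessions):
    """Front-end programmer's fix: count a page view as soon as the page opens."""
    return sum(1 for s in sessions if s.opened_page)

def views_reported_on_response(sessions):
    """Wu's setting: count a view only once the server response has arrived."""
    return sum(1 for s in sessions if s.got_server_response)
```

Neither function is "wrong"; each encodes a different cultural definition of what visiting the website means, which is precisely why the resulting numbers diverge (here, 3 versus 2).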

Quantification leads to the generation of new forms of data politics. Given that the platform companies I examined predominantly employ flatter, decentralized organizational structures,Footnote 4 data-driven decision-making introduces novel dynamics into the democratization of the workplace. The ability to collect and interpret data, colloquially referred to as 'data sense' among my informants, serves as a source of cultural capital for individuals seeking career success within the organization. Diverse interpretations of the data generate significant controversy among professionals. However, data interpretation is no longer solely the prerogative of high-level positions. Instead, ordinary employees frequently harness their data expertise to contest decisions made by upper management. Many employees voluntarily write documents about data knowledge to help their colleagues because these documents are beneficial for enhancing their "personal influence" in 360-degree evaluations. Outside the organization, most metrics employed by the IT industry in China arise from the “hacking growth” methodology invented in Silicon Valley, owing to the global influence of that region. As a result, ordinary users rarely have a chance to unpack the opacity of these measurement systems or to fundamentally reshape them.

This case study contributes to academic debates about how datafication, such as event-tracking technology, involves more than simply converting the continuity of reality into numerical entities; it also determines what is considered valuable or worthy of measurement (Verran 2010). The predominance of click counts over other types of data reflects the privileging of quantifiable data over unquantifiable, affective dimensions of online activity, such as a sense of belonging or emotional resonance. By attending to the microprocesses through which data were collected, categories were defined, phenomena were named, and expertise was enacted, the paradox of data (Hoeyer 2023) was revealed. Big data seems to present an aura of objectivity and a neater picture of the world, but mere data collection is already an inherently value-laden and subjective process that involves making judgments about what to collect, how to measure it, and what to prioritize. These judgments reflect the values, assumptions, and intentions of those involved in the datafication process (Gitelman 2013) and the regimes of power within which they are formed. This approach also exemplifies how the social dimension of technical infrastructure on digital platforms can be examined empirically. I will discuss how such a reductive effect of datafication can lead to consequences such as clickbait mechanisms and captivating algorithms in the following section.

Governing the platform by numbers

This section shows how an unquantifiable ethical project, “empower developers by platform”, was dismantled into different measurable metrics related to content production and consumption through the Objectives and Key Results (OKRs)Footnote 5 system. Arguably, it is in this process of dismantlement that the indicators for measuring individuals’ work performance shift people’s attention toward meeting their situated metrics rather than the overall goals.

Knowing how to articulate one's work achievements using the language of Objectives and Key Results is a critical aspect of IT professionals' career advancement since it forms the foundation upon which their work performance is evaluated, often more than what they actually do.Footnote 6 At project team Stacker, every Wednesday night, Cici and her colleagues fill out their work reports in a cloud-based document using the OKR framework. Cici's work report demonstrates how the OKR framework helps to clarify the connection between individual work and institutional goals.

“The key result of my work was that the number of daily active content creatorsFootnote 7 increased from 838 to 921 (10%), which aligns with the User Operation department's objective to increase daily active creators and content production. This outcome also supports the overall objective of Project Stacker, which is to increase daily active usersFootnote 8 by 10%.” (Cici, 22, female, operation specialist, user operation department).

For Cici, the 10% increase is just one of many ways to narrate her work, and it fails to represent the full range of contributions that she made to Stacker. She finds true fulfillment in her work in resolving user feedback and passing it on to product-design coworkers, building interpersonal relationships with content creators through daily chatting, and maintaining a sense of community among users by manually deleting hate speech. These unquantifiable aspects of her work cannot be included in an OKR file. The OKR files were prepared for a weekly alignment conference held every Thursday, where employees communicate and share their OKR data via a cloud-based document, ensuring that individual work is aligned with institutional goals and promoting transparency and accountability. However, on these occasions, some employees are required to state their OKRs in front of all their colleagues. One informant mocked this as "a collective art experimental behaviour," implying that it may be superficial or performative.

Cici’s work report manifested the dismantling of an institutional goal into multiple layers. First, the Stacker project has translated its ethical pursuit of “building the most influential platform for Chinese developers” into the number of daily active users (DAU).Footnote 9 Second, the metric of daily active users has been further dismantled into content-related metrics for different departments, such as the number of active content creators. At the individual level, each IT professional was evaluated based on their provision of key results, numerical evidence of their work, and the ways in which they align with departmental and institutional goals. Cici’s work report is an example of this (Fig. 1).

Fig. 1
figure 1

Details about the OKR system

At its inception, the Stacker project team held a lofty vision of creating the most influential platform for Chinese developers and a mission of empowering them with excellent content and tools. This vision has much to do with the fact that the platform was not expected to generate revenue through ads or subscription fees after its acquisition by a Chinese internet giant as a strategic investment in 2019. Even though the Stacker project team's vision and mission initially appear to be morally driven, commercial metrics such as the number of daily active users (DAU) have been employed as practical indicators of the platform's value to developers. Moreover, the Stacker project was organized around intensifying users' content production, circulation, consumption, and interaction,Footnote 10 with each department dedicated to promoting the growth of various content-related metrics. The user operation department handled user feedback, maintained content creator relationships, and launched campaigns to increase content creation and user engagement; its evaluation relied on content production and interaction metrics. The product design department improved the user experience through technical design and was evaluated based on content circulation and consumption metrics. The product development team implemented prototypes, and their coding work was assessed using the error rate.

In June 2020, Stacker integrated a personalized recommendation algorithm into its homepage newsfeeds. However, the launch of the system resulted in a flood of negative user feedback, with many users complaining that their personalized newsfeeds were filled with old content that they had previously viewed but in which they no longer had any interest. James, the product owner of this recommendation system, attributed this failure to the choice of content-pool filtering rather than to the algorithm per se.

To understand the relationship between content pools and the algorithm: a content pool is a curated collection of user-generated content, and different pools can be established based on various criteria, such as popularity, posting time, theme, comment number, and length. The content pool plays a crucial role in determining the scope of information users receive. The algorithm acts as a distributor, circulating information from the content pool to users' devices. Thus, James and the other data scientists created a customized content pool, filtered from all user-generated content based on a high click-through rate (CTR). The algorithm then distributes content from this customized pool into the newsfeed according to users' preferences (Fig. 2).

Fig. 2
figure 2

Details about recommendation system
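The two-stage design described above can be sketched as follows. This is a hypothetical illustration, not Stacker's code: the article data, the CTR cutoff, and the stub preference score are all assumptions made for the example.

```python
def ctr(article):
    """Historical click-through rate: clicks divided by newsfeed impressions."""
    return article["clicks"] / article["impressions"] if article["impressions"] else 0.0

all_content = [
    {"id": "a", "clicks": 900, "impressions": 10_000},  # CTR 0.09
    {"id": "b", "clicks": 200, "impressions": 5_000},   # CTR 0.04
    {"id": "c", "clicks": 450, "impressions": 6_000},   # CTR 0.075
]

CTR_CUTOFF = 0.05  # assumed threshold; the real criterion was not disclosed

# Stage 1: curate the content pool from all user-generated content
content_pool = [a for a in all_content if ctr(a) >= CTR_CUTOFF]

# Stage 2: the algorithm only ever distributes items drawn from the pool
def build_newsfeed(pool, preference_score):
    """Rank pooled items by a per-user preference score; the pool bounds
    what any user can be shown, regardless of their preferences."""
    return sorted(pool, key=preference_score, reverse=True)

feed = build_newsfeed(content_pool, preference_score=ctr)  # stub: preference = CTR
```

The point of the two-stage structure is that the pool filter, not the ranking algorithm, sets the outer limit on what can reach a user's newsfeed, which is why James located the failure there.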

Why did the team collectively choose the click-through rate over other factors as the filter when designing the content pool? The decision to use the click-through rate as the primary factor for filtering the content pool is tied to their own performance metrics. At the time, the Stacker team had a quarterly objective of increasing daily active users (DAU) by 15%, with the product design department responsible for achieving a 15% increase in content circulation. James, as an individual, was assigned the task of boosting content circulation in the newsfeed. They chose the click-through rate as a filter because it is a commonly used metric for measuring content-distribution efficiency and because it helps to improve the evaluation of their own performance.

James emphasized the traction of metrics, stating, “For instance, when we introduced personalized recommendations, we aimed to increase the efficiency of content circulation, so we built the content pool using articles with the highest historical click-through rate. However, if we had set the retention rateFootnote 11 as our target metric, the approach would have been entirely different, and we might have used a different content pool, such as the latest or most popular articles.” (32, male, product manager, product design department).

The metric-driven design has unexpected consequences. The CTR was defined as the ratio of the number of times users clicked on a piece of content to read the full article to the number of times it was displayed in the newsfeed. This metric favors older content that has already had time to accumulate clicks and does not accurately reflect the quality or relevance of more recently posted content. The content pool filtered by the CTR, as a critical component of the recommendation system, disrupted the sequence of newsfeeds, with newer content receiving low exposure rates and older content with high CTRs repeatedly appearing. A similar case would occur if Google Scholar were to sort academic articles according to only the number of citations: older articles with a high number of citations would always appear on the first page, while more recent publications would appear on later pages and be less likely to be cited in the future due to a lack of exposure. Users were frustrated to find that the new algorithm was feeding them outdated information, and content creators were unable to reach as wide an audience as they had hoped.
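The cold-start dynamic behind this staleness can be illustrated with hypothetical numbers (the threshold and counts are assumptions, not Stacker's actual values): a newly posted article starts with no impressions, so its historical CTR is zero, it never enters the pool, and because only pooled content is exposed in the newsfeed, it can never accumulate the clicks it would need to qualify.

```python
def ctr(clicks, impressions):
    """Historical click-through rate; zero when there is no impression history."""
    return clicks / impressions if impressions else 0.0

POOL_CUTOFF = 0.05  # assumed pool-entry threshold

old_article = {"clicks": 1_200, "impressions": 15_000}  # months to accumulate clicks
new_article = {"clicks": 0, "impressions": 0}           # posted today

in_pool_old = ctr(**old_article) >= POOL_CUTOFF  # CTR 0.08 -> enters the pool
in_pool_new = ctr(**new_article) >= POOL_CUTOFF  # CTR 0.0  -> excluded

# Feedback loop: only pooled content appears in the newsfeed, so only pooled
# content can gain impressions and clicks; the new article's CTR stays at
# zero and it can never qualify, however good it is.
```

This is the mechanism behind the Google Scholar analogy in the text: a selection criterion based on accumulated history systematically locks out items that have not yet had the chance to accumulate any.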

The performance evaluation system at Stacker is part of a broader trend toward an “audit society” (Power 1997), wherein ‘the principles and techniques of accountancy and financial management are applied to the governance of people and organisations’ (Shore and Wright 2015a, b), with an expectation of transforming employees into self-managed, calculating individuals.

A shift from the original moral objective of "building the most influential platform for Chinese developers" to prioritizing click counts exemplifies the “performativity” of metrics (MacKenzie 2008). Metrics such as DAU incentivize designers and engineers to prioritize features or designs that maximize those numbers, even if they do not necessarily align with the overall user experience or long-term goals. Empirical studies have shown the emergence of various "traps" devised and implemented in website and platform design, including consumer baits such as promotional gifts (Lupton 2016), automated prompts (Schüll 2016), and other micronudges aimed at regulating and reinforcing specific user behaviors; in these accounts, however, the design intent of such traps is typically reduced to profit seeking and data gathering. The empirical evidence here shows that these traps may arise merely from the temptation of metrics, without being directly linked to profit, and may produce effects contrary to the supposed goal. The user response to such a trap mechanism will be discussed in the next section.

The overspill of data culture

This section delineates the overspill of quantification practices among IT professionals. I define the overspill effect as the unrestricted extension of quantification’s own logic, values and concomitant relationships into broader social spheres that were not intended to be governed by these principles, such as the transformation of users into quantified labor through dashboards and the management of private life in OKR language.

Data dashboard: the metrification of users’ performance

This section delves into users' responses to these "traps." In particular, it investigates the impact of user dashboards, which display visual representations of users' online activities. Digital platforms structure users' online behaviors through various governing instruments, such as interfaces (Bucher and Helmond 2018), algorithms (Beer 2017), service policies, and incentive mechanisms, which together constitute multiple forms of platform governance (Gillespie 2018; Gorwa 2019). The governance instrument examined here is the user dashboard, a data visualization of users' online behaviors whose design engineers users' perceptions of their online experience. As a user-generated content (UGC) platform, Stacker organizes its functionality design and daily maintenance around content production and consumption, and the project team measures the daily operation of the platform through various content metrics, as previously mentioned. The user dashboard therefore mainly displays data related to content creation, including pageviews, comments, likes, forwards, followers, and so on. However, the emphasis on certain behaviors through the dashboard produces an unbalanced representation of users' overall online experience and distracts them from other important aspects of their platform usage. According to multiple user surveys conducted in 2020, many users initially joined Stacker intending to write coding blogs for personal use, exchange skills with other code enthusiasts, and stay updated on the technology industry. However, the data dashboard that quantified their digital footprints shifted users' priorities to the number of comments, likes, reposts, and followers attracted by their new posts (Fig. 3).

Fig. 3

Details about user dashboard

The data shown in the user dashboard did not provide a consistent source of information about users' performance on the Stacker platform but instead created additional puzzles for them. Content creators generally gauge the circulation of their content on the platform through two metrics: the exposure rate and pageviews. The exposure rate refers to the number of times their content is displayed in the homepage newsfeed; at this stage, users had no access to this figure. Pageviews, in contrast, refer to the number of readers who access the full-text page of the content through the newsfeed. The algorithm determines the extent to which a piece of content is exposed in the newsfeed, which translates into its exposure rate, while the quality of the content determines how many readers click through to the content page, which is indexed by pageviews.

However, after the adjustments to the recommendation algorithm (discussed in the section "Governing the platform by numbers"), the algorithm itself became too complex for both users and platform designers to understand which content would receive more exposure. Many content creators expressed dissatisfaction with the observed decrease in pageviews. Differential data access also produced divergent interpretations of this decline among content creators and platform designers. Users could access only their pageviews, so they suspected that the algorithm had reduced the exposure of their new content, resulting in the drop in pageviews. Platform designers, on the other hand, were concerned only with the overall trend in exposure and clicks and did not examine the correlation between individual content exposure rates and pageviews; thus, they believed the decrease in pageviews stemmed from the content itself not being attractive enough rather than from the algorithm. Additionally, more new users entered the newsfeed after the new algorithm was implemented because the project team increased advertising spending in the same period.

To address content creators' concerns, the platform designer Xiao introduced the exposure rate and other traffic-related metrics to the user dashboard so that users could see which factor, the algorithm or the quality of the content, was driving the fluctuations in pageviews. However, Xiao adopted a new way of calculating the exposure rate. If a user scrolls up and down the newsfeed, a piece of content will fade in and out of the screen, appearing twice for a single user. Previously, backstage workers counted this as a single exposure, but on the user dashboard it was counted as two (Fig. 4).
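
The discrepancy between the two counting rules can be illustrated with a minimal sketch. The record format and function names below are invented for illustration; they do not reproduce Stacker's internal code.

```python
# Each record is one appearance of a content card on a user's screen; scrolling
# the same card past the viewport twice yields two records for the same
# (user, item) pair.
impressions = [
    ("user-1", "post-9"),
    ("user-1", "post-9"),  # same card scrolled past a second time
    ("user-2", "post-9"),
]

def backstage_exposure(records):
    # Internal rule: repeated appearances to the same user count once.
    return len(set(records))

def dashboard_exposure(records):
    # Dashboard rule introduced by the designer: every appearance counts.
    return len(records)

print(backstage_exposure(impressions))  # 2
print(dashboard_exposure(impressions))  # 3
```

The dashboard figure is thus systematically higher than the backstage one, which matters for how users interpret the ratio of exposures to pageviews.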

Fig. 4

Details about newsfeed and full-text page

The introduction of the exposure rate into user dashboards, together with its new mode of calculation, provided users with a quantified representation of their behaviors. Dashboards not only enable users to reassess their online experience but also influence and shape their behaviors (Mackenzie 2005; Beer 2017). The dashboard offered in-depth analysis of each piece of content users posted; they could recognize patterns in popular content and adjust their styles accordingly, shifting their focus to the pageviews of their content and gradually becoming obsessed with the factors driving changes in pageviews, even though many had initially used the platform to learn front-end technology and keep personal notes. Transforming content production (see note 12) into a traceable and enjoyable task that offers real-time feedback to stimulate the desire to level up exemplifies how gamification has been integrated into platform design to align user behavior with the platform's inherent goals, specifically the professional work metrics of individual designers. Similar mechanisms, including honor badges, reputation points, and progress graphs, essentially serve as surveillance apparatuses through which the platform channels users into a never-ending levelling-up process (Whitson 2013).

To hone their quantified results, these content creators form informal partnerships, engaging in mutual clicking, commenting, and liking of each other's content in the hope of having their content labeled "trendy" by the platform's algorithm (Petre et al. 2019). This practice, termed tactical quantification (Irani and Silberman 2015), is emblematic of a broader landscape in which diverse platform workers employ various strategies to manipulate algorithms by fine-tuning the data they contribute. Such practices are prevalent across the extended concept of platform labor (Vallas and Schor 2020; Howcroft and Bergvall-Kåreborn 2019), applying to content producers and influencers (Duffy 2017), gig workers such as food couriers (Sun 2019; Chen 2022) and ride-hailing drivers (Rosenblat 2018), and less visible microtask workers such as Turkers (Irani 2015), because their working conditions are shaped by how their performance is metrified algorithmically. The potential pitfall of these studies lies in oversimplifying the association between algorithmic labor control and a platform's pursuit of profit: sometimes algorithmic control does not yield the intended profits for the corporation. Exploring the algorithm's design and the normative framework through which technology architects view their work reveals that aligning the interests of digital companies with platform infrastructure may not unfold as smoothly as envisioned; factors such as collective data culture and individual career pursuits skew the original intent. This section adds complexity to the ongoing discussion by shifting the focus from users' perspectives to the technology architects who deliberately impose performance measurement systems on users, showing that the intention behind this imposition is to transform users into laborers who may unwittingly contribute to the improvement of IT professionals' work.

Quantified self: the silent revolution in the private sphere

Mostly in their 20s and early 30s, these professionals were born after the historical transition from a planned to a market economy, which left them feeling more insecure as the state withdrew from many public domains; they sought to move up in the world through hard work, thrift, and ability (Pieke 2014). This section illustrates how IT professionals apply the same quantification logics, jargon, and methods used to measure the value of the platform to navigate their work, intimate relationships, and selves.

The digitalization of workload is the most common form of self-quantification among these IT professionals. On January 1, 2021, thousands of IT professionals posted screenshots of personal workload analytics generated by their remote-working software on WeChat Moments. One of my informants also shared his analytics on social media: he had sent 3,723 messages, hosted cloud meetings for 20,193 min, created 972 documents, and received 1,142 likes from his colleagues. The analytics also showed the extent to which he had outperformed his colleagues on different work tasks; for example, he had surpassed 92% of his colleagues in the number of emoji replies received on work messages, documents created and commented on, and cloud meetings attended. Surprisingly, the IT professionals being monitored embraced this without questioning its intrusiveness or the privacy issues it posed; moreover, none of them ever raised these issues with the original designer, who might be sitting at the next desk. Most informants shared these numbers as evidence of their hard work and self-achievement, with one claiming, "I posted it online to share how hardworking I was in the previous year." Many informants also viewed the public sharing of workload analytics as an implicit way to signal their membership in the IT community. Workload data analytics serve as a cultural totem for IT professionals because the internet industry is one of the few sectors where most work activities take place in a digital environment. As a result, a wealth of data is available for monitoring and assessing work performance, making it possible for almost every aspect of work to be chronicled, analyzed, and compared in a quantifiable manner.
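
The "surpassed 92% of colleagues" figure suggests a simple percentile rank. The following is a speculative illustration of how such a statistic could be derived; the function name and toy data are invented and do not reproduce the remote-working software's actual computation.

```python
def percentile_rank(own: int, peers: list[int]) -> float:
    """Share of peers (in percent) whose count falls strictly below one's own."""
    if not peers:
        return 0.0
    return 100 * sum(count < own for count in peers) / len(peers)

# Toy data: emoji replies received by 25 colleagues vs. one's own 48.
peers = [3, 5, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21,
         23, 25, 26, 28, 30, 31, 33, 35, 37, 40, 44, 50, 52]
print(round(percentile_rank(48, peers)))  # 92
```

A statistic of this kind is trivial to compute once every work activity leaves a digital trace, which is precisely why such comparisons proliferate in this industry.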

Among these IT professionals, some have developed more sophisticated quantified mechanisms for managing their intimate relationships. During one lunch break, Gavin, a social media marketing specialist, perched on a bar stool and propagated the "productization (see note 13) of the romantic relationship" to her close coworkers. Gavin began with her underlying presumption: "Like any social relationship, there is an SOP (see note 14); templates ensure at least 80% efficiency in your relationship. The productization of romantic relationships helps me focus on the quantified aspects of my relationship."

Gavin established an intricate quantified system to monitor her relationship with her lesbian partner. The system included a digital calendar, a jointly handwritten OKR (Objectives and Key Results) journal, and a cloud document titled "Aha moment." The digital calendar recorded the time they spent together, categorized into family days, artistic leisure such as visiting galleries, and intimate moments; it generated visualizations of the actual hours, percentages, and fluctuations of various activities, through which the couple could rearrange their schedule if the calendar showed a lack of time together. Second, the OKR journal was set up to help her partner "erase daunting childhood memories and achieve self-acceptance as someone who could receive unconditional love," with key results measuring how this major objective was being achieved, from small things such as sharing bedtime stories and naming their pets to major life events such as furnishing a home and traveling the world. Finally, the "Aha moment" cloud document recorded moments of significance in their relationship.

In Gavin's account, productizing her romantic relationship involved converting the intangible elements of love, intimacy, and trust into a trackable project with multiple quantified indicators monitoring the dynamics of their romance. By creating a system that could track and measure those dynamics, Gavin believed she could identify potential issues or risks before they became serious problems; this short feedback loop allowed the couple to adjust their relationship before things got out of hand. Gavin expected data to render aspects of the "somewhat inaccessible world of feelings and problems more tangible and comparable" (Sharon and Zandbergen 2017), which helped maintain the optimal state of their unmeasurable love.

Gavin belongs to a wider IT community of people who believe that every aspect of life can be managed like an enterprise. The OKR system, discussed earlier, has been adopted by these individuals as a form of self-technology (see note 15) (Foucault 1988), allowing them to break down their life pursuits into measurable steps. A common format for this practice is a spreadsheet containing the overall objective, several key results, and accompanying to-dos. The objective typically focuses on self-development, such as bodybuilding, developing career skills, or expanding relationships. Each key result is accompanied by an indicator that measures the extent to which the objective is being achieved, and the to-dos are planned actions that will move the observable indicators. The use of internet jargon such as "retention," "optimization," and other industry-specific terms constitutes an important linguistic aesthetic in these OKR files (as presented in the graph); the authors intended to impress their audience with their expertise in internet-based modes of thinking and to signal their membership in a professional community. Self-tracking was appreciated as an aesthetic practice in which bits of the self, extracted and abstracted, became material for seeing and experiencing the self differently (Sherman 2016), enhancing and enlivening self-narratives (Ruckenstein 2014). In 2020, the community organized an internal competition called "OKR diary: A more fulfilling self," and thousands of participants submitted their OKR files for anonymous voting; the winning entries were distinguished by their precise jargon and meticulous quantification methods. This community embraces OKRs with the faith that life as an enterprise can be achieved and enhanced through mechanical formulas and accurate calculations (Fig. 5).
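
The spreadsheet format described above, an objective, key results with indicators, and supporting to-dos, can be rendered schematically as a data structure. This is an illustrative reconstruction, not an informant's actual file, and the scoring function reflects only the conventional OKR practice of grading objectives by the share of key results achieved.

```python
# Schematic reconstruction of a personal OKR file: one objective, key results
# each paired with a measurable indicator, and to-dos feeding those indicators.
okr = {
    "objective": "Bodybuilding: a leaner, more energetic self",
    "key_results": [
        {"result": "Reduce body fat", "indicator": "body fat below 18%"},
        {"result": "Build a workout habit", "indicator": "12 gym sessions per month"},
    ],
    "todos": ["book gym slots every Sunday", "log meals daily"],
}

def completion(okr, achieved):
    # Progress is the share of key-result indicators marked as achieved,
    # mirroring how OKR "scores" are conventionally computed.
    hits = sum(achieved.get(kr["indicator"], False) for kr in okr["key_results"])
    return hits / len(okr["key_results"])

print(completion(okr, {"body fat below 18%": True}))  # 0.5
```

The point of the reconstruction is that once a life pursuit is written in this format, it becomes scoreable, and therefore shareable and comparable within the community.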

Fig. 5

Details about self-quantification

Foucault defines technologies of the self as "practices by which individuals effect by their own means a certain number of operations on their bodies, souls, thoughts, conduct and ways of being, so as to transform themselves in order to attain a certain state of happiness, purity, wisdom, perfection, or immortality" (1988:18). In his account, technologies of the self operate through various techniques, confessional practices, and scientific and medical discourses that shape and govern individuals' experiences of their own bodies, desires, and pleasures according to societal norms and power relations. The concept has reemerged in examinations of how neoliberal governance operates by encouraging individuals to constantly discipline and transform themselves into "calculative selves" (Miller 2014). However, similar practices of quantified audit appeared long before the term neoliberalism was invented. Furthermore, Andrew Kipnis makes the important observation that "placing Chinese audit cultures in the framework of neoliberal governmentality reduces them to a derivative of a set of ideas that diffused from the West"; he proposes that anthropologists return to observing the actual process of auditing to reveal the nuanced intentions of the auditors and the audited (2018).

It is thus hardly surprising to posit that self-quantification among IT professionals has become central to the project of transforming them into manageable, enterprising individuals (Li and Ong 2018). A close ethnographic examination of the material formats of these self-quantification practices, however, reveals how the files and their social attributes transcend their instrumental aspects. Metrics compiled for self-presentation and socializing with peers proved more common than metrics for self-monitoring. Because the files were open and interactive (visitors could easily insert trackable comments), individuals could select metrics that improved their aesthetic performance and showcased specific aspects of themselves. In practice, this did not truly facilitate further dialog about scientific ways of acting or actual self-regulating action, as those seeking to legitimate such practices might argue. Rather, it amounted to a romanticization of OKR as a self-management tool offering a common language to which these IT professionals could relate (Sharon 2017), signaling their membership in the professional community, upon which their identity categories and data sociality (Ruckenstein and Schull 2017) were produced. Through ethnographic observation, we can clearly discern the discrepancies between the claims made about the consequences of datafication in these private realms and their actual consequences.


Conclusion

The central contribution of this paper is a set of ethnographic observations that offer original findings for key contemporary debates (Srnicek 2017; Couldry and Mejias 2019; De Kloet 2019) about the impact of new digital platforms and the culture of quantification. Platform studies currently emphasize the critique of platforms' consequences and the ways they embody certain forms of power and economic control. However, such a critique may imply that these processes are indeed what they purport to be and correspond to certain intentionalities. It is therefore important to demonstrate through case studies, without detracting from this critique, the degree to which quantification and datafication often fail to enact those intentions and instead create unexpected consequences. As such, this paper focuses on the backstage of platform construction, building on new anthropological work on the cultural nature of datafication and algorithms (Seaver 2017; Hoeyer 2023) as part of the dialog between critical data studies and platform studies.

By demonstrating the interrelation between the consequences of platformization and the inner quantification practices of backstage IT professionals, this paper provides a more nuanced empirical understanding of platformization as the penetration of digital platforms' internal logic, inherent attributes, and governing principles into different economic sectors and spheres of life (Poell 2019). Such penetration is intimately related to the context of work: IT professionals tend to design platforms in ways that allow their particular work achievements to be more easily identified through quantitative methodologies. A further consequence is that users are often channeled into performing traffic-generating tasks according to the "traps" inserted by designers. Owing to the commercial success of these IT companies, an essentially quantitative methodology has increasingly become the prevalent epistemology in wider social domains.

The pervasiveness of such "attention-seeking mechanisms" (Miller 2000; Horst and Miller 2012:27) in online infrastructure is not new. However, this paper provides a clearer picture of the underlying drivers of this trend, which may come not directly from an IT company's pursuit of profit but from individual professionals' tacit responses to the labor performance system within which they work; such responses may sometimes, unexpectedly, weaken capital extraction. Revealing this paradox of quantification became possible by extending research on platformization from platform users as subjects of abstract algorithmic control to the platform designers who devise those algorithms. This approach offers another way of accounting for the social embeddedness of technology (Hughes 1993), emphasizing that the current institutional operation of the platform reveals continuity with the broader history of how organizations are transformed by audit culture. Additionally, this research joins others in demonstrating the value of treating big data and algorithms as having a social life that traverses the experiences of users and designers and the boundary between the backstage and broader social worlds, rather than treating them as decontextualized technical artifacts. This perspective echoes the analytic scope of STS, which emphasizes how norms and scripts are encoded by attending to designers and the design process (Hughes 1983; Pinch and Bijker 1987). As the internet is just one part of the political economy of attention, such an approach could be applied to wider studies of infrastructures that structure and manipulate our attention for political or economic purposes (Pedersen et al. 2021), such as gambling machines (Schüll 2012), advertisements (McCreery 1995), and roads (Harvey and Knox 2015).

This paper also contributes to social studies of quantification by critically examining the role of platform companies and IT professionals in furthering the measurement system in contemporary China. Platform companies are not only subject to audit culture but also actively advance it: they devise the tools for the intensive digitalization of social phenomena, serve as the nexus market of economic activities, and promulgate and fetishize a distinct IT data culture, such as growth-centered methodology (Troisi et al. 2020) and the OKR system, as generalized cognitive infrastructure. This invites a further criticism of growth: while the expansion of the IT industry has long been conceptualized in popular discourse as driven by technological innovation, this study reveals that such growth has occurred (Hirsch 2022) through less creative work aimed at eliciting numerical increases in user engagement.

Additionally, this paper proposes "backstage ethnography" as a promising methodology for exploring the multifaceted aspects of platformization and other key social agents of quantification. Inspired by the "social life of numbers" (Day et al. 2014), this approach involves meticulous examination of the microprocesses of quantification and the expert communities involved, including data collection and indicator uniformity (Merry 2016). The paper thereby elucidates the intricate dynamics between the inner practices of IT professionals and the external consequences arising from those practices, without denying the power asymmetry between ordinary platform users and IT professionals as a technocratic class.

Availability of data and materials

Not applicable.

Notes

  1. Foucault introduced the idea of "regimes of truth" to describe the ways in which specific forms of knowledge and truth are constructed and maintained within a given society or institution.

  2. "Growth hacking", a term that originated in the tech startup world, denotes a collection of non-traditional and imaginative marketing techniques designed to attain swift and scalable expansion of a business, product, or user base.

  3. For computers, an event represents the basic unit of a user's interaction with webpage elements through keyboard presses, mouse clicks, or touchpad swipes. Elements create the structure of a web page and present information in various ways.

  4. Internet companies have adopted "agile development" as the mainstream framework for managing cross-departmental software development. This decentralized model emphasizes a flat organizational structure, with few or no levels of middle management between staff and executives compared with traditional hierarchical organizations. Technology companies not only use the "objectivity" of data for business decisions but also utilize data to evaluate employees' performance.

  5. The OKR (Objectives and Key Results) system is a goal-setting framework used in many organizations to define and track objectives and their outcomes. It is a simple and effective tool that helps align individual and team goals with the company's overall mission, vision, and strategy. OKRs consist of two main components: objectives, which are specific and measurable goals that organizations want to achieve, and key results, which are specific, quantifiable metrics used to measure progress toward the objective. It is widely held that companies can thereby ensure that every employee works toward the same set of goals and can measure progress in a transparent and consistent way.

  6. The OKR system was popularized by companies such as Intel and Google and has gained widespread adoption across industries and organizations, from start-ups to large corporations. Ideally, the OKR system provides a framework for setting ambitious goals, promoting alignment, and fostering a results-oriented culture within organizations; it encourages transparency, accountability, and regular communication, allowing organizations to adapt effectively to changing circumstances. The gap between the model and the practice of the OKR system invites further analysis. When the system was introduced into Chinese technology companies, more emphasis was placed on the alignment between individuals' quantified, trackable progress and the institution's overall goals. During my fieldwork, I found that the OKR system functioned not only as a labor management system but also as a discursive apparatus that individual professionals leverage to negotiate resource allocation, career development, and so on.

  7. A content creator is a person or entity that produces and publishes content on a platform, such as social media, video sharing, or blogging platforms. The content can take many different forms, including written articles, images, videos, audio recordings, and other types of media. On many platforms, content creators are a vital part of the ecosystem. They may generate content that other users find interesting, informative, or entertaining, which can help to attract and retain users. Content creators may also play a role in building a community around a platform, as they may engage with other users and encourage conversations and interactions.

  8. DAU stands for Daily Active Users, which is a metric commonly used in the tech industry to measure the number of unique users who engage with a particular platform or application on a daily basis. DAU is just one of many metrics that companies use to understand user engagement and behavior on their platforms.

  9. In most cases, user-generated content platforms are commercially evaluated based on the scale of their daily active users (DAU). This evaluation criterion makes the audience market scalable for potential advertisers.

  10. Platforms like Stacker and other social media platforms rely heavily on user-generated content to drive user engagement as a constant stream of fresh and authentic content encourages users to keep returning and interacting with the content.

  11. Retention rate is the percentage of users who continue engaging with an app over time. This app metric is typically measured at 30 days, 7 days, and 1 day after users first install the app. App retention rate is calculated by dividing an app's monthly active users by its monthly installs.

  12. Online content production involves crafting and distributing a range of content formats, such as articles, videos, podcasts, and social media posts.

  13. Productization in the IT industry refers to the process of transforming a service or solution into a marketable product. In other words, it is the conversion of a customized or bespoke solution that was created for a specific client into a standardized product that can be offered to a wider audience.

  14. SOP, a typical piece of internet jargon, stands for standard operating procedure. The coinage of this term suggests the possibility of repeatedly employing a golden business principle in other scenarios.

  15. Foucault's discussion of technologies of the self is primarily found in his late works and seminars. One notable lecture series where he explicitly addresses this topic is his 1982–1983 lectures at the Collège de France, titled "The Hermeneutics of the Subject."



Abbreviations

OKR: Objectives and key results, a goal-setting framework used in many organizations to define and track objectives and their outcomes. OKRs consist of objectives, which are specific and measurable goals that organizations want to achieve, and key results, which are specific, quantifiable metrics used to measure progress toward each objective.

DAU: Daily active users, a metric commonly used in the tech industry to measure the number of unique users who engage with a particular platform or application on a daily basis.

SOP: Standard operating procedure, a typical piece of internet jargon suggesting the possibility of repeatedly employing a golden business principle in other scenarios.

CTR: Click-through rate, the number of clicks a piece of content receives divided by the number of times it is shown.

References

  • Beer, David. 2017. The social power of algorithms. Information, Communication and Society 20 (1): 1–13.

  • Boellstorff, Tom. 2015. Making big data, in theory. In Data, now bigger and better!, ed. Tom Boellstorff and Bill Maurer, 205–217. Chicago: Prickly Paradigm Press.

  • Bucher, Taina, and Anne Helmond. 2018. The affordances of social media platforms. In The SAGE handbook of social media, ed. Jean Burgess, Alice Marwick, and Thomas Poell, 233–254. London: Sage Publications.

  • Chen, Long. 2022. Labor order under digital control: Research on labor control of take-out platform riders. The Journal of Chinese Sociology 9 (1): 17.

  • Chong, Kimberly. 2018. Best practice: Management consulting and the ethics of financialization in China. Durham: Duke University Press.

  • Christin, Angèle. 2020. Metrics at work: Journalism and the contested meaning of algorithms. Princeton: Princeton University Press.

  • Coleman, E. Gabriella. 2013. Coding freedom: The ethics and aesthetics of hacking. Princeton: Princeton University Press.

  • Couldry, Nick, and Ulises A. Mejias. 2019. Data colonialism: Rethinking big data’s relation to the contemporary subject. Television and New Media 20 (4): 336–349.

  • Day, Sophie, Celia Lury, and Nina Wakeford. 2014. Number ecologies: Numbers and numbering practices. Distinktion: Scandinavian Journal of Social Theory 15 (2): 123–154.

  • Desrosières, Alain. 1998. The politics of large numbers: A history of statistical reasoning. Cambridge: Harvard University Press.

    Google Scholar 

  • Dijck, Van, Thomas Poell José, and Martijn de Waal. 2018. The platform society: Public values in a connective world. Oxford: Oxford University Press.

    Book  Google Scholar 

  • Douglas-Jones, Rebecca, Andrew Walford, and Natalie Seaver. 2021. Introduction: Towards an anthropology of data. Journal of the Royal Anthropological Institute 27 (S1): 9–25.

    Article  Google Scholar 

  • Dourish, Paul. 2016. Algorithms and their others: Algorithmic culture in context. Big Data and Society 3 (2): 1–11.

    Article  Google Scholar 

  • Duffy, Brooke Erin. 2017. (Not) Getting paid to do what you love: Gender, social media, and aspirational work. New Haven: Yale University Press.

    Book  Google Scholar 

  • Duffy, Brooke Erin. 2020. Algorithmic precarity in cultural work. Communication and the Public 5 (3–4): 103–107.

    Article  Google Scholar 

  • Espeland, Wendy N., and Michael Sauder. 2007. Rankings and reactivity: How public measures recreate social worlds. American Journal of Sociology 113 (1): 1–40.

    Article  Google Scholar 

  • Foucault, Michel. 1988. Technologies of the self. In Technologies of the self: A seminar with Michel Foucault, vol. 18, ed. Luther H. Martin, Huck Gutman, and Patrick H. Hutton. London: Tavistock Publications.

    Google Scholar 

  • Gillespie, Tarleton. 2016. #trendingistrending: When algorithms become culture. In Algorithmic cultures, ed. Robert Seyfert, 64–87. New York: Routledge.

    Google Scholar 

  • Gillespie, Tarleton. 2018. Custodians of the internet: Platforms, content moderation, and the hidden decisions that shape social media. New Haven: Yale University Press.

    Google Scholar 

  • Gitelman, Lisa, ed. 2013. Raw data is an oxymoron. Cambridge: MIT Press.

    Google Scholar 

  • Gorwa, Robert. 2019. What is platform governance? Information, Communication and Society 22 (6): 854–871.

    Article  Google Scholar 

  • Gregory, Judith, and Geoffrey C. Bowker. 2016. The data citizen, the quantified self, and personal genomics. In Quantified: Biosensing technologies in everyday life, ed. Dawn Nafus and Richard León, 211–226. Cambridge: MIT Press.

    Chapter  Google Scholar 

  • Hacking, Ian. 1990. The taming of chance. Cambridge: Cambridge University Press.

    Book  Google Scholar 

  • Harvey, Penny, and Hannah Knox. 2015. Roads: An anthropology of infrastructure and expertise. New York: Cornell University Press.

    Google Scholar 

  • Hirsch, Eric. 2022. Acts of growth: Development and the politics of abundance in Peru. Stanford: Stanford University Press.

    Book  Google Scholar 

  • Hirschman, Daniel, and Eli P. Berman. 2014. Do economists make policies? On the political effects of economics. Socio-Economic Review 12 (4): 779–811.

    Article  Google Scholar 

  • Hoeyer, Klaus. 2023. Data paradoxes: The politics of intensified data sourcing in contemporary healthcare. Cambridge: MIT Press.

    Book  Google Scholar 

  • Horst, Heather A., and Daniel Miller, eds. 2012. Digital anthropology. London: Routledge.

    Google Scholar 

  • Howcroft, Debra, and Birgitta Bergvall-Kåreborn. 2019. A typology of crowdwork platforms. Work, Employment and Society 33 (1): 21–38.

    Article  Google Scholar 

  • Hughes, Thomas P. 1983. Networks of power: Electrification in western society 1880–1930. Baltimore: Johns Hopkins University Press.

    Book  Google Scholar 

  • Hull, Matthew S. 2012. Documents and bureaucracy. Annual Review of Anthropology 41: 251–267.

    Article  Google Scholar 

  • Irani, Lilly. 2015. The cultural work of microwork. New Media and Society 17 (5): 720–739.

    Article  Google Scholar 

  • Irani, Lilly C, and Michael S. Silberman. 2013. Turkopticon: Interrupting worker invisibility in amazon mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13), ed. Mackay, 611–620. NY: Association for Computing Machinery.

  • Kipnis, Andrew B. 2008. Audit cultures: Neoliberal governmentality, socialist legacy, or technologies of governing? American Ethnologist 35 (2): 275–289.

    Article  Google Scholar 

  • Kloet, De., Thomas Poell Jeroen, Guohua Zeng, and Yiu Fai Chow. 2019. The platformization of Chinese society: Infrastructure, governance, and practice. Chinese Journal of Communication 12 (3): 249–256.

    Article  Google Scholar 

  • Knox, Hannah. 2021. Hacking anthropology. Journal of the Royal Anthropological Institute 27 (S1): 108–126.

    Article  Google Scholar 

  • Latour, Bruno. 2007. Reassembling the social: An introduction to actor-network-theory. Oxford: Oup Oxford.

  • Lowrie, Ian. 2018. Becoming a real data scientist: Expertise, flexibility, and lifelong learning. In Ethnography for a data-saturated world, ed. Hannah Knox and Dawn Nafus, 62–81. Manchester: Manchester University Press.

    Google Scholar 

  • Lupton, Deborah. 2015. Quantified sex: A critical analysis of sexual and reproductive self-tracking using apps. Culture, Health and Sexuality 17 (4): 440–453.

    Article  PubMed  Google Scholar 

  • Lupton, Deborah. 2016. The quantified self. Hoboken: John Wiley and Sons.

    Google Scholar 

  • MacKenzie, Adrian. 2005. The performativity of code: Software and cultures of circulation. Theory, Culture and Society 22 (1): 71–92.

    Article  Google Scholar 

  • MacKenzie, Donald. 2008. An engine, not a camera: How financial models shape markets. Cambridge: MIT Press.

    Google Scholar 

  • Malaby, Thomas. 2011. Making virtual worlds. New York: Cornell University Press.

    Google Scholar 

  • Mayer-Schönberger, Viktor, and Kenneth Cukier. 2013. Big data: A revolution that will transform how we live, work, and think. Boston: Houghton Mifflin Harcourt.

    Google Scholar 

  • McCreery, John. 1995. Malinowski, magic, and advertising: On choosing tetaphors. In Contemporary. Marketing and consumer behavior: An anthropological sourcebook, ed. John F. Sherry, 309–329. Thousand Oaks: SAGE Publications.

    Google Scholar 

  • Merry, Sally Engle. 2011. Measuring the world: Indicators, human rights, and global governance. Current Anthropology 52 (S3): S83–S95.

    Article  MathSciNet  Google Scholar 

  • Merry, Sally Engle. 2016. The seductions of quantification, 21. Chicago: University of Chicago Press.

    Book  Google Scholar 

  • Miller, Daniel. 2000. The fame of trinis: Websites as traps. Journal of Material Culture 5 (1): 5–24.

    Article  Google Scholar 

  • Miller, Daniel. 2003. The virtual moment. Journal of the Royal Anthropological Institute 9 (1): 57–75.

    Article  Google Scholar 

  • Miller, Peter. 2014. Accounting for the calculating self. Cambridge: MIT Press.

    Google Scholar 

  • Miller, Peter. 2001. Governing by numbers: Why calculative practices matter. Social Research, 379–396.

  • Mitchell, Timothy. 2002. Rule of experts: Egypt, techno-politics, modernity. Berkeley: University of California Press.

    Book  Google Scholar 

  • Nafus, Dawn, and Hannah Knox, eds. 2018. Ethnography for a data-saturated world. Manchester: Manchester University Press.

    Google Scholar 

  • Neff, Gina, and Dawn Nafus. 2016. Self-tracking. Cambridge: MIT Press.

    Book  Google Scholar 

  • O’Reilly, Jessica. 2016. Sensing the ice: Field science, models, and expert intimacy with knowledge. Journal of the Royal Anthropological Institute 22 (S1): 27–45.

    Article  Google Scholar 

  • Pedersen, Morten Axel, Kirstine Albris, and Nick Seaver. 2021. The political economy of attention. Annual Review of Anthropology 50: 309–325.

    Article  Google Scholar 

  • Petre, Caitlin. 2021. All the news that’s fit to click: How metrics are transforming the work of journalists. Princeton: Princeton University Press.

    Book  Google Scholar 

  • Petre, Caitlin, Brooke Erin Duffy, and E. Hund. 2019. “Gaming the system”: Platform paternalism and the politics of algorithmic visibility. Social Media+ Society 5 (4): 95.

    Google Scholar 

  • Pieke, Frank N. 2014. Anthropology, China, and the Chinese century. Annual Review of Anthropology 43: 123–138.

    Article  Google Scholar 

  • Pinch, Trevor, and Wiebe Bijker. 1987. The social construction of facts and artifacts. In The social construction of technological systems, ed. Wiebe E. Bijker, Thomas Parke Hughes, and Trevor Pinch, 17–50. Cambridge: MIT Press.

    Google Scholar 

  • Poell, Thomas, David B. Nieborg, and José van Dijck. 2019. Platformization. Internet Policy Review 8 (4): 1–19.

    Article  Google Scholar 

  • Poovey, Mary. 1998. A history of the modern fact: Problems of knowledge in the sciences of wealth and society. Chicago: University of Chicago Press.

    Book  Google Scholar 

  • Porter, Theodore M. 1986. The rise of statistical thinking 1820–1900. Princeton: Princeton University Press.

    Book  Google Scholar 

  • Power, Michael. 1994. The audit explosion. London: Demos.

    Google Scholar 

  • Power, Michael. 1997. The audit society: Rituals of verification. Oxford: Oxford University Press.

    Google Scholar 

  • Riles, Annelise. 2000. The network inside out. Ann Arbor: University of Michigan Press.

    Book  Google Scholar 

  • Riles, Annelise, ed. 2006. Documents: Artifacts of modern knowledge. Ann Arbor: University of Michigan Press.

    Google Scholar 

  • Rose, Nikolas. 1990. Governing the soul: The shaping of the private self. Abingdon: Taylor and Francis/Routledge.

    Google Scholar 

  • Rosenblat, Alex. 2018. Uberland: How algorithms are rewriting the rules of work. Berkeley: University of California Press.

    Book  Google Scholar 

  • Rosenblat, Alex, and Luke Stark. 2016. Algorithmic labor and information asymmetries: A case study of Uber’s drivers. International Journal of Communication 10: 27.

    Google Scholar 

  • Ruckenstein, Minna. 2014. Visualized and interacted life: Personal analytics and engagements with data doubles. Societies 4 (1): 68–84.

    Article  Google Scholar 

  • Ruckenstein, Minna, and Natasha Dowd Schüll. 2017. The datafication of health. Annual Review of Anthropology 46: 261–278.

    Article  Google Scholar 

  • Ruppert, Evelyn, John Law, and Mike Savage. 2013. Reassembling social science methods: The challenge of digital devices. Theory, Culture and Society 30 (4): 22–46.

    Article  Google Scholar 

  • Ruppert, Evelyn, Engin Isin, and Didier Bigo. 2017. Data politics. Big Data and Society 4 (2): 2053951717717749.

    Article  Google Scholar 

  • Sauder, Michael, and Wendy N. Espeland. 2009. The discipline of rankings: Tight coupling and organizational change. American Sociological Review 74 (1): 63–82.

    Article  Google Scholar 

  • Savage, Mike. 2013. The “social life of methods”: A critical introduction. Theory, Culture and Society 30 (4): 3–21.

    Article  Google Scholar 

  • Schüll, Natasha Dow. 2012. Addiction by design. Princeton: Princeton University Press.

    Google Scholar 

  • Schüll, Natasha Dow. 2016. Data for life: Wearable technology and the design of self-care. BioSocieties 11: 317–333.

  • Schüll, Natasha Dow. 2018. Self in the loop: Bits, patterns, and pathways in the quantified self. In A networked self and human augmentics, artificial intelligence, sentience, ed. Zizi Papacharissi, 25–38. Abingdon: Routledge.

    Chapter  Google Scholar 

  • Scott, James. 1998. Seeing like a state: How certain schemes to improve the human condition have failed. New Haven: Yale University Press.

    Google Scholar 

  • Seaver, Nick. 2017. Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data and Society 4 (2): 2053951717738104.

    Article  Google Scholar 

  • Seaver, Nick. 2018. What should an anthropology of algorithms do? Cultural Anthropology 33 (3): 375–385.

    Article  Google Scholar 

  • Seaver, Nick. 2019. Captivating algorithms: Recommender systems as traps. Journal of Material Culture 24 (4): 421–436.

    Article  Google Scholar 

  • Seaver, Nick. 2022. Computing taste: Algorithms and the makers of music recommendation. Chicago: University of Chicago Press.

    Book  Google Scholar 

  • Sharon, Tamar. 2017. Self-tracking for health and the quantified self: Re-articulating autonomy, solidarity, and authenticity in an age of personalized healthcare. Philosophy and Technology 30: 93–121.

    Article  Google Scholar 

  • Sharon, Tamar, and Daniel Zandbergen. 2017. From data fetishism to quantifying selves: Self-tracking practices and the other values of data. New Media and Society 19 (11): 1695–1709.

    Article  Google Scholar 

  • Sherman, Jamie. 2016. Data in the age of digital reproduction: Reading the quantified self through Walter Benjamin. In Quantified: Biosensing Technologies in everyday life, ed. Dawn Nafus, 27–42. Cambridge, MA: MIT Press.

    Chapter  Google Scholar 

  • Shore, Cris, and Susan Wright. 2015a. Governing by numbers: Audit culture, rankings and the new world order. Social Anthropology/anthropologie Sociale 23 (1): 22–28.

    Article  Google Scholar 

  • Shore, Cris, and Susan Wright. 2015b. Audit culture revisited: Rankings, ratings, and the reassembling of society. Current Anthropology 56 (3): 421–444.

    Article  Google Scholar 

  • Srnicek, Nick. 2017. Platform capitalism. Cambridge: Polity Press.

    Google Scholar 

  • Strathern, Marilyn. 2000. Audit cultures: Anthropological studies in accountability, ethics and the academy. London: Routledge.

    Google Scholar 

  • Suchman, Lucy. 2011. Anthropological relocations and the limits of design. Annual Review of Anthropology 40: 1–18.

    Article  Google Scholar 

  • Sun, Ping. 2019. Your order, their labor: An exploration of algorithms and laboring on food delivery platforms in China. Chinese Journal of Communication 12 (3): 308–323.

    Article  Google Scholar 

  • Troisi, Orlando, Gennaro Maione, Mara Grimaldi, and Francesca Loia. 2020. Growth hacking: Insights on data-driven decision-making from three firms. Industrial Marketing Management 90: 538–557.

    Article  Google Scholar 

  • Vallas, Steven, and Juliet B. Schor. 2020. What do platforms do? Understanding the gig economy. Annual Review of Sociology 46: 273–294.

    Article  Google Scholar 

  • Verran, Helen. 2010. Science and an African logic. Chicago: University of Chicago Press.

    Google Scholar 

  • Walford, Antonia. 2021. Data aesthetics. In Lineages and advancements in material culture studies, ed. Timothy Carroll, Antonia Walford, and Shireen Walton, 205–217. Oxfordshire: Taylor & Francis.

    Google Scholar 

  • Whitson, Jennifer R. 2013. Gaming the quantified self. Surveillance and Society 11 (1/2): 163–176.

    Article  Google Scholar 

  • Zhang, Li., and Aihwa Ong, eds. 2008. Privatizing China: Socialism from Afar. Ithaca: Cornell University Press.

    Google Scholar 



Acknowledgements

I thank all the IT professionals who have participated in my ethnographic research in Beijing since 2020. Special thanks are due to Daniel Miller, Yueyue Gao, and Charles Pollock for their invaluable suggestions on this paper.


Funding

The author gratefully acknowledges support from the Southern University of Science and Technology with funding from the 9th National Anthropology Graduate Fieldwork Grant.

Author information

Authors and Affiliations



The research presented here is based on the author’s 11 months of fieldwork within a project team at a Chinese Internet giant. During this period, the author conducted participant observation and unstructured interviews with data practitioners such as data scientists, project managers, and developers. The author read and approved the final manuscript.

Corresponding author

Correspondence to Ken Zheng.

Ethics declarations

Competing interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Zheng, K. Unintended consequences: data practice in the backstage of social media. J. Chin. Sociol. 11, 11 (2024).
