vol. 15 no. 3, September, 2010

 


Proceedings of the Seventh International Conference on Conceptions
of Library and Information Science—"Unity in diversity"

Trusting tags, terms, and recommendations


Jens-Erik Mai
Faculty of Information, University of Toronto, Canada


Abstract
Introduction. The paper traces the commonalities of three research traditions within information science (professional indexing, social tagging and recommending) and argues that they face the shared challenge of establishing trust with users. It is argued that information science must approach representation from a pluralistic perspective which recognizes that there are multiple interpretations and equally correct representations of any individual resource, since interpretations and representations are context-dependent.
Method and approach. The paper takes a critical analytic approach, which is informed by a social constructivist understanding.
Analysis. The three approaches to generating metadata are analysed to determine how, why, and when trust issues come into play in those approaches.
Conclusions. The paper argues that users of information systems can only trust other people's tags, terms and recommendations in information systems that are transparent, and it is concluded that information science must unite in establishing transparency and openness as key components of information systems.


Introduction

Users find resources in information systems primarily through representations of the resources (by which I mean any means by which ideas, opinions, claims, or facts are represented in books, journal papers, film, Web pages, or any other medium). The quality and usability of such representations have been explored in the past, often from the perspectives of their consistency and relevance to users. The main criteria for evaluating representations have been their correctness and correspondence to the resources' reality. Information science has mostly sought to understand representations of resources' reality with a rigid mind, to which 'the world is a set of discrete entities separated from another by gaps' (Zerubavel 1991: 70), and has as such sought to represent resources based on their inherent essences.

Much work in information science is based on a naive theory of representation, which 'says that things come in well-defined kinds, that the kinds are characterized by shared properties, and that there is one right taxonomy of the kinds' (Lakoff 1987: 121). In opposition to that view stands a social constructivist and pluralistic view, which forwards the notion that 'it is we ourselves who create categories and force reality into supposedly insular compartments' (Zerubavel 1991: 76). If we accept the notion that it is not the resources themselves that possess the attributes that are used when representing them, but that any representation is based on an interpretation and selection process (Mai 2001), it would require that we address 'why… we choose certain groups of similarity attributes over and above others' (Bryant 2000: 61). The truthfulness of any representation, therefore, depends on whom, where, and when the resource is represented; for instance,

A stone on a field contain different information for different people (or from one situation to another). It is not possible for information systems to map all the stone's possible information for every individual. Nor is any one mapping the 'true' mapping (Hjørland 1997: 111).

When people engage in representing and using resources, we must remember that 'we usually do it not as humans or as individuals, but as members of societies' (Zerubavel 1991: 76-77) and that we share understandings of the world with other members of that society or culture. We must, however, give up 'the idea that some ways of classifying are more correct or 'logical' than others' (Zerubavel 1991: 65) and we must design information systems that accommodate users from diverse societies, cultures, and understandings.

While this shift in understanding, from a naive theory of representation to an appreciation of the complexities of the world, has been accepted by many scholarly traditions, it seems that information science lags a bit behind. Scholarship and practice would be richer if it were accepted as a foundation that different people have different understandings of the same resource, and that the aim of information systems and services is to provide diverse and plural representations of resources. That would not only recognize the interpretive nature of resources, but it would also allow users more ways to access resources.

In a world where everyone (see Shirky 2008) participates in the interpretation and representation of resources, it is possible to bring forward more perspectives on resources, but at the same time the authority of professionals is being questioned and the notion of which representations and resources should be trusted shifts. If the production of representations is not solely an expert activity that professionals engage in, but an activity that everyone participates in (Shirky 2008) and has an opinion about, then a core challenge for designers and providers of information systems and services becomes not what the correct representation is (as was the case in years past) but how and when users will trust the information system and the representations.

This paper will explore three different approaches to representation (professional indexing, social tagging, and recommender systems) and suggest that each approach faces the common challenge of establishing trust with users. The paper argues that a key question today is to understand when and how to trust and use other people's tags, terms, and recommendations of resources.

Professionalism: indexing and matching

Systems for the retrieval of resources have always been based on the assumption that a user expresses a need or desire for some resources that meet certain criteria (author, title, subject) (for example, Cutter 1876) and that a match of some sort takes place between the representation of the resources and the user's expressed need (cf. Taylor 1968). It is well known that users generally find it difficult to choose, or guess, the terms that have been used to represent the resources that they might want. Furthermore, users often find it difficult to choose the words that best represent what they desire or need (e.g., Belkin et al. 1982; Ingwersen 1992).
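
To make the matching assumption concrete, the following minimal sketch (in Python, with invented resources, index terms, and queries; it does not describe any particular system) retrieves resources whose indexer-assigned terms overlap with the terms of a user's request. It illustrates why users must guess the indexer's vocabulary: a request phrased in different words for the same need matches nothing.

    # Minimal sketch of the matching assumption: resources are represented by
    # indexer-assigned terms, and retrieval reduces to overlap between those
    # terms and the terms of the user's expressed need. All data is invented.
    catalogue = {
        'doc1': {'information retrieval', 'indexing', 'thesauri'},
        'doc2': {'social tagging', 'folksonomy', 'metadata'},
        'doc3': {'recommender systems', 'collaborative filtering'},
    }

    def match(request_terms):
        """Return resources whose representations share at least one term with the request."""
        request = {t.lower() for t in request_terms}
        return [doc for doc, terms in catalogue.items() if terms & request]

    print(match(['indexing']))              # ['doc1'] - the user guessed the indexer's term
    print(match(['subject cataloguing']))   # []       - same need, different words, no match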

To counter that challenge and design systems that are user-friendly, the notion of user- and domain-centered indexing was promoted in an effort to represent resources with the users' words and from the users' world-view (Fidel 1994; Hjørland and Albrechtsen 1995; Mai 2005). These user- or domain-centered approaches could prove successful in environments with particular, stated, clear, objective, and specific purposes, where the users have particular, similar interests, beliefs, positions, knowledge, expertise, etc., and where these can be known, understood, and articulated by those in charge of the collection or service. However, in environments with large quantities of resources that are used by many, largely unknown people, people with varied interests, beliefs, positions, knowledge, expertise, etc., a user- or domain-centered approach might prove less successful.

At a more general level, we can think of indexing as the process of determining and representing the subject matter of a resource and indexing can formally be defined as: 'preparing a representation of the subject matter' and the indexer then 'describes its content by using one or several index terms, often selected from some sort of controlled vocabulary' (Lancaster 2003: 6). The indexing literature often distinguishes between approaches to indexing that are resource-centered and the aforementioned user- or domain-centered approaches. Resource-centered approaches argue that the indexer should determine the subject matter of resources independently of particular contexts and uses. They regard resources as containers of information and maintain that indexers' task is to extract the information and translate it into a set of terms. Indexers are therefore advised to focus on the 'entity and its faithful description' (Soergel 1985: 227) and 'stick to the text and the author's claims' (Lancaster 2003: 37). The resource-centered approaches have been criticized for de-contextualizing the indexing and for the assumption that resources are containers of information (Mai 2005).

The goal of professional indexing, regardless of which approach one might take, is to provide access to resources and ultimately to support users' activities. The various approaches differ in their expectations of users and in their assumptions about the information-seeking process. Resource-centered approaches assume that resources can and should be represented neutrally and objectively with the aim to produce exact representations of resources. Users are expected to learn the terminology and structure of the fields of study and the information system when searching for information. Conversely, the user-centered and domain-centered approaches assume that determination of the subject matter of resources depends on the context, domain, and use of the resources and argue that indexers need to place the resources in context when determining the subject matter.

The practice of professional indexing has been stuck conceptually on the notion that selected professionals are asked either to provide context-independent interpretations and representations of resources (in the resource-centered traditions) or to interpret and represent resources on behalf of a user community, using its language and world-view (in the user- and domain-centered traditions). Regardless of tradition and approach, indexers are given the almost impossible challenge of determining what future users might be interested in and which words they would use to seek resources of interest.

Users, on the other hand, are in an equally difficult situation. Users must attempt to describe the content of resources they desire and must, therefore, describe something that is unknown to them (Belkin et al. 1982). This has sometimes been described as a process of uncovering users' actual information need and expressing this in a compromised form as a request (Taylor 1968), and as discovering and developing topics of interest through interactions with the search mechanism (Kuhlthau 1993). Users furthermore have to express information needs as search requests using appropriate search terms and they are thereby forced to guess which terms indexers might have chosen for resources that contain relevant information.

Explorations of professional indexing have traditionally been limited by the fact that resource-centered approaches were based on the conceptually flawed notion that resources' subject matter could be determined independently of context. User- and domain-centered approaches were limited by the fact that users' interests, beliefs, knowledge, and expertise can only be known when serving a small population whose domain-specific language and knowledge claims can be mastered and understood by the indexer. In large, diverse collections, where the users' interests, knowledge, and expertise are vast and varied, any attempt to make meaningful interpretations and representations falls short of being objective or user-centered. These explorations have created several tensions in the indexing literature, mainly because 'our theoretical work has most often spent time prescribing a theoretical position that weights one 'correct' conceptualization' (Tennis 2009: 198).

While the matching approach to information representation and seeking has served well and is very appropriate in many environments and for many types of situations, it is limited by its authoritarian underpinnings; users need to figure out how the system has decided to interpret and represent resources of interest and then match those interpretations and representations. Professional indexing has, as its basis, the assumption that a particular group of individuals (indexers) have particular knowledge (about the resources, the users, or the domain) that enables them to interpret and represent resources on behalf of a user community. If one accepts the view that any particular resource can have multiple, perhaps even infinite, meanings across user communities, would it then not be unreasonable and limiting to trust one set of individuals to represent resources for everyone else?

Sharing: social tagging

Beginning in the 1990s with democratic indexing and spreading more widely since the mid-2000s, there has been a growing interest in letting users play a more active and direct role in the interpretation and representation of resources – by letting users assign index terms, or tags, to resources. The basic idea is that the representation should 'no longer [be] derived from document texts and expert understanding of terminology in a given subject area, but derived from responses of users' (Brown et al. 1996: 118).

Social tagging, also known as folksonomy, has emerged as a powerful way for people to provide context-dependent interpretations of and access to resources on the Web. Tagging systems have been implemented in many different kinds of websites and services: Delicious, for instance, allows users to tag websites, Flickr enables users to tag and share photos, CiteULike facilitates tagging of scholarly references, and LibraryThing enables tagging of books. It has been observed (Kipp and Campbell 2006) that people's tags often are rather idiosyncratic and less than helpful to others looking for information. Others celebrate this diversity of tags and find that by letting users assign tags to resources, a greater variety of interests and voices are being heard and that we collectively achieve a fuller and better understanding and representation of the resources (Shirky 2005).

Ding et al. (2009) note that the evolution of the social Web, also known as Web 2.0, has been predicated on the morphing of communication on the Web 'from one-way communication to human-to-human communication', where users are 'actively involved in online communication', and they find that 'one of the most popular ways of communicating via Web 2.0 is tagging', which they define as 'the act of adding keywords to online resources' (Ding et al. 2009: 2388). While professional indexing has been grounded and explored conceptually, with the aim of finding the one 'correct' conceptualization, as noted by Tennis (2009), social tagging has come about bottom-up, born out of features introduced on websites (Quintarelli 2005) and by users

faced with an ever-increasing flow of possibly useful information, [they] (i.e., non-information organization specialists) took matters in their own hands. By assigning 'tags' – identifiers and at times reminders of meaningful information – users unwittingly gave rise to a new information organization system (Wichowski 2009).

While it must be noted that people add tags to resources for many and varied reasons and that they are sometimes not engaged in 'indexing' when tagging (Nov & Ye, forthcoming), the result of tagging, the tags themselves, can be used to organize and retrieve resources. In this sense, we ought not to let the reasons and motivations of taggers direct what we do, or do not do, with the tags, though we should recognize the circumstances of the tags. Similarly, the aforementioned debates about the various approaches to professional indexing contained discussions of the indexer's role and motivations or lack thereof. The motivation for people to engage in tagging does not come from a satisfaction in seeing resources indexed and classified, but from being part of a collective whole and of conversations, and it has been argued that tagging systems become more appealing as they incorporate opportunities for users to interact with each other (Furnas et al. 2006).

Following the notion that we regard tagging as a form of indexing, Furner (2010: 1859) has more formally characterized tagging as,

a form of manual, ascriptive, natural language, democratic indexing, which is typically undertaken by resource creators and resource users who have low levels of indexing expertise, high levels of domain knowledge, and widely varying motivations, and which is commonly used to represent non- or quasi-subject-related properties, and frequently (but far from exclusively) applied to resources such as images that do not contain verbal text.

In this sense, tagging is a sort of indexing that shares characteristics with professional indexing and can be couched in the language, tools, and techniques used in the indexing literature and tradition. However, social tagging is vastly different from professional indexing on at least two accounts:

  1. The objective of professional indexing is to produce truthful and correct representations of resources (at least to some degree and in some contexts), whereas the purpose of tagging is to engage in conversations about the resources and to allow users to express themselves. The result of those conversations and expressions can be harvested and used to create information organization and retrieval systems. The sole purpose of professional indexing, by contrast, is to produce representations that ease information retrieval.
  2. Social tagging is done from the perspective of what a specific resource means and does for a particular user. The person tags the resource in reaction to questions such as: 'What does it mean to me?' or 'How or for what do I use it?' Professional indexers, on the other hand, address the question: 'What is it?' or, if one takes a domain-centered approach: 'What is it used for in this context?'

Social tagging systems let the users make sense of the resources, interpret them and decide on what they mean and how they are or can be used. Users play an active role in the interpretation and representation of the resources and as such the possibilities for a more diverse understanding and set of search terms are increased.
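
As a concrete illustration of how such user-assigned tags can nonetheless be harvested for organization and retrieval, the following sketch (in Python, with invented users, resources, and tags; not a description of any particular tagging service) aggregates individual tagging acts into a shared folksonomy index. It also makes visible the point taken up below: an expert's tag and a newcomer's personal tag carry exactly the same weight.

    # Minimal sketch of aggregating individual, context-dependent tagging acts
    # into a shared index. Users, resources, and tags are invented.
    from collections import defaultdict

    tagging_events = [
        ('alice', 'photo42', 'toronto'),
        ('bob',   'photo42', 'cn tower'),
        ('carol', 'photo42', 'honeymoon'),        # idiosyncratic, personal tag
        ('alice', 'paper17', 'classification'),
        ('dave',  'paper17', 'to-read'),          # task-related rather than subject-related
    ]

    # Folksonomy: tag -> {resource: number of users who applied that tag}
    folksonomy = defaultdict(lambda: defaultdict(int))
    for user, resource, tag in tagging_events:
        folksonomy[tag][resource] += 1

    def find(tag):
        """Resources carrying a tag, ranked by how many users applied it."""
        return sorted(folksonomy.get(tag, {}).items(), key=lambda kv: kv[1], reverse=True)

    print(find('toronto'))     # [('photo42', 1)]
    print(find('honeymoon'))   # a purely personal tag is retrievable by everyone, with equal weight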

Resources can be assigned tags by everyone, and anyone who does so remains anonymous; and any tag carries the same weight regardless of whether it is assigned by a leading expert on the subject matter or by a newcomer. It would therefore be reasonable to ask whether users can really trust tags: do the tags represent the resources' real content or do they merely represent a particular person's idiosyncratic view?

More metadata: recommending

While indexing and tagging systems can identify what a resource is, is about, or is for, users will in many search situations retrieve either too much or too little information. In these situations users can keep submitting keywords to the system and alter their search results to get closer to what they are looking for, or they can ask for a recommendation. We ask for recommendations in many situations in life,
most people rely on their friends for recommendations of restaurants, movies, music and other items or services. If I have a friend who recommended a movie to me that I like, and another friend who recommended a movie to me that I hated, I'm more likely to ask the first friend for advice next time I go to the movies because our taste seem to be similar (Farkas 2007: 127).

This realization has given birth to recommender systems that actively seek to suggest resources to users. As such, recommender systems can be thought of 'as a subset of the class of retrieval systems. Recommender systems are then those retrieval systems that effect retrieval specifically through analysis of the judgment of utility made by previous users' (Furner 2002: 748).

Recommender systems are fairly common today and are used in almost all Web-based information systems (except library catalogues; almost no library catalogue system includes recommendations). The most common and best-known recommender systems are probably Amazon.com's 'customers who bought this book also bought these books' and Netflix's movie recommender system, which gained some interest in the popular press due to the Netflix Prize competition. Recommendations are also built into, for example, Epinions, a site that allows people to review a wide range of products; CiteULike, a site that enables people to manage and share their scholarly references; Flickr, a site for organizing and sharing photos; LibraryThing, a site for cataloguing and sharing information about books; and thestar.com, the online version of the Toronto Star newspaper, as well as almost any other Web-based service.

However, unlike when we ask a friend for a recommendation, recommender systems often are not personalized. Everyone looking at a particular resource will receive the same recommendations; the recommendations are not tailored to the particular user, but to the particular resource and the particular service. The success of recommender systems is limited by their algorithms and their data; for some people in some situations the recommendations might seem perfectly fine, while for other people the recommendations might be of limited use and interest. There are serious limitations to the extent to which similar purchase or rating patterns can predict future preferences; one cannot assume that just because two people borrow, purchase, and like five books they will also agree on the sixth. Recommendation predictions are typically based on
(1) content, i.e., recommending items with content that is similar to the content of the items already consumed by the users; (2) social networks, i.e., providing items related to people who are related to the user, either by explicit familiarity connection (e.g., by being connected in a social network system), or by some kind of similarity (e.g., by using similar tags, consuming similar documents, or having similar tastes as expressed in item rating) (Guy et al. 2009: 53-54).

While the first approach (content-based) assumes that resources are clustered in sets of similar and related items, the second approach (social networks) assumes 'that 'similar' users share mutual interests' (Guy et al. 2009: 54). Similarity, or connectedness, between users can be determined along several lines, including friends, colleagues, interests and activities. While content-based recommendations can support users in the discovery of resources based on what other users use, social network based recommendations can in some ways be closer to the recommendations we get from friends.
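
A minimal sketch of these two prediction bases, using invented items, terms, and usage histories (it illustrates the logic only, not any deployed system), might look as follows: content-based recommendation suggests unseen items whose terms overlap with items the user has already used, while people-based recommendation suggests items used by people whose histories overlap with the user's.

    # Minimal sketch of the two prediction bases described above. All data is invented.
    item_terms = {
        'bookA': {'indexing', 'classification'},
        'bookB': {'classification', 'thesauri'},
        'bookC': {'recommender systems', 'filtering'},
    }
    user_items = {
        'u1': {'bookA'},
        'u2': {'bookA', 'bookB'},
        'u3': {'bookC'},
    }

    def content_based(user):
        """Recommend unseen items whose terms overlap with the user's items."""
        seen = user_items[user]
        seen_terms = set().union(*(item_terms[i] for i in seen))
        scored = [(i, len(item_terms[i] & seen_terms)) for i in item_terms if i not in seen]
        return [i for i, overlap in sorted(scored, key=lambda x: x[1], reverse=True) if overlap]

    def people_based(user):
        """Recommend items used by people whose histories overlap with the user's."""
        seen = user_items[user]
        similar = [u for u in user_items if u != user and user_items[u] & seen]
        suggested = set().union(*(user_items[u] for u in similar)) - seen if similar else set()
        return sorted(suggested)

    print(content_based('u1'))   # ['bookB'] - shares the term 'classification' with bookA
    print(people_based('u1'))    # ['bookB'] - u2, who also used bookA, used bookB

Note that in this sketch, as in the systems discussed above, everyone with the same history receives the same suggestions; nothing in the data indicates whether the overlap comes from a trusted friend or a stranger.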

In social tagging systems people respond to the questions: 'What does it mean to me?' or 'How or for what do I use it?'; in recommender systems, users add further metadata to resources through their activities (such as borrowing, buying, or looking at resources) or through ostensive ratings and comments. In all three situations a decision is made as to whether a particular resource 'is a member of [a] class of documents that are 'related', possibly in various way' (Furner 2002: 754), regardless of whether the class consists of index terms, tags, classification codes, or ratings. As such, one could argue that professional indexing, social tagging and recommending are all instances of the related 'recommending activity by which preference orderings are expressed in the form of judgments of utility, relevance, relatedness, or approval' (Furner 2002: 749). Recommender systems can further aid users in the discovery of resources by suggesting particularly relevant resources. Users can be actively engaged with each other in reviewing and rating the resources, thereby creating even more metadata and increasing the interpretations, understanding, and findability of resources.

While many people turn to their trusted friends for recommendations on a host of issues, it is not entirely clear why people would trust anonymous, generic recommendations made by information systems. Why trust a recommendation for a particular resource when it is not clear on what basis the recommendation is made and who is making the recommendation?

Challenge: trust and authority

When users interact with information systems they often do so with a purpose and with the hope of finding the best 'means to his end' (Wilson 1968: 21); users are not just interested in finding something, they want to find what is best for them in a particular situation. That is why we often turn to friends when looking for recommendations for movies, books, restaurants, etc.; we know their taste, what they have liked and recommended in the past, and we trust their recommendations. However, people might, or might not, trust the search results in a library catalogue or the recommendations that Amazon.com offers. We therefore need to explore when and how people trust other people's interpretations and representations of resources.

Wilson (1983: 13) suggests that 'some people know what they are talking about, others do not' and those who know what they are talking about 'are my cognitive authorities'. When deciding which movie to see, we might ask our friends for recommendations, but there are certain friends whom we regard as knowledgeable about movies or with whom we share a taste in movies; those are the friends we listen to. These are our cognitive authorities; we let them influence our thinking. There are 'others who are not cognitive authorities [who] may also influence me. The difference between them and the cognitive authorities is that I recognize the latter's influence as proper and the former's as not proper' (Wilson 1983: 14). When users use an information system, that system might influence their decisions to obtain certain resources based on how the resources have been represented and what is said about them. The challenge for the user is to distinguish between proper cognitive authorities and those that are not.

Professional indexers hold a certain authority through their positions; they have been selected by some body to interpret resources and create representations. In that sense they hold an administrative authority, which is 'a recognized right to command others, within certain prescribed limits' (Wilson 1983: 14), and while users might recognize this right (whether they are aware of the fact or not), they might not necessarily recognize professional indexers as cognitive authorities. Likewise, while professional indexers would be considered experts by most accounts, in the sense that they have specialized training, are employed by respectable organizations, possess specific knowledge, etc., we need to remember that being an expert is different from being a cognitive authority, for 'one can be an expert even though no one else realizes or recognize that one is' (Wilson 1983: 13). When we turn to particular people for information and advice, we turn to those we recognize as knowledgeable and trustworthy.

While professional indexers historically have had the sole right to interpret and represent resources, the rise of social tagging has enabled users to participate in the interpretation and representation. This has opened up the possibility of exploring the ''spectrum of connotation', based on the range of possible meanings available in society at a particular moment' (Rafferty & Hidderley 2007: 406) and has thereby challenged professional indexers' authority and right to interpretations and representations. The authority of social tagging comes 'from the agreement of its users: its warrant comes from the constructive interpretation of its users' (Rafferty & Hidderley 2007: 406). In other words, when using a social tagging system to find resources, users need to trust fellow users' interpretations and representations, because 'sharing information requires that users … trust one another' (van House 2002: 100). When we do not know people personally, we assess their trustworthiness by asking about their occupational and educational credentials, reputation, and so forth (Wilson 1983). While such an approach and requirement might work in professional settings, in social media settings the assessment of trustworthiness might be oriented more towards 'shared orientations and values' (van House 2002: 104).

In addition to explicit representations of resources achieved through professional indexing or social tagging, users can also find resources through recommendations, which can be seen as another avenue for adding more metadata to resources. Many recommender systems aggregate users' activities or ratings to form recommendations that are not personalized, in the sense that everyone receives the same recommendations based on the similarity between their activities or ratings and everyone else's. However, 'in real life people mostly seek advice from people they know' (Guy et al. 2009: 53), and by harvesting information from social network sites it would be possible to show recommendations made by friends, colleagues and other cognitive authorities.

Users are more likely to turn to information systems they trust, and common to the three traditions is a need to increase users' trust. Any system that retrieves and recommends resources seeks to influence users' thinking and, as such, asks that users regard it as a cognitive authority. Laypersons, especially, might have difficulty judging the relative authority of a system and would require transparency about the system and its functions to be able to decide whether to trust it.

Conclusions

Historically, professional indexers, and other information specialists, were given the authority to interpret and represent resources on behalf of users. While this authority technically was solely an administrative authority, it could be taken as a cognitive authority as well. With the rise of social media it has become necessary to explicitly distinguish between the two authorities, and information professionals are only now being asked to establish themselves as cognitive authorities. While some have argued that 'librarians already have a reputation as authoritative, not authoritarian' (Lankes 2008: 679), most work by librarians and other information professionals is not transparent and open for augmentation by users.

As information science accepts diversity, embraces a pluralistic approach, and starts 'recognizing the inherently open potential of essence, it avoids freezing entities in any one mental context by assigning them fixed meanings' (Zerubavel 1991: 121). Instead of starting from the assumption that the goal of information systems is to produce correct, context-free representations, a more fruitful approach would be to approach the challenge with a flexible mind and recognize that any resource will mean different things to different people in different contexts. There is no one true representation.

When information systems and services are designed and presented not as giving access to resources through true and correct representations, but instead as giving access through possible interpretations and representations, the burden shifts from one of being neutral and objective to one of being fair and accommodating. Information systems that provide possibilities for multiple representations created through a variety of mechanisms, including professional indexing, social tagging, and recommendation, are more likely to be thought of as fair and accommodating. But the real challenge is to establish trust with users. Even if an information system is perceived as fair and accommodating, it does not follow that users will trust the system. In order for users to trust other people's tags, terms, and recommendations, they need to understand how, by whom, and on what basis the representations are created. Transparency is required. An information science that accepts pluralistic interpretations and multiple correct meanings of resources faces the challenge of figuring out what it means for information systems to be transparent and how to facilitate transparent representations.

The information systems of tomorrow look significantly different from those of yesteryear in terms of their focus on interactivity, collaboration, conversation, interpretation, and plurality. The various traditions within information science, therefore, need to unite in an effort to establish transparency and openness as key components of information systems.

About the author

Jens-Erik Mai is Associate Professor in the Faculty of Information, University of Toronto, Canada. He received his bachelor's and master's degrees from the Royal School of Library and Information Science, Denmark and his PhD from the University of Texas at Austin. He can be contacted at: [email protected]

How to cite this paper

Mai, J-E. (2010). "Trusting tags, terms, and recommendations" Information Research, 15(3) colis705. [Available at http://InformationR.net/ir/15-3/colis7/colis705.html]
© the author, 2010.