NEW YORK, April 4, 2012 /PRNewswire/ -- While nearly all participants in a Council for Research Excellence study of digital publishing data-collection practices engaged in some form of user-data collection, there is no common "data owner" – a department, individual or function – charged within these organizations with controlling the publisher's user data. In some cases, these functions are dispersed among different departments, individuals and/or data components.
The lack of common data ownership is seen as one of many reasons only a few digital publishers have developed techniques for utilizing their data for audience measurement, advertising targeting, content refinement or other meaningful purposes.
The study, intended to advance digital audience measurement, was conducted by Ernst & Young LLP under the direction of the Council for Research Excellence (CRE), a diverse group of senior-level research professionals from throughout the media and advertising industries dedicated to advancing the knowledge and practice of audience measurement methodology.
The objective of the study, undertaken during fourth quarter 2011, was to examine how various digital publishers capture and maintain user data and to understand how these data can supplement existing research-panel data. The effort to assess current data-collection practices, as well as best practices, is designed to strengthen "hybrid" (panel-based/server-based) digital audience measurement. The CRE seeks to foster a better understanding of how commonly collected publisher/user data can augment existing panel data so as to bring greater confidence to publishers' data-centric activities.
Among key findings of the study:
- Data conflicts can and do occur, though very few publishers have resolution policies;
- The majority of publishers have minimal or no formal quality and validation practices for the handling of user data;
- Few digital publishers appear to provide user information externally, and few – even though they typically collect user data via third parties such as social media – provide first-party collected data to third parties, making third-party data uni-directional;
- Some publishers anticipate future potential uses of data such as geo-targeting or behavioral targeting – but there is no clear common expectation on how user data may be utilized near-term; and
- Publishers feel the potential to leverage user data is inhibited by a lack of interest or sophistication on the "buy-side" – advertisers and agencies – and are reluctant to develop their processes until they have a better sense of what the buy-side wants.
- Very few participants require users to register and provide declared data – though nearly all employ optional user registration;
- The amount and type of data requested of users vary greatly, with few universal data elements; email address and zip code are more commonly used;
- Digital publishers largely conclude there is a need to provide the user with a reason or some value to incent the user to provide information; and
- Certain data elements can have multiple definitions. "Geography," for example, can be defined as a coffee shop from which a user is posting to Facebook via a mobile device – or an IP address associated with a user's desktop device. These definitions can impact an advertiser's interest in the user.
- Enact data edits or evaluations at the time of collection to determine if the response is valid – such as validating a zip code based on reference to a USPS database;
- Review declared data for illogical or suspect responses – commonly bogus birth dates such as "January 1," zip codes such as "90210" and telephone numbers such as "867-5309";
- Enable users to review their collected data so they can update, correct or remove information in their profile;
- Establish a data "time to live" ("TTL") policy that takes into consideration differing data types, association (first or third party sources) and derivation (declared or inferred data), setting a point at which such data must either be refreshed or discarded; and
- Centralize the oversight of data collection, quality and use across an organization to serve multiple disciplines such as research and CRM.
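As a rough illustration of three of the practices listed above (edit checks at the time of collection, review for suspect responses, and a "time to live" policy), the following Python sketch shows one way such checks might be expressed. It is not drawn from the study. The `KNOWN_ZIPS` set is a hypothetical stand-in for a USPS reference lookup, the TTL values are assumed for illustration because the study states no durations, and all function and field names are invented here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
import re

# Hypothetical stand-in for a USPS reference lookup; a real system would
# query an authoritative zip code database instead of this tiny set.
KNOWN_ZIPS = {"10001", "60601", "90210"}

# Suspect values of the kind the study mentions; illustrative, not exhaustive.
SUSPECT_ZIPS = {"90210"}
SUSPECT_PHONE_DIGITS = {"8675309"}

# Assumed TTLs in days, keyed by (source, derivation). The study gives no numbers.
TTL_DAYS = {
    ("first_party", "declared"): 365,
    ("first_party", "inferred"): 90,
    ("third_party", "declared"): 180,
    ("third_party", "inferred"): 30,
}


@dataclass
class DataPoint:
    name: str          # e.g. "zip", "birth_date", "phone"
    value: str
    source: str        # "first_party" or "third_party"
    derivation: str    # "declared" or "inferred"
    collected_on: date


def is_valid_zip(value: str) -> bool:
    """Edit check at collection time: five digits and present in the reference set."""
    return bool(re.fullmatch(r"\d{5}", value)) and value in KNOWN_ZIPS


def is_suspect(point: DataPoint) -> bool:
    """Flag declared values that are well-formed but commonly bogus."""
    if point.name == "zip":
        return point.value in SUSPECT_ZIPS
    if point.name == "birth_date":
        # Assumes an ISO date string such as "1980-01-01".
        return point.value.endswith("-01-01")
    if point.name == "phone":
        digits = re.sub(r"\D", "", point.value)
        # Compare the last seven digits so an area code does not hide the match.
        return digits[-7:] in SUSPECT_PHONE_DIGITS
    return False


def is_expired(point: DataPoint, today: date) -> bool:
    """True when the data point has outlived its TTL and is due for refresh or discard."""
    ttl = timedelta(days=TTL_DAYS[(point.source, point.derivation)])
    return today - point.collected_on > ttl


# Usage: a value can pass the validity check and still be flagged for review.
zip_point = DataPoint("zip", "90210", "first_party", "declared", date(2012, 1, 15))
print(is_valid_zip(zip_point.value), is_suspect(zip_point))
print(is_expired(zip_point, date(2012, 4, 4)))
```

Validity and suspicion are kept as separate checks in this sketch because a value such as "90210" is a real zip code; flagging it for review rather than rejecting it outright is one possible design choice, not something the study prescribes.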