16 Dec 2024 | Mattia Fosci

The ID bottleneck: This is why publisher first-party data isn’t reaching its potential

Opinion

First-party data is still marginal in the programmatic ecosystem due to four key perceived problems. But the real problem is the industry’s addiction to ID.

Back when the whole world thought Google was going to kill the third-party cookie, open web publishers reacted with a mix of worry and excitement. Those most reliant on open-market buying were expecting a bloodbath.

But premium publishers saw an opportunity to leverage their first-party data in a supply-constrained environment. The more tech-savvy ones openly showed enthusiasm at the prospect of ridding the market of the data leakage that had undermined their inventory and intermediated their relationship with advertisers.

Rewind to 2022: monetisation solutions for publishers’ first-party data were all the rage, and the IAB Tech Lab even published a standard, Seller-Defined Audiences (SDAs), to push these publisher signals into the mainstream.

Despite some progress, and even Google dabbling with publisher-provided signals, first-party data is still marginal in the programmatic ecosystem.

4 alleged problems

There are some good reasons why publisher audiences have not imposed themselves as the default solution.

The most obvious problem is that publishers’ first-party data doesn’t solve measurement, so it can hardly be used for performance campaigns. A person seen on the publisher’s site appears as a different, unconnected user on the advertiser’s site, so a conversion cannot be tied back to the impression that preceded it.

A second problem is that not all first-party data is equally valuable. Vertical publishers often have more valuable data (ie linked to segments with well-defined purchasing behaviours) but more limited scale, while the opposite is true for generalist publishers.

A third problem is that it’s difficult to standardise first-party audiences: one publisher’s definition of “in-market fashionistas” might be very different from another’s, so aggregating these signals can lead to confusion and open the door to mislabelling and mis-selling.

The fourth and most often cited problem is that publishers’ first-party data “doesn’t scale”, even when it is aggregated across publishers, eg through curation. Buyers would have to cut a thousand deals with all the publishers in the land, and who has time for that?

The real problem: Addiction to IDs

While there is some merit to these criticisms, they are entirely solvable with fairly minor interventions.

First, the measurement problem is the most difficult to solve, but it can be solved. Browser APIs, cohort-based solutions like Anonymised, and clean rooms are all capable (to different degrees) of connecting publishers’ and advertisers’ data in a post-cookie internet.

The value question is fair, but overblown. A generalist news publisher’s data might not command the highest CPM on the market, but the quality of the inventory and the scale of the audience more than make up for a less valuable segment.

Moreover, value would remain where the data is produced, instead of leaking to adtech and being used to spend on cheap made-for-advertising (MFA) sites.

The standardisation question is also easily solved by establishing a standard and adopting independent technologies that use a single methodology and whose only incentive is to perform well (not to inflate or deflate CPMs).

Finally, the accusation that first-party data does not scale is not just wrong but nonsensical.

Publishers can view and monetise close to 100% of their audiences, so they naturally have more scale than third-party-data-based audiences. The reason these audiences do not scale has nothing to do with the sellers and everything to do with the buyers’ platforms.

ID check at the door

Advertisers are led to believe that publishers’ first-party data does not scale because their buying platforms are unable or unwilling to buy it.

Supply-side platforms (SSPs) and demand-side platforms (DSPs) have a strong bias towards ID-based traffic. Not only does the bid request sent by the publisher have to contain an ID, but it generally has to be an ID that has been previously seen — and therefore recognised — by the SSP and DSP.

Publishers’ first-party data often fails to meet this requirement. If it is sent as an SDA, it contains no ID at all. If it is sent as a first-party cookie or an equivalent first-party ID, the SSP and DSP will struggle to recognise the user unless that first-party cookie has previously been associated with a third-party cookie.
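To make the difference concrete, here is a rough Python sketch of the two kinds of bid request. The field layout follows common OpenRTB conventions as I understand them (audience segments under `user.data`, extended IDs under `user.ext.eids`), but every value, including the domain, the `segtax` number and the segment name, is an invented placeholder, and the helper functions are hypothetical rather than part of any platform’s code.

```python
# Illustrative OpenRTB-style payloads. All values are made up.

# Case 1: a seller-defined audience. The segment describes a cohort and
# the user object carries no identifier of any kind.
sda_request = {
    "id": "req-1",
    "site": {"domain": "publisher.example"},
    "user": {
        "data": [
            {
                "name": "publisher.example",
                "ext": {"segtax": 4},  # taxonomy reference (assumed value)
                "segment": [{"id": "in-market-fashion"}],  # placeholder ID
            }
        ]
    },
}

# Case 2: a first-party ID. The request does carry an identifier, but one
# scoped to the publisher's own domain.
first_party_request = {
    "id": "req-2",
    "site": {"domain": "publisher.example"},
    "user": {
        "ext": {
            "eids": [
                {"source": "publisher.example", "uids": [{"id": "fp-abc123"}]}
            ]
        }
    },
}


def user_ids(request: dict) -> list[tuple[str, str]]:
    """Collect every (source, id) pair found in the request's user object."""
    user = request.get("user", {})
    found = []
    for eid in user.get("ext", {}).get("eids", []):
        for uid in eid.get("uids", []):
            found.append((eid["source"], uid["id"]))
    return found


def audience_segments(request: dict) -> list[str]:
    """Collect the segment IDs the publisher attached to the request."""
    user = request.get("user", {})
    return [
        segment["id"]
        for data in user.get("data", [])
        for segment in data.get("segment", [])
    ]


print(user_ids(sda_request), audience_segments(sda_request))
print(user_ids(first_party_request), audience_segments(first_party_request))
```

The SDA request yields an audience but no ID, while the first-party request yields an ID that means nothing outside the publisher’s domain. A platform that filters on previously seen IDs has little to work with in either case.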

This creates two scenarios: either the first-party data is not recognised and the bid request is not sent, or the first-party data is combined with a third-party cookie, the bid request is sent to the DSP but the impression is attributed to the third-party cookie.

To add insult to injury, in the second case, the publisher’s first-party data is leaked to adtech. The seller-defined audience of “in-market fashionistas” would be combined with the third-party cookie and the new interest stored in the platform’s database, for that user to be targeted anywhere on the web.
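The two scenarios, and the leak that comes with the second, can be sketched in a few lines of Python. This is a deliberately simplified model of the behaviour described above, not any platform’s actual logic: the `route` function, the `cookie_sync` match table and the `platform_profiles` store are all hypothetical names.

```python
def route(first_party_id, segment, cookie_sync, platform_profiles):
    """Model what happens to an impression carrying publisher data.

    cookie_sync: hypothetical match table, first-party ID -> third-party cookie.
    platform_profiles: hypothetical platform store, third-party cookie -> interests.
    """
    third_party_id = cookie_sync.get(first_party_id)

    # Scenario 1: the platform does not recognise the user, so no bid
    # request goes out and the publisher's data is never bought.
    if third_party_id is None:
        return {"sent": False, "attributed_to": None}

    # Scenario 2: the request goes out, but the impression is credited to
    # the third-party cookie. The publisher's segment is also stored against
    # that cookie, so it can be reused anywhere on the web.
    platform_profiles.setdefault(third_party_id, set()).add(segment)
    return {"sent": True, "attributed_to": third_party_id}


cookie_sync = {"fp-abc123": "3p-789"}
platform_profiles = {}

unrecognised = route("fp-unknown", "in-market-fashion", cookie_sync, platform_profiles)
recognised = route("fp-abc123", "in-market-fashion", cookie_sync, platform_profiles)

print(unrecognised)
print(recognised)
print(platform_profiles)
```

In this toy model the publisher loses either way: the unrecognised user earns nothing, and the recognised one hands the “in-market fashionistas” label to the platform’s own profile for that cookie.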

Removing the ID bottleneck

Repeated cases of bots, fraud and MFAs (yes, it’s still a massive problem) are pushing advertisers away from mass-scale purchase of low-grade inventory at insanely cheap CPMs.

Premium inventory is extremely valuable and premium publishers’ audiences are both performant and scalable. In fact, the closer you keep the data to where it is generated (the publisher), the less contamination it has from third-party probabilistic data and the higher its quality.

All the problems affecting publishers’ first-party data are solvable or have already been solved. But to unleash the promise of first-party data, we need to challenge the assumption that a trillion-dollar industry can only run on the outdated, fast-declining, regulation-skirting currency of interoperable user IDs.

Neither advertisers nor publishers are fond of using IDs that leak their data to third parties, and neither group is seeing a positive return on investment from the technology. Both can ask their platforms to add ID-less buying and measurement to their roster of options. They will reap great benefits from it.

Mattia Fosci is CEO at Anonymised
