
Privacy and collaboration

Open collaboration is one of America's greatest strengths. But its realization on the internet has felt constraining and icky. This post traces a pathway, through AI, to a more open and collaborative internet.

4/28/2024

The internet has a control problem.

Our data, powering personalized experiences and ads on the open internet, flows to anyone who can pay for it or get invited by an exclusive co-op to share in it.

We’ve slouched toward accepting this is how the world works. It feels icky. Some, like privacy professor Daniel J. Solove, think it's hopeless: "managing one’s privacy is a vast, complex, and never-ending project that does not scale." The result is an open internet trading our personal data for usage in contexts we’ve never discussed.

Early methods of control have felt frustrating and left us with cookie-consent fatigue. The natural response, it seems, is to call for more privacy, whether by law or by self-preferencing corporate policy.

We believe this is a mistake.

The trouble is that privacy lacks a consistent definition. Ask different people what it is and you'll get different answers. Privacy is, however, a specific expression of control. The real issue isn't what's known about us, by whom, or for what purpose, per se.

The issue is who gets to decide.

This blog explores how privacy, control, and open collaboration intersect in the digital age, and how language models could provide a path forward for enabling controllable collaboration while respecting privacy preferences.

Open collaboration

Control is hard if you want open collaboration.

And we want open collaboration. It has a rich history in America. It's one of our greatest traits. Alexis de Tocqueville admired America's collaborative spirit in Democracy in America.

An association unites into one channel the efforts of divergent minds and urges them vigorously towards the one end which it clearly points out.
The most natural privilege of man, next to the right of acting for himself, is that of combining his exertions with those of his fellow creatures and of acting in common with them. The right of association therefore appears to me almost as inalienable in its nature as the right of personal liberty.

To adopt a notion of collaboration in a technical setting, we look to behavioral science, which takes collaboration as

a process through which parties who see different aspects of a problem can constructively explore their differences and search for solutions that go beyond their own limited vision of what is possible.

This definition maps nicely onto the mechanics of digital collaboration. We pose our own definition:

Digital Collaboration happens between two or more Data Controller agents who

  • see and share different information sets, instructions, processes or algorithms
  • all toward solutions superior to those they could reach independently.
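As a minimal sketch of this definition, two Data Controller agents each hold a different information set, decide which subset to reveal, and combine what they share into a view neither had alone. All class and field names here are illustrative, not any real API:

```python
from dataclasses import dataclass, field

@dataclass
class DataController:
    """An agent that holds information and decides what to reveal."""
    name: str
    information: set = field(default_factory=set)

    def share(self, allowed: set) -> set:
        # Each controller reveals only the subset it permits.
        return self.information & allowed

def collaborate(a: DataController, b: DataController,
                a_allows: set, b_allows: set) -> set:
    # The combined view exceeds either party's independent view.
    return a.share(a_allows) | b.share(b_allows)

florist = DataController("florist", {"partner_birthday", "favorite_flower"})
shopper = DataController("shopper", {"budget", "anniversary_date"})

combined = collaborate(florist, shopper,
                       a_allows={"favorite_flower"},
                       b_allows={"budget"})
print(sorted(combined))  # ['budget', 'favorite_flower']
```

The point of the sketch is the asymmetry: each party sees a different slice of the problem, and the shared set is strictly richer than what either would resolve independently.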

We introduce GDPR Data Controller language because collaboration involves sharing data and context amongst different people or entities. Each of these entities has the ability to take this information and do what they like with it. The legal definition of Data Controller captures and formalizes this capability.

Open digital collaboration, however, is one of the hardest and most important problems on the internet today. How do we make the internet richer and more abundant through collaboration while still enabling effortless agency and control for the consumers and businesses who wish to collaborate?

We saw the depth of this challenge last week when Google delayed its cookie deprecation for a third time.

Third party cookies and related fingerprinting technologies are how the open internet does frictionless targeted ads and measurement, but it comes at the cost of consumer control. In this case, advertisers, publishers, and data management platforms (and indirectly consumers!) are data controllers who each have a different view of the consumer, context, and goal. They Collaborate toward more relevant and engaging ads and experiences.

Google’s Privacy Sandbox initiative increased privacy but came at the apparent cost of open collaboration.

Our sense, however, is that it’s not even about privacy.

Privacy is about control

Privacy is a nebulous concept.

Many in law, philosophy, and culture have attempted and failed to find a satisfying conception of privacy. It’s been over 50 years since Arthur Miller complained that privacy is “difficult to define because it is exasperatingly vague and evanescent,” and even today’s privacy law experts concede there’s no unified conception of privacy.

America’s first articulation of a right to privacy –– Warren and Brandeis’ 1890 piece in the Harvard Law Review –– still reads true, if a little dramatic, today:

Gossip is no longer the resource of the idle and of the vicious, but has become a trade, which is pursued with industry as well as effrontery.

What would Warren and Brandeis think of the $10+ billion data broker industry or the $300+ billion ads business trading on estimations of who you probably are or what you probably like?

Nearly 60 years later, Harry Kalven Jr.’s critique of the complexity and confusion of privacy as a concept still has teeth:

Conceptually I am not sure I see what the invasion of privacy consists of. Is it an invasion of privacy to say falsely of a man that he is a thief? What is his grievance? Is it that … it would have been an invasion if true and no one had known it? Or is it that it is false? Or is it that his name has been used in a public utterance without his consent?

Even contemporary privacy offerings are hard to understand. Technologists clamor for "local-only AI" even as favorite privacy-first devices like the iPhone deliver content through the cloud. Companies advertise protections like differential privacy but reportedly choose privacy parameters looser than those research communities consider acceptable. Either way, it's unclear whether consumers would even understand what that means. Despite a century of writing on privacy, no one seems to know exactly what it means nor what it entails when promised.

Our view is that privacy is an instance of control. You can have privacy but not control (Privacy Sandbox) and control but not privacy (deciding to attend Santa Con or gay pride).

As we build new technologies enhancing consumer agency or privacy, it’s critical we know what specifically the tech is responsible for.

Economics’ view of privacy is more tractable.

Tradeoffs of sharing

Economists have been interested in privacy since the 1960s, specifically in the trade-offs arising from protecting or sharing personal data. We wrote about one of those trade-offs last week.

Analyzed as an economic good, privacy reveals interesting and sometimes unusual properties, e.g.,

  • Privacy is a "public good": someone else failing to protect their privacy could end up affecting yours!
  • Privacy has a confusing mix of tangible properties (like getting coupons from sharing data) and intangible properties (“that felt icky”)
  • Privacy lacks a clear valuation: is it worth
    • the minimum amount of money you'd accept to give it up
    • how much you might pay to protect it
    • the personal cost of it being exposed to the public
    • the marginal expected profit someone else gets from acquiring it
    • how much positive or negative publicity you get from sharing

As we think through privacy at Crosshatch, and the mechanisms we might need to secure it (if we characterize it as an act of protecting privacy at all!), we find ourselves preferring the economic framing of the concept for its simplicity and tractability. This framing also clearly motivates a user-centric view of privacy.

Privacy is given by tradeoffs defined by users.

This centers privacy really as a question of agency – of sharing (and un-sharing).

But what does privacy have to do with collaboration or AI?

Controllable collaboration

The core question for the age of AI and abundance where

Data Controller collaborators of different known contexts can come together to exchange information for the production of greater value than they could alone

is: what are the necessary conditions to Collaborate?

On the other hand, necessary conditions could be too strict. What's necessary could depend on consumer preferences. Instead, we may wish to pose this in an economic setting: what information is inexpensive enough to be revealed in a given collaboration setting?

For instance, in real life we readily share more than what's strictly necessary for a collaboration. Our neighborhood florist knows our partner's birthday. The coffee shop knows we like cold brew in the morning but cortados as a treat in the late afternoon. In real life we don't constrain engagements to the minimum necessary information but informally bound it according to the context of the engagement.

So the open question is what data beyond that which is necessary is sufficiently low cost to share for a given collaboration paradigm?

Language models offer a unique path to begin to explore this.

AI as a trusted facilitator

Before language models, the technical requirements for digital collaboration were much greater and more fragmented. Business collaborators needed to collect troves of consumer data, combining it in their own compute environments to create the models and systems they used to compete to serve us.

Language models, on the other hand, offer a route to resolving this fragmentation. They have a more forgiving interface than any prior machine learning system and can execute programs written in natural language. Their broad democratization – available to anyone, not just organizations that have collected troves of our data – weakens the power dynamics that motivated massive data collection in the first place. Data gravity is shifting to the consumer anyway.

It seems then that in the age of AI open collaboration under Collaborator control could simply mean

the well-structured and secure combination of collaborator instructions and private information “context” for a given AI resource with rules for what Collaborators can see at the conclusion of a given Collaboration.

To collaborate with language models as a trusted intermediary, all collaborators need to do is define what context could be available for what purpose.

With the Crosshatch model, apps show consumers the potential benefits of allowing some data collaboration. Consumers can then decide exactly what information they want to share and for what purpose (or defer to an authorized agent). Crosshatch acts as an intermediary, facilitating these collaborations through secure language models while enforcing the consumer's chosen sharing preferences.
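A minimal sketch of the enforcement step described above: the consumer declares which pieces of context may be used for which purpose, and the intermediary filters context before any of it reaches a collaboration. The dictionary shapes and function name are hypothetical illustrations, not Crosshatch's actual API:

```python
# Context the consumer holds; values are illustrative.
consumer_context = {
    "coffee_order": "cold brew in the morning, cortado in the afternoon",
    "partner_birthday": "June 12",
    "home_address": "redacted",
}

# Purpose-scoped sharing preferences chosen by the consumer.
sharing_preferences = {
    "drink_recommendation": {"coffee_order"},
    "gift_suggestion": {"partner_birthday"},
}

def context_for_purpose(purpose: str) -> dict:
    """Return only the context the consumer has allowed for this purpose."""
    allowed = sharing_preferences.get(purpose, set())
    return {k: v for k, v in consumer_context.items() if k in allowed}

# The intermediary passes only this filtered context to the language model;
# an undeclared purpose receives nothing.
print(context_for_purpose("drink_recommendation"))
```

Notice that the default is empty: a purpose the consumer never approved gets no context at all, which is the "enforcing the consumer's chosen sharing preferences" half of the model.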

We’re particularly excited about this path because it’s not even a new construct.

This collaborative model powered by language models is reminiscent of the "clean room" approach used in ad tech for privacy-preserving data sharing. Our proposed framework, however, extends the concept by creating a consumer interface and making controlled collaboration effortless to configure. We then focus on

  • How easily can agents enter into Collaborations?
  • Do collaborators need to change how they work to Collaborate? What about to further minimize data shared as a part of a Collaboration?

The future of open collaboration on the internet will be defined by the rigor of our framing of collaboration –– what actually is it and who really are the Collaborators who have a rightful seat at the table –– and the effortless control Collaborators have to enable rich collaborations.

The AI-enabled collaboration model is a promising path forward. Taking secured AI as trusted compute environments for the execution of collaborator-aligned instruction-based collaborations is a simple way to begin to enable flexible open collaboration under collaborator control.

Taking privacy as an economic object shows privacy not as some exasperatingly vague thing but, counter-intuitively, as one that enables more sharing and collaboration the more inexpensively controllable it is.

And with language models, a simpler 18th-century posing of privacy

It is certain every man has a right to keep his own sentiments, if he pleases. He has certainly a right to judge whether he will make them public, or commit them only to the sight of his friends.

has a real chance of coming to fruition. And, if we can be forgiven for being so excited by Tocqueville, it can help grow our distinctive American richness:

The most democratic country on the face of the earth is that in which men have, in our times, carried to the highest perfection the art of pursuing in common the object of their common desires and have applied this new science to the greatest number of purposes.

See what Crosshatch can do for your business.
