Data gravity

AI moves to where the data live. Consumers have the most data. You are the sun.


Last week Brad Gerstner and Bill Gurley discussed language model switching costs. They and others have observed that while many enterprises start prototyping with private language models, many are switching to open source models run in their own systems for security and control.

While switching costs do not generally furnish user surplus, they are important ingredients to shareholder value. Bill Gurley rated LLMs as having a 2 out of 100 score for switching costs.

Brad later says that switching costs will not live with the LM but rather “where your data resides.”

“It seems to me that AI is finding its way back to the data” as a sort of “data gravity.”

Data Gravity is an old and simple idea that, to our knowledge, was introduced in 2010 by technologist and author Dave McCrory.

Data Gravity. Dave McCrory. 2010

Consider Data as if it were a Planet or other object with sufficient mass.  As Data accumulates (builds mass) there is a greater likelihood that additional Services and Applications will be attracted to this data. This is the same effect Gravity has on objects around a planet.  As the mass or density increases, so does the strength of gravitational pull.

Services and Applications can have their own Gravity, but Data is the most massive and dense, therefore it has the most gravity.  Data if large enough can be virtually impossible to move.

Dave saw latency as a driving force of this gravitational pull, but increasingly it's appearing that security and control are similarly forceful.

Today, enterprises are choosing LMs that live close to where their largest or most important set of data lives. For internal enterprise applications, that seems to be the cloud or bare metal where organizations host their data.

But for consumer use-cases, things become less clear.

In consumer, organizations use first-party customer data to try to anticipate customer needs and serve them more relevant content on their properties or others’.

But as we’ve written before, first party Walled Garden data is unlikely to be enough for AI. Even for organizations with lots of first party data, organizations only know what customers have done with them. Organizations only know what we do together.

Users have busy lives outside of what they do with any given organization.

Could the user actually have the data gravity?

Center of the universe

Traditionally, organizations have built personalization systems by centralizing user data and training machine learning models over all their users’ data. Think of Netflix’s famous collaborative filtering model. They built it by combining user trait information with past movie ratings and used it to estimate what movies you might like based on what people like you liked. This helped organizations like Netflix get a better understanding of their users, and sometimes outcompete organizations with weaker user understanding.
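To make "what people like you liked" concrete, here is a minimal sketch of user-based collaborative filtering. It is an illustration only, not Netflix's actual model: the ratings, user names, and the `cosine` and `recommend` helpers are all hypothetical, and it uses past ratings alone rather than user traits.

```python
from math import sqrt

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}
ratings = {
    "ana": {"Alien": 5, "Heat": 4, "Up": 1},
    "ben": {"Alien": 5, "Heat": 5, "Blade Runner": 5},
    "cy": {"Up": 5, "Coco": 5, "Alien": 1},
}


def cosine(a, b):
    """Cosine similarity computed over the movies both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank movies the user has not seen by similarity-weighted ratings."""
    mine = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        for movie, rating in theirs.items():
            if movie not in mine:
                # Ratings from more similar users count for more.
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=lambda m: scores[m], reverse=True)


top_for_ana = recommend("ana", ratings)
```

Note that the organization needs every user's ratings in one place to compute this, which is the centralization the paragraph above describes.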

This pattern powered a rise in switching costs that investors loved. Consumers became accustomed to organizations that could serve them better (e.g., TikTok knows what we like) and became resistant to switching to organizations with weaker consumer understanding.

But if a consumer could pull all her data together – all her data across contexts – then she’d have more data on herself than any large organization has.

AI would gravitate to her.

And with AI situated next to her data, she could reverse the client-server model of the internet.

Reverse client-server

In the traditional client-server model, the server ("APP") hosts the application logic, data and resources, processing requests from clients and sending back results. Clients ("User") interact with the app to access these resources.

Left: Traditional client-server model. Right: Reverse client-server "Headless Personalization".

The inverted client-server model reverses these roles. The user (client) controls their data and flexible application logic, today in the form of AI. When the user submits a request to an application, the application sends a request with instructions to the user's AI resource. The AI resource processes the user's data according to the request and user-defined controls, and returns the personalized output to the application. The application then combines this output with its own data or logic to provide the service, without directly accessing the user's underlying data.
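The flow above can be sketched in a few lines. This is a toy model under stated assumptions, not Crosshatch's API: `UserAIResource`, `App`, the `"top_interests"` instruction, and the sample data are all hypothetical, and a keyword count stands in for the language model so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class UserAIResource:
    """Hypothetical user-side AI that sits next to the user's data."""
    data: dict  # category -> list of records, across contexts
    allowed: dict = field(default_factory=dict)  # app name -> permitted categories

    def handle(self, app_name, instruction, categories):
        # Enforce user-defined controls: only permitted categories are visible.
        permitted = self.allowed.get(app_name, set())
        visible = {c: self.data.get(c, []) for c in categories if c in permitted}
        # A real system would pass the instruction and visible data to a
        # language model here. This stub only counts keywords.
        if instruction == "top_interests":
            counts = {}
            for records in visible.values():
                for item in records:
                    counts[item] = counts.get(item, 0) + 1
            ranked = sorted(counts, key=lambda k: (-counts[k], k))
            return {"interests": ranked[:2]}
        return {}


class App:
    """Hypothetical application that never sees the user's raw data."""

    def __init__(self, name, catalog):
        self.name = name
        self.catalog = catalog  # product -> tag

    def personalize(self, user_ai):
        # The app sends instructions to the user's AI, not a request for data.
        reply = user_ai.handle(
            self.name, "top_interests", ["purchases", "fitness", "calendar"]
        )
        interests = reply.get("interests", [])
        # Combine the personalized output with the app's own catalog.
        return [p for p, tag in self.catalog.items() if tag in interests]


catalog = {"trail shoes": "running", "espresso maker": "coffee", "bike helmet": "cycling"}
user_ai = UserAIResource(
    data={
        "purchases": ["running", "coffee", "running"],
        "fitness": ["running", "cycling"],
        "calendar": ["dentist"],
    },
    allowed={"shoe_shop": {"purchases", "fitness"}},
)
shop_result = App("shoe_shop", catalog).personalize(user_ai)
stranger_result = App("unknown_app", catalog).personalize(user_ai)
```

In this sketch the shop receives only a short list of interests, and an app the user has not authorized receives nothing to personalize with.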

You are the sun

This access pattern follows the Data Gravity. With Crosshatch, users have the means to bring all their data with them to any application.  Following Brad Gerstner, AI follows the data gravity. While organizations have some first party data, users can bring much more. The reverse client server model is just a reflection of the Data Gravity.

This shift in data gravity, together with the rise of AI as a competent application logic engine, makes the reverse client-server model (as described by All In Pod last year) possible. Data gravity is why we’re so excited about Cross-Pollinating Walled Gardens, where third parties will be able to anticipate our needs based on everything we do and share, not just what we’ve done together with one company. Headless Personalization is the architecture for activating this data gravity, where consumers wield the data and a capable data execution engine, AI. It also has nice economic benefits, in that it delivers rich personalization (essentially "Jarvis") for a lower cost and in a way consumers trust.

If you’re interested in building new experiences that anticipate customer needs, give us a ring – we or one of our trusted partners would love to help! Reach out!
