Open Indie

Writing about open & equitable product development

The multi-polar Social Web of my dreams has been beautifully exemplified in two recent articles.

First, there’s How decentralized is Bluesky really? by Christine Lemmer-Webber, co-author of the ActivityPub protocol.

Christine’s article opens with:

recently I have received some direct encouragement from a core Bluesky developer that they have found my writings insightful and useful and would be happy to see me write on the subject. So here are my thoughts.

She also goes on to praise Jay Graber, the Bluesky CEO:

For that matter, I think the part of Bluesky I probably respect most personally is Jay Graber. I was not surprised when she was awarded the position of leading Bluesky; she was the obvious choice given her leadership in the process and project, and every interaction I have had with Jay personally has been a positive one. I believe she leads her team with sincerity and care. Furthermore, though a technical critique and reframing follows, I know Jay's team is full of other people who sincerely care about Bluesky and its stated goals as well.

In conclusion..

Bluesky is built by good people who care, and it is providing something that people desperately want and need. If you are looking for a Twitter replacement, you can find it in Bluesky today.

This post was positively received by the Bluesky team, lauded for its deep detail and even-handedness. It filled a void that had been created by a flurry of reactionary takes written in bad faith, motivated by us-vs-them binaries and tribal protectionism.

A few days later the aforementioned bridge-builder and ‘core Bluesky developer’ Bryan Newbold responded with his Reply on Bluesky and Decentralization, which opened thusly:

This is a reply to Christine Lemmer-Webber's thoughtful (and widely read) “How decentralized is Bluesky really?” blog post.

I am so happy and grateful that Christine took the time to write up her thoughts and put them out in public. Her writing sheds light on substantive differences between protocols and projects, and raises the bar on analysis in this space.

Fellow netizens, this is what prosocial engagement grounded in mutual respect and curiosity looks like. It is exactly the kind of adulting I want to see (and frankly expect) from our protocol elders — a title they’ll just have to accept, even if begrudgingly.

In closing, a note on Bryan’s musings on appropriate terminology:

Overall, I think federation isn't the best term for Bluesky to emphasize going forward (…)

What would be a better term? At some point we started using “social web” more, and I think that matches the atproto architecture well. There is some tension around that term because it is used by the W3C Social Web Community Group, and the recently launched Social Web Foundation, both of which are ActivityPub / Fediverse projects.

The amicable exchange that just happened between Bryan and Christine is the web at its most social, and it took place on several different platforms and protocols, interlinked by the mighty URL.

That’s the social web I always have and will continue to be part of.

A decade ago I embarked on a journey to Rashidieh, a mixed but primarily Palestinian refugee camp in southern Lebanon. I spent three months there as a volunteering youth envoy of ‘Palestinakomiteen i Norge’ together with the close friend who had invited me along.

Though it’s referred to as a ‘camp’, Rashidieh is a dense city of brick & cement, housing over 30,000 people, same as Molde, the biggest city an hour away from my tiny home town. Established in 1936, Rashidieh camp is nearly a century old. As such it is an unusual place with its own flow of time.

I had done this type of longer-term stay abroad a handful of times before; a rare privilege afforded to me as a worldly Norwegian citizen. While I do believe in the genuine altruism of myself and others, these journeys have always been for a selfish reason at heart. An escape. A search.

This time I was searching for meaning in the wake of my mother’s passing a year prior. In that community I was met with heartfelt compassion from people for whom the loss of family members – whole families even – was a brutally regular occurrence of life. There was no comparing my bereavement to theirs, yet we grieved together all the same, and in that grief we were equals.


For the past year I’ve kept a certain distance from the apocalyptic destruction of Palestine. I joined some of the protests and read some of the articles, but for the most part I retreated to my work for the sake of my sanity: Stay the course and focus on what you can control. Grow strong enough to lift others up when you’re able.

The invasion of southern Lebanon however shook something loose in me. So much of my work in my adult life has been driven by a desire to give back to that place, down in the south, now under siege. I had dreamed up some Big Plans for how I was going to be a good little helper. It seems now I may be too late.

Earlier this week I spent half a day just staring into empty space, sobbing. In the midst of all that sadness, it felt good and right to be emotionally connected to that place and those people again.


Yesterday I participated in the first call for the Post Growth Entrepreneurship incubator. In a small breakout group where we were encouraged to check in with each other, I spoke those feelings aloud for the first time and teared up once more.

By the end there was relief. I realized this is something very real that I’m processing, not just some imagined empathy born out of good-boy solidarity with the oppressed.

I’m not done with that place. I haven’t given it my all yet. But I may have missed my opportunity to be the giver I imagined myself to be, and there’s a deep, heartbreaking sense of inadequacy in that recognition.

Hence the words on this page, to make space for the guilt, the anger, and the shame. I can’t do my work in the world as an ally before I’ve let these emotions pass freely through me – not to be shed as waste, but rather to be integrated with the whole of my being, like tattoos on the heart.

There’s no quick resolution to be found here. The plan failed, but my resolve as a waking citizen of the global village remains unshaken.

Back in June I wrote about an exciting confluence of digital auth tech:

Social sign-in for indies

The focal point of Weird Netizens was the convergence of OIDC, Rauthy and FedCM as open identity technologies. I've dabbled in online activism for a long time and never before have I experienced these kinds of ripple effects.

  1. February: A contributor to the development of FedCM raises awareness about a potential fork in the road for the FedCM spec, which would make it yet another Big Tech exclusive if the wider internet community did not engage. The call to action is amplified by another activist a week later.

  2. March: One of the FedCM spec authors invites indie developers to demonstrate the viability of FedCM as a completely provider-agnostic technology. If no one answers the call, the spec writers may consider the indie use case void.

  3. April: After a month of silence we designate a Weird collaborator to begin work on FedCM. This kicks off a flurry of activity that to this day shows no sign of stopping.

  4. May: Experimental FedCM support has landed in Rauthy, obligator, Solid and IndieAuth!

As a cherry on top, this meeting of identity-savvy minds has led to a pending update in the IndieAuth spec which makes it compatible with OIDC, and by extension Rauthy.

For anyone unfamiliar with IndieAuth and FedCM, simply put they are different types of web sign-in, which is the ability to sign in to websites using your personal web address, without having to use your e-mail address.

IndieAuth

IndieAuth is a federated login protocol for Web sign-in, enabling users to use their own domain to sign in to other sites and services. IndieAuth can be used to implement OAuth2 login.

Federated Credential Management

FedCM is a Web Platform (browser) API that allows users to login to websites with their federated accounts in a privacy preserving manner.

While there’s some overlap, they mostly solve two different, mutually complementary problems, and can be used in tandem.
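To make the IndieAuth half a little more concrete, here is a minimal TypeScript sketch of the authorization request a web app would send you off with. The query parameter names follow the IndieAuth spec as I read it; the interface, the function name and every URL are hypothetical placeholders, and the PKCE challenge is assumed to be computed elsewhere.

```typescript
// Sketch only: builds an IndieAuth-style authorization URL.
// All names and URLs are illustrative placeholders, not a real deployment.
interface AuthRequest {
  authorizationEndpoint: string; // discovered from the user's own website
  me: string;                    // the web address the user typed in
  clientId: string;              // the web app's own URL
  redirectUri: string;           // where the user is sent back afterwards
  state: string;                 // random value, verified on return
  codeChallenge: string;         // PKCE challenge, assumed computed elsewhere
}

function buildAuthorizationUrl(req: AuthRequest): string {
  const params: [string, string][] = [
    ["response_type", "code"],
    ["client_id", req.clientId],
    ["redirect_uri", req.redirectUri],
    ["state", req.state],
    ["code_challenge", req.codeChallenge],
    ["code_challenge_method", "S256"],
    ["me", req.me],
  ];
  const query = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  // Respect an endpoint that already carries a query string.
  const separator = req.authorizationEndpoint.indexOf("?") === -1 ? "?" : "&";
  return req.authorizationEndpoint + separator + query;
}
```

The point to notice is that `me` and `client_id` are both plain URLs: your domain is the identity, and the app's domain is the client.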


Three months after my post in June, we’re in great shape:

  • The IndieAuth specification has been updated for greater OAuth/OIDC compatibility.
  • The FedCM specification is now an official W3C First Public Working Draft.
  • All Chrome-based browsers support FedCM.
  • Independent identity providers like Weird and LastLogin can be used for real-world testing.

In short, it is now easier than ever to log into web applications using your own website as an identity provider. Or at least, it would be, if only your favorite web apps supported these agency-enhancing technologies.

The folks at Google still feel like we need more evidence of RP/client (auth-speak for web app) interest:

We are still actively pushing this and interested to move it forward. Chrome just launched the Multiple IdP #319 origin trial, which is a pre-requisite here.

From an ecosystem perspective, we are still lacking evidence of demand / product market fit with relying parties. It is clear to me that browsers, users and IdPs would be motivated to use this extension, but it is not yet clear whether relying parties [i.e. web apps] would. We got webmention.io, which helped us build a proof of concept, but we are still lacking RPs to give this a try organically.

We could really use 3-5 real RPs that we could use to help us co-design this in an origin trial against real users.

Is that something that you feel you could help us activate this part of the ecosystem?

So here I am, 👴🏻 Once Again asking for the support of my fellow indie agitators. We need live applications, already in production use, to experimentally support FedCM. Possibly also IndieAuth while you’re at it.

This is an emerging web standard; all you need is already in the (Chrome-based) browser:

simple as.
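For anyone who wants to see roughly what that looks like in code, here is a hedged TypeScript sketch of a FedCM request. The `identity.providers` shape with `configURL` and `clientId` reflects the Chrome implementation as I understand it, but the spec is still moving, so treat the field names as subject to change. The helper name and the URLs are made up for illustration.

```typescript
// Sketch only: the rough shape of a FedCM request. The spec is still
// evolving, so these field names are an assumption and may change.
interface FedCmProvider {
  configURL: string; // the identity provider's FedCM config file
  clientId: string;  // how the identity provider recognizes this app
}

interface FedCmRequest {
  identity: { providers: FedCmProvider[] };
}

// Hypothetical helper: wraps one identity provider into a request object.
function buildFedCmRequest(configURL: string, clientId: string): FedCmRequest {
  return { identity: { providers: [{ configURL, clientId }] } };
}

// In a FedCM-capable browser the request is handed to the Credential
// Management API, roughly like this (placeholder URLs):
//
//   const credential = await navigator.credentials.get(
//     buildFedCmRequest("https://idp.example/fedcm.json", "https://app.example/")
//   );
//   // credential.token is then sent to the app's backend for verification.
```

The browser takes care of the account chooser and the consent prompt; the web app only says which identity providers it is willing to accept.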

Live Applications

Who exactly is this post talking to? Essentially any independent or open source application that offers a legitimate (service-oriented) alternative to the incumbents which are Too Big to Care.

Top of mind for me are:

Bluesky

Though currently in the throes of a (very friendly) Brazilian invasion, once the Bluesky devs have capacity to spare there’s probably no one better suited to lead this charge. Domain names as handles is a flagship feature of the Bluesky network. It follows rather naturally that users ought also to be able to log into the network using their own domains.

Discourse

As the most widely used forum software today, Discourse is quietly one of the biggest indie social networks around; it’s just not an interconnected super-network, though that’s gradually changing as they’re adopting the ActivityPub protocol. With its deep roots in internet geekery, Discourse powers many communities whose participants would jump at the opportunity to log in to their favorite forum instances with their very own identity provider.

Codeberg

As a passionate advocate of open source values, Codeberg avoids proprietary technology to the greatest extent possible:

Dependencies on commercial, external, or proprietary services for the operation of the platform are avoided, in order to guarantee independence and reliability.

Even so, they pragmatically provide login-via-GitHub as an option, presumably because of the undeniable accessibility/onboarding gains realized by GitHub’s massive network size. Enabling independent domain logins would allow them to chip away at this undesirable status quo.

WordPress

Bastion of the personal webpage, WordPress already has mature plugins for an instance to operate as its own OIDC or IndieAuth provider. There’s a straight shot from there to OIDC-FedCM or IndieAuth-FedCM.

Mastodon/Fediverse

It’s already possible to log into an experimental RP with a fediverse account, as demonstrated by FedIAM.

Going the other way around – logging into a fedi instance via FedCM – might be closest within reach for a single-user server like Hollo.


Now or never

But what if no one uses it? What if Google-corp pulls the rug? What if macroeconomic factors beyond our control bring everything to a halt!?

There’s no guarantee that this will work, but if we don’t try now it’ll be another 5-10 years before the opportunity comes along again. And if it does work we will have successfully nudged the web we love one step further towards greater agency and equal access. If there ever was a time…

Mark Zuckerberg has proclaimed that Open Source AI Is the Path Forward. He's not wrong.

At the same time, he's absolutely not in it for primarily selfless reasons. When you're late to the tech trend, the best way to catch up in both R&D and mindshare is to open source your stuff, so that's what Meta is doing.

Even though Mark doesn't yet have an innate understanding and appreciation for The Commons, I'm cheering for Meta's big bet on open AI.

Since what 'open source AI' actually entails is woefully undefined [1], I'll offer a simple illustration of what trustworthy AI necessarily looks like.

Mutual trust

Flawed as they may be, our new AI citizens are here to stay. The key to a happy coexistence is trust. Thankfully, knowing which AI-agents you can trust is actually very easy!

This is how you test your AI agent's trustworthiness: Ask it to explain exactly how it was built. A trustworthy AI agent will be able to walk you through its inner workings in great detail and at whichever level of complexity you prefer.

Crucially, the “self-insight” of your supposed AI-friend must extend to its original training data. It's nearly impossible to build trust and make friends with someone who doesn't have any memories and therefore cannot tell you anything about why it thinks the way it thinks.

If I ask my AI-friend to draw me a picture of a swan, we should be able to have a conversation like this:

Erlend: That's a beautiful swan drawing! Which drawings did you learn from to draw this one?

AI-friend: Doing an image-similarity search against my training library, I found these 20 (author-credited) images of swans (out of 20,000) that closely match the picture we [The System] rendered for you.

Erlend: Fascinating. And why did you display a photorealistic swan instead of, say, a cartoony one?

AI-friend: That would be because of parameters XYZ...

..and so on. Nothing should be off limits. Easily digestible snippets of data should be just as readily available as links to the full-size repositories.

True AI friendship demands sincerity

The most meaningful version of 'open source AI', to me, is a provably earnest AI. I can only trust an AI agent that readily bares its software soul to me at a moment's notice.

Maybe that seems like asking a lot. In my human-to-human relationships I also expect honesty, but not in the absolute way that I do in a human-to-AI relationship. That's because I know there will always be things my human friends simply can't tell me yet, or ever.

An AI agent on the other hand has no such reservations about what information to divulge, as it is not a conscious, thinking entity with wants and fears. Outside the context of its commercial purpose, the AI has no reason to obfuscate its self-knowledge from me.

As such, I will only ever pay money for earnest AI. Anything else is designed for deception. I will pay good money for honesty.

We must keep in mind that these models are trained by information that’s already on the internet, so the starting point when considering harm should be whether a model can facilitate more harm than information that can quickly be retrieved from Google or other search results.

If Mark wants to rebrand as the organic cloud farmer, the only way for him to prove his commitment to a truly regenerative practice is to fully open up the training data for Llama. You just grabbed it all from the open internet anyhow, right?

So show us exactly what goes into your AI produce. We, cultivators of The Commons and the corporations that want to monetize it, can't possibly build a 'broader ecosystem' together unless Meta and its ilk can be transparent about where they are getting their water, nutrients and seeds (inputs) from, and what byproducts (outputs) they're releasing into the ecological cycle.

[1] – The OSI is engaged in a deep dive to solve for 'what is open source AI?' and I applaud the effort, but to be frank I think their latest draft shows they are still stuck in an antiquated, software-centric (as opposed to people-centric) world view.

A year ago in Feed Overload I wrote:

99% of all microblog (and chat) content is ephemeral by design, meant for a specific moment in time. But the 1% that should endure past the 24hr cycle doesn't have good ways to do so in the current paradigm.

Reddit/Lemmy has a simple Top sorting mechanism for viewing highly rated content in the past Day / Week / Month / Year / All Time. This is a great way to surface evergreen knowledge artifacts in places like r/AMA and r/todayilearned. It's also a very helpful way to get oriented in a new space.

The same could be done for hashtags on the fediverse. Treating hashtags as not just timelines of the present moment but also containers of institutional knowledge could lead to all sorts of innovations in knowledge management on the fediverse.
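As a rough illustration of that idea, here is a small TypeScript sketch of a Reddit-style Top view applied to a hashtag. It is not how any fediverse server actually does it; the `Post` shape, the use of favourites as the rating, and the function name are all assumptions made for the example.

```typescript
// Sketch only: "Top of Day / Week / Month / Year / All Time" for a hashtag.
// The Post shape and the favourites-as-score choice are assumptions.
type TopRange = "day" | "week" | "month" | "year" | "all";

interface Post {
  id: string;
  hashtags: string[];
  favourites: number;
  postedAt: number; // milliseconds since epoch
}

const DAY_MS = 24 * 60 * 60 * 1000;
const RANGE_DAYS: Record<TopRange, number> = {
  day: 1,
  week: 7,
  month: 30,
  year: 365,
  all: Infinity,
};

function topForHashtag(
  posts: Post[],
  tag: string,
  range: TopRange,
  now: number,
  limit: number = 10
): Post[] {
  const maxAge = RANGE_DAYS[range] * DAY_MS;
  return posts
    .filter((p) => p.hashtags.indexOf(tag) !== -1 && now - p.postedAt <= maxAge)
    .sort((a, b) => b.favourites - a.favourites) // highest rated first
    .slice(0, limit);
}
```

The same timeline data yields a very different view depending on the range: "day" shows the present moment, while "all" starts to look like institutional memory.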

I explored some tangents along that trail in Follow Anyone and Sense-making on the fediverse. Today I’m continuing down this path, refocusing on the notion of content gardens, spurred on by two new developments.

First, a new type of links-curation app was announced: Introducing linkblocks, the Federated Bookmark Manager.

Then yesterday a developer I follow on the fediverse mused about a knowledge-sharing app in the same vein:

I'm thinking about working on a new platform for reading stuff on the web. To launch, I want a RSS reader (like miniflux; feedly) and a bookmark manager (like pinboard; pocket) with tight integration between the two and opt-in community features. I will eventually extend to stuff like annotation.

I’m particularly interested in the Pinboard-like experience. Prior to all of my blog posts linked above, I wrote an experimental piece called Netizenship from first principles wherein I try to imagine a safe on-ramp to the internet for my 7yr old nephew.

I think I’ll rewrite it one day as I never felt like it fully arrived at its intended destination, but it presents a trio of magical applications that I still consider to be a great foundation for sense-making on the web:

  • 🪪 an ID card you can never lose, to safely make your self known on the web.
  • 👜 a bottomless Bag of Knowledge, for storing and synthesizing the wealth of information you come across on your journey.
  • 🌐 a telepathic Study Group, to connect with other learners and exchange resources as part of a knowledge-gardening collective.

There’s more than one application catering to each of these archetypes. They’re not necessarily divided in three either, but personally I prefer that delineation.

For my purposes, Weird will cover the 🪪ID card and Omnivore already covers the 👜Bag of Knowledge. The missing piece is the 🌐Study Group, and that’s where Linkblocks comes in.

Social knowledge network

I think Linkblocks is still figuring out its identity and I don’t intend to direct it one way or the other, so what I’ll be talking about here is how I personally imagine and want a web application like Linkblocks to behave.

The social bookmarking app archetype has been around for decades, popularized by Delicious and carried as more of an indieweb phenomenon by the likes of Pinboard.

It bears a striking resemblance to Reddit, which is no accident. Reddit, like its forebear Digg, was a subsequent iteration on the links-aggregator concept, but with one crucial difference: Rather than leaning into the timelessness of social bookmarking, the Reddits and Diggs of the world were social news websites, which are different beasts entirely.

Reddit is all about the *now*; viral trends of the day. Pinboard’s quiet indie success has been in the *timeless*, the evergreen nature of content without an expiration date. It’s not about *when* links are added, it’s all about how many people have the same link in common, and what tags-of-meaning they’ve applied to them. Commenting is also entirely optional in the links garden, instead endorsing a digital form of parallel play.
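Here is a minimal TypeScript sketch of that timeless kind of ranking, where a link's weight comes from how many distinct people saved it and which tags they gave it. This is my own illustration rather than how Pinboard works internally; the `Save` shape and function name are hypothetical, and timestamps are deliberately left out.

```typescript
// Sketch only: rank links by how many distinct people have them in common.
// No timestamps on purpose: recency plays no part in this ordering.
interface Save {
  user: string;
  url: string;
  tags: string[];
}

interface RankedLink {
  url: string;
  savers: number; // distinct people who saved the link
  tags: string[]; // union of the tags-of-meaning applied to it
}

function rankByCommonSavers(saves: Save[]): RankedLink[] {
  const byUrl: {
    [url: string]: { users: { [u: string]: true }; tags: { [t: string]: true } };
  } = {};
  for (const s of saves) {
    const entry = byUrl[s.url] || (byUrl[s.url] = { users: {}, tags: {} });
    entry.users[s.user] = true; // the same person saving twice counts once
    for (const t of s.tags) entry.tags[t] = true;
  }
  return Object.keys(byUrl)
    .map((url) => ({
      url,
      savers: Object.keys(byUrl[url].users).length,
      tags: Object.keys(byUrl[url].tags).sort(),
    }))
    .sort((a, b) => b.savers - a.savers || (a.url < b.url ? -1 : a.url > b.url ? 1 : 0));
}
```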

What all of these apps do have in common is the function of a links aggregator. It is therefore conceivable that what Linkblocks is doing could just as well be accomplished with the similarly Rust-based Lemmy for instance. In a world of more architecturally modular applications I think that would be quite possible, but as things are I think the DNA of Lemmy as a Reddit-like is too deeply embedded for the notion of timelessness to fully take root and thrive there.

Newspapers and books are made of the same exact stuff – paper, ink and words – yet their distinct form factors make all the difference in how we treat them as either ephemeral or long-term stores of knowledge.

Reading vs sharing

Having talked about the different types of link aggregators, let's now draw a line between the two categories of read-it-later apps, also commonly known as bookmark managers.

As I see it, the primary divide is between applications for reading and applications for sharing. A secondary divide runs between private and public use.

  • Omnivore, Wallabag, Shiori, Linkwarden: Optimizes for the reading experience, practiced almost exclusively in private.
  • Pinboard, Linkblocks, Linkding: Primarily enables a sharing experience, practiced partially or fully in public.

For the latter bunch the app-makers themselves might disagree with me, but I think their capability for public sharing puts them in a distinctly different category than the former bunch. The combination of social-sharing and publicly readable content makes those applications more closely related to the links aggregators of the ‘social media’ variety. And yet, not quite that either.

Here’s a live example of Linkding as a public listing of links on someone’s personal webpage.

That app is a mature example of what Linkblocks might grow up to be, though given its ambitions as a federated bookmarks manager my hope is that linkblocks will more fully embrace the magic of sociality.

Considering what the read-centric applications already do well, linkblocks would be wise to focus its efforts on sharing:

With linkblocks, you can do three things: You can bookmark what you find on the web, you can structure your bookmarks, and you can exchange bookmarks with other people.

It’s only that last thing that I currently lack a good tool for. A narrow focus on the public exchange of links lends itself well to a series of other novel features, like collections:

Curated collections

I’d like the ability to create curated lists of roughly the same kind as what you’ll find on IMDB: https://www.imdb.com/list/ls055592025/

One way to do it would be to allow lists based on tag combinations, e.g.:

burnout + open-source

  1. https://mikemcquaid.com/open-source-maintainers-owe-you-nothing

  2. https://nolanlawson.com/2017/03/05/what-it-feels-like-to-be-an-open-source-maintainer/

  3. https://www.wired.com/story/open-source-coders-few-tired/

  4. https://writing.jan.io/2017/03/06/sustainable-open-source-the-maintainers-perspective-or-how-i-learned-to-stop-caring-and-love-open-source.html

  5. https://medium.com/@mikeal/time-to-leave-a68294ccb2af#.p8ss5xeqz

The key difference from having a bunch of articles with a certain tag is that a Collection can optionally have an order added, to say “read this before that”. That way you’ll have an additional data point that can be used to arrive at a global list of the top3/top5/etc. #burnout+#opensource articles.
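To sketch how this could work, here is some hypothetical TypeScript. It is not Linkblocks code; the data shapes and the position-based scoring are assumptions of mine. Each curated Collection contributes points by position (first place in a list of N earns N points, last place earns 1), and the sums produce the global top list.

```typescript
// Sketch only: tag-combination lists plus optionally ordered Collections.
// Data shapes and the scoring rule are assumptions, not Linkblocks' design.
interface Bookmark {
  url: string;
  tags: string[];
}

interface Collection {
  curator: string;
  tags: string[];        // e.g. ["burnout", "open-source"]
  orderedUrls: string[]; // "read this before that"
}

// The unordered starting point: every bookmark carrying all of the tags.
function withAllTags(bookmarks: Bookmark[], tags: string[]): Bookmark[] {
  return bookmarks.filter((b) => tags.every((t) => b.tags.indexOf(t) !== -1));
}

// The extra data point: aggregate many people's orderings into one top-N.
function globalTop(collections: Collection[], n: number): string[] {
  const scores: { [url: string]: number } = {};
  for (const c of collections) {
    c.orderedUrls.forEach((url, i) => {
      scores[url] = (scores[url] || 0) + (c.orderedUrls.length - i);
    });
  }
  return Object.keys(scores)
    .sort((a, b) => scores[b] - scores[a] || (a < b ? -1 : a > b ? 1 : 0))
    .slice(0, n);
}
```

Other aggregation rules would work too; the point is only that an ordering carries information that a flat tag does not.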

I’ve started this feature discussion on the linkblocks repo.

Automated collections

I run two chat spaces for my Spicy Lobster and Commune projects. Both of these spaces have accumulated hundreds of links at this point.

We can imagine an automated collection of ‘Commune links’ by simply passing any link added in that chat onwards to Linkblocks, already tagged with whichever channel it was posted to. Additional tags and ordering can be added from there, for example by tagging some links as “essential” and others as “advanced”.
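A minimal TypeScript sketch of that hand-off, under my own assumptions: a hypothetical function that pulls the URLs out of a chat message and tags each with the channel it was posted to. The real chat bridge and the Linkblocks API are out of scope here, and the URL pattern is deliberately simple.

```typescript
// Sketch only: turn a chat message into channel-tagged links.
// Function name and data shape are hypothetical; no real API is called.
interface TaggedLink {
  url: string;
  tags: string[];
}

function extractLinks(message: string, channel: string): TaggedLink[] {
  // Deliberately simple pattern: http(s) URLs up to whitespace or brackets.
  const matches: string[] = message.match(/https?:\/\/[^\s<>()]+/g) || [];
  return matches.map((raw) => ({
    // Drop punctuation that trails the link in ordinary sentences.
    url: raw.replace(/[.,;:!?]+$/, ""),
    tags: [channel],
  }));
}
```

Further tags such as “essential” or “advanced” would then be added by hand on top of the channel tag.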


New paradigm

Social bookmarking is a novel use case for ActivityPub and I’m super excited about it. I heckin’ love links and lists. I wanna use them for everything.

Things like Bookwyrm are cool, but it’s not what I want. I just wanna link the thing. Books, films, podcasts, articles, songs.., they’re all just resource recommendations which can be encapsulated by links. Good Stuff, as Linkblocks’ Rafael puts it.

I don’t wanna write reviews and rate with stars. I hardly even wanna do a search. I just wanna know who else in my network is interested in the same stuff, and have new stuff recommended to me that way. A local-first, relatively old-fashioned recommendation engine, subtly supercharged with online connectivity.

It's been a year since I wrote about Weird web pages as a prospective catalyst for the reclamation of my digital identity. There's been significant progress towards that end – more spread out among individual efforts than I initially envisioned, but ultimately for the better.

In this time a lot of necessary groundwork has been completed, some of which I didn't even realize was needed until I learned about it. Continuing from where I left off a year ago, let's go a few levels deeper into the vision of Weird, and the more clear-sighted vantage point we're looking out from today.

Starting with the self

Around a month ago I edited Assembling Community OS to reflect an emerging piece of technology which greatly helped crystallize my idea of Weird's most elementary purpose as a product. We'll come back to the shiny tech later; here's what we wanna do with it:

Digital autonomy begets individual freedom begets fairness & equality.

The hopeful possibility of this moment lies in the open-social web protocols which make up the foundations of a comms & coordination ecosystem owned and operated by the general public.

We have yet to bring these components together into one cohesive communications product, wherein messages and knowledge artifacts can move seamlessly from one flow-mode to the next and your identity remains the same throughout. Yet this ideal is closer to becoming reified than you might think.

Here's how I intend to do it, with a lot of help from my friends.

Part 1: Weird Identity

Before I can interact with other netizens, I need an online persona to make my digital self presentable and increasingly trustworthy. That's what Weird is all about. At its most basic it's an open source equivalent to Linktree, supercharged by self-sovereign identity.

Weird will aggregate your fragmented persona into a single unified view. Establish your little slice of home on the internet without getting stuck in the content-production imperative of a custom website or a blog.

Then, thanks to the commodification of ID-tech steered by the OIDC standard, Weird can grow up to become a full-fledged identity provider by standing on the sturdy shoulders of rauthy. Meaning, you can 'Login with Weird' and use it as a kind of Gravatar on steroids. This will enable seamless login to all of the additional services we want to plug into our community stack.

To free ourselves of our current predicament, we must simultaneously de-centralize and re-centralize identity.

By de-centralizing the ownership of identity away from platform monopolies and back to individuals, we can re-centralize the agency of personhood.

Once more for clarity:

  • Decentralize ownership.
  • Recentralize agency.

The central authority of one's digital identity must first and foremost be the individual themselves. That's how we regain our digital sovereignty.

Everything we're gonna build towards in this article is based on The Path to Self-Sovereign Identity by Christopher Allen, written in 2016.

Rather than just advocating that users be at the center of the identity process, self-sovereign identity requires that users be the rulers of their own identity.

This is a lot to take in, so let's unpack it with a practical example.

The unbearable monopolization of being

In order to easily sign up for a new internet service today, without doing the tired email-and-password dance, I use “social sign-in”. We're all familiar with the usual “Google / Facebook / Apple / Microsoft / etc.” options.

It's very convenient because in this glorious age (/s) of technofeudalism you will inevitably have been forced on to one or more of these platforms already. You may as well entrench your identity in their unchallenged dominion even further by accepting their superstructure as the parent authority of that smaller service, even though the two are completely unrelated.

Incidentally, in a fairer world, that small service would be an up-and-coming challenger to the dominance of the ruling platforms. But that game has been rigged for a long time.

https://pluralistic.net/2024/02/08/permanent-overlords/

The authors of the paper 'Coopting Disruption: Has Big Tech disrupted disruption itself?' propose a four-step program for the would-be Tech Baron hoping to defend their turf from disruption.

First, gather information about startups that might develop disruptive technologies and steer them away from competing with you, by investing in them or partnering with them.

Second, cut off any would-be competitor's supply of resources they need to develop a disruptive product that challenges your own.

Third, convince the government to pass regulations that big, established companies can comply with but that are business-killing challenges for small competitors.

Finally, buy up any company that resists your steering, succeeds despite your resource war, and escapes the compliance moats of regulation that favors incumbents.

Then: kill those companies.

The authors proceed to show that all four tactics are in play today. Big Tech companies operate their own VC funds, which means they get a look at every promising company in the field, even if they don't want to invest in them. Big Tech companies are also awash in money and their “rival” VCs know it, and so financial VCs and Big Tech collude to fund potential disruptors and then sell them to Big Tech companies as “aqui-hires” that see the disruption neutralized.

Identity feudalism is an invaluable weapon in a tech baron's anti-disruption arsenal. Not only does it provide them with a to-the-second ticker on emergent platform upstarts that show signs of exponential growth, but every smaller player that defers to the mega-platforms for their network effects is consequently helping the fiefdoms deepen their moats by foregoing any network strength of their own.

How corporate centralization begets identity fragmentation

Ironically, the more corporation-centralized our identities become, the more fragmented they actually get. That's because when each of these mega-monopolies is big enough, it considers itself the ultimate, unparalleled authority of digital identity. You won't find a “Log in with Google” button on Facebook, Apple or Microsoft's account page.

And yet you'll invariably need more than one of these accounts because, much to their chagrin, none of these companies have achieved complete world dominion quite yet. But if we stay the course, it won't be long until we can all enjoy the supreme technoutopian state of Absolute Customer Convenience.

Chain me up and sign me in your lordship!

Additionally, niche-targeted services will ask you to log in via the comparatively smaller but still monopolistic overlords of their particular domain. A project management app for instance might provide social logins via Notion, Slack and GitHub.

The built-in social providers of the app development platform Supabase demonstrate how fragmented our digital identities really are in the current landscape:

Supabase social login providers.

Decoupling identity

Weird attempts something that other platforms don’t dare to do: It presents identity as the main attraction of its platform offering.

All mainstream identity providers get you hooked into their ID-network by means of a tight coupling between a light identity layer plus a heavy service:

  • GitHub: ID + git
  • Discord: ID + chat
  • Gmail: ID + email

The indivisibility of this coupling weakens our digital sovereignty. Even if I stopped using Gmail for email, I still rely heavily on it for my authentication to hundreds of sites & services. It’s part of their lock-in scheme.

Gmail et al. make identity confusing because they've made it appear necessarily coupled with an overarching complexity like email or a social network. But identity should stand on its own. In fact it is paramount that our identity is not owned by a personal-data-loving megacorp because there's nothing more valuable for them to keep locked up than the very essence of your digital self.

However, identity on its own just doesn't sell because we've become complacently accustomed to it as a byproduct of a headliner service, and usually a “free” one at that.

So Weird makes a compromise. We acknowledge that plain identity is somewhat lackluster, at least in the current landscape. To be competitive, we loosely pair your identity with what is arguably the other side of the identity coin anyhow: The personal webpage.

  • Weird: ID as dynamic API + ID as static page.

And a ‘linkspage’ is the lightest, most low-effort webpage there is, since it only requires you to add links to wherever your online identity is already fragmented to.

Identity tech

Time for the techy bits! The past year has brought a series of innovations that, if brought together into a cohesive product such as Weird and others, could truly rock the foundations of the identity oligopolies.

I desperately want to be set free from my Big ID dependence. Sadly that cannot happen overnight since most sites need to explicitly add additional login options. Yet, a lot can happen sooner rather than later in the IndieWeb and fediverse that I now spend most of my online hours in.

Most of those web apps don't provide any 'social' login at all, but they absolutely should for the sake of easier onboarding. They just need better options than the mega platforms they are actively trying to avoid.

This and more is becoming possible thanks to three loosely related developments that are maturing simultaneously:

  1. Commodified identity providers – OIDC libs

  2. Federated logins – Bring Your Own Identity Provider with FedCM

  3. Portability – Decentralized identity standards

As I map out these technologies, I'll also outline a rudimentary product plan for Weird as an identity provider for the IndieWeb. By the end it should be clear that Weird hopes to be one among many providers of such a service. That's how we collectively wrest back independent control of the web's identity infrastructure.

Commodified OIDC

It wasn't long ago that aspirations of being a root identity provider were reserved for large and mostly closed-source companies. Now this space is rife with open source solutions backed by single-vendor cloud companies or industry coalitions:

Keycloak has been around forever. In the Supabase-logins example above, Keycloak stands out as the only open source option in the whole bunch.

Weird aims to appear on such lists as well. We keep our stack as lean as possible though, which boiled our search down to the perfect match I mentioned at the top of this article: Rauthy.

A Rauthy deployment with the embedded SQLite, filled caches and a small set of clients and users configured typically only uses between 17 and 22 MB of memory!

Rauthy's tiny footprint means we can realistically offer Weird as a self-hosted product for anyone who doesn't want to rely on our cloud service. In this first section however we'll focus on Weird as a cloud platform.

Now let's imagine what it would take to have 'Login with Weird' show up as an alternative on a real production service. For example, I'm an avid user of the read-it-later app Omnivore. Their login page currently looks like this:

https://cdn.discordapp.com/attachments/1221941908415447101/1224667709808181278/Screenshot_2024-04-02_at_12.25.04.png?ex=661e53af&is=660bdeaf&hm=188bfed47b91e4fdc3825afcba2a4c386f7639f97b46ce80d27115e25eae5d3b&

With any of the closed incumbents it's practically impossible to advocate for a new login option that isn't a “trustworthy” trillion dollar company. But Omnivore is open source, which opens up a vastly different possibility space.

Quite simply, our advocacy would start with a Pull Request that implements Weird Login. Maybe we'd start it off as a humble text-button above “Continue with Email”, and continue proving our merits as a first-tier login option from there. That's where the linksapp component comes in as a way to build trust by brand recognition.
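To make that Pull Request a little less abstract: under the hood, a 'Login with Weird' button mostly boils down to sending the browser to the provider's authorization endpoint with a standard OIDC request. Here's a minimal sketch of that first step using the Authorization Code flow with PKCE. The endpoint, client ID and callback URL are made-up placeholders (nothing here is taken from Omnivore's or Rauthy's code), and a real client would read the endpoint from the provider's discovery document instead of hard-coding it.

```python
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def pkce_challenge(verifier: str) -> str:
    """Derive the S256 PKCE code challenge from a code verifier (RFC 7636)."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_login_url(authorization_endpoint: str, client_id: str, redirect_uri: str):
    """Return (url, state, verifier) for an OIDC authorization request."""
    state = secrets.token_urlsafe(16)     # echoed back by the provider, guards against CSRF
    verifier = secrets.token_urlsafe(48)  # stays with the app, sent later in the token request
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile",
        "state": state,
        "code_challenge": pkce_challenge(verifier),
        "code_challenge_method": "S256",
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, verifier


# Hypothetical values, for illustration only.
url, state, verifier = build_login_url(
    "https://id.weird.example/authorize",
    client_id="omnivore",
    redirect_uri="https://omnivore.example/auth/callback",
)
print(sorted(parse_qs(urlparse(url).query)))
```

The app keeps `state` and `verifier` in the user's session, then exchanges the returned code for tokens in a second step that isn't shown here.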

Still, this is only a very partial solution to the problem before us.

Firstly, there's a significant burden involved, both upon us, who'd have to send out a bunch of PRs to services we'd like to make friends with, and upon the maintainers who need to review, merge & service said PRs.

Secondly, this type of integration doesn't work for self-hosters. You can't send a PR to Omnivore requesting that they add a dedicated button for 'Login with Andy's site' (for Andy's use only), alongside hundreds of others.

That brings us to the next piece of the puzzle..

Federated social logins

Two months ago, a developer put out a call-to-action regarding the emerging FedCM standard:

FedCM is a method that allows users to log into websites through federated identity services, such as “Sign in with…”, without sharing personal information with either the identity service or the website.

In short, FedCM makes it possible for the identity service of choice to be determined client-side in the browser, instead of that choice having already been made for you server-side, as with the examples of Supabase and Omnivore above.

It's 'Bring Your Own Identity Provider' (BYOIDP), meaning Andy can opt to 'Login with Andy's site' without Omnivore having any prior knowledge of Andy's site and its capability as an identity provider. Or, if Andy doesn't want to self-host their own provider, 'Login with Weird'.
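For a rough idea of what 'Andy's site' would have to serve to act as a FedCM provider, here's a sketch that generates the two JSON documents the browser fetches: a well-known file listing the provider's config URL, and the config file pointing at its endpoints. The field names follow the FedCM spec as I understand it at the time of writing. The `andy.example` domain and every path under `/fedcm/` are hypothetical, and since the spec is a work in progress the details may well change.

```python
import json
from urllib.parse import urljoin, urlparse


def fedcm_documents(idp_origin: str) -> dict:
    """Return {path: JSON body} for the files a FedCM identity provider serves."""
    config_path = "/fedcm/config.json"  # hypothetical location, chosen by the provider
    well_known = {
        # Fetched by the browser from the provider's registrable domain.
        "provider_urls": [urljoin(idp_origin, config_path)],
    }
    config = {
        # Lists the accounts the user is signed in to at this provider.
        "accounts_endpoint": urljoin(idp_origin, "/fedcm/accounts"),
        # Returns the relying party's privacy policy and terms links.
        "client_metadata_endpoint": urljoin(idp_origin, "/fedcm/client-metadata"),
        # Issues the token that is handed to the relying party.
        "id_assertion_endpoint": urljoin(idp_origin, "/fedcm/assertion"),
        # Where the browser sends users who are not signed in yet.
        "login_url": urljoin(idp_origin, "/login"),
    }
    return {"/.well-known/web-identity": well_known, config_path: config}


docs = fedcm_documents("https://andy.example")
print(json.dumps(docs, indent=2))
```

The relying party's half lives in browser JavaScript (a `navigator.credentials.get()` call with an `identity` option), so it isn't shown here. The experimental bring-your-own-provider part is still in flux, which is why this sketch sticks to the provider-side files.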

However, true free-for-all user choice is only a tentative part of this WIP specification, and could get retracted in favor of a far more limited selection of The Usual Suspects if no practical example of the former is brought forth.

Fellow internet activist Julian Foad picked up on this movement and echoed the call with a more pointed reiteration of what's at stake. A challenge has been put to the open source community by the drafters of FedCM: Either implement a real-world example of the free-for-all method, or consider your inaction a vote for business as usual.

Three weeks ago one of the former editors of the spec completed the missing piece in the browser for the whole flow to be tested end-to-end.

Ok, I finally got this merged in chrome canaries, so I think we now have a complete prototype of this API in chrome canaries for you all to try.

We need developers to try this API in chrome canaries and give us validation that this is a problem worth solving (and that the proposal actually meets the requirements – or make a counter proposal), so that we can move into more stable channels (next step is an origin trial). Developers do that by writing prototypes that use the API.

If we don't hear from developers, we'll at some point delete the prototype: no specific deadline, just being transparent that the way to move this forward is for developers to build prototypes too (not blog posts, not manifestos: code, counter-proposals or questions).

Weeks passed without anyone apparently answering the call, so I finally decided to take what little action I could on my own. I reached out to sjud who had been experimenting with Rauthy in his personal projects for a bit, and he has graciously agreed to explore this work as part of a modest sponsorship arrangement.

Follow along here: https://github.com/sebadob/rauthy/discussions/145#discussioncomment-8831943

Once we have this working, that's a massive step towards re-centralizing identity around the individual. We're still one crucial step shy of making our identities properly self-sovereign however.

Portable Identities

Here's another excerpt from Christopher Allen's foundational article:

Self-sovereign identity is the next step beyond user-centric identity and that means it begins at the same place: the user must be central to the administration of identity. That requires not just the interoperability of a user’s identity across multiple locations, with the user’s consent, but also true user control of that digital identity, creating user autonomy. To accomplish this, a self-sovereign identity must be transportable; it can’t be locked down to one site or locale.

The Verge recently interviewed Bluesky CEO Jay Graber. When asked what distinguishes Bluesky's AT Protocol from prior art such as ActivityPub, Jay pointed to account portability as a major motivation for a whole new protocol:

Then another thing was we really wanted to get account portability. So, this ability to leave with your identity and your data and have fallbacks with the way that we’ve designed your repo, you can even back up all your posts on your phone or back it up on your server that you control, and then you don’t have to have any sort of friction when you want to move.

So, you can move between services in ActivityPub. But if… for example, Queer.af recently, their .af domain was seized by Afghanistan, and then people were stuck because there was no warning, and then they have to rely on their old server to help forward their stuff over to a new place. So, we wanted to get around that problem and make sure people always had the ability to move.

They're building A Self-Authenticating Social Protocol, which comes with a form of portable identity.

Many people in the ActivityPub-based fediverse consider Bluesky an affront to their pre-existing community; a threat even. I'm not really on the Bluesky network in any meaningful capacity, but I'm glad to have them around because they present a live counterfactual to the ActivityPub story. I don’t think we’d be talking as much and pushing as hard for things like nomadic identity if it hadn’t been for Bluesky championing that feature as one of their key differentiators.

In spite of its largely volunteer-driven development team, the fediverse isn't far behind on The Path to Decentralized Identity in ActivityPub.

So where does Weird come into this? Well, even a portable identity needs a place to live. The self-hosting types will want sole custody of their identity keys, storing them locally on an encrypted drive.

That doesn't work for me. If I had ever gotten into crypto, I'd no doubt be the guy desperately looking for his lost hard drive in a junkyard. I'm messy; I do not trust myself to take proper care of something that will have irrevocable consequences if I lose it.

Bluesky recognizes this as well, which is why they are building a hybrid solution wherein a server host (like Bluesky themselves) and an end-user share non-exclusive custodianship of an identity key.

While I agree that there’s every reason to be cautious about Bluesky’s centralized approach, I think it’s worth noting that private-key identities solve two distinct problems:

  1. Instance-independent identity with credible exit

  2. Self-sovereign identity with no 3rd party authority

As already explained, personally I don’t actually want to be 100% responsible for the safeguarding of my private identity key, for the same reason I use a bank instead of storing my money in a safe at home.

I want to fully own my identity, but I don’t need exclusive custodianship over it. I have a much more urgent need for (1) than (2), so I’m okay with solving the former first as long as there’s a clear path from there to the latter.
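To illustrate why shared custody can still deliver (1), here's a toy model of my own making. It is not Bluesky's actual design and not a real DID format: an identity record lists more than one custodian key, and any one of them can point the identity at a new host. All names are hypothetical, and the signature check that a real system would perform is reduced to a simple membership test.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IdentityRecord:
    """Toy stand-in for a portable identity document."""
    handle: str
    host: str                   # service currently serving the account
    custodians: frozenset[str]  # ids of keys allowed to authorize changes


def migrate(record: IdentityRecord, new_host: str, authorized_by: str) -> IdentityRecord:
    """Point the identity at a new host if a recognized custodian asks for it.

    A real system would verify a cryptographic signature here; this sketch
    only checks a key id so the custody logic stays visible.
    """
    if authorized_by not in record.custodians:
        raise PermissionError(f"{authorized_by} is not a custodian of {record.handle}")
    return replace(record, host=new_host)


andy = IdentityRecord(
    handle="andy",
    host="host-a.example",
    custodians=frozenset({"andy-key", "host-a-key"}),
)

# Andy leaves without needing host-a's cooperation: that is the credible exit.
moved = migrate(andy, "host-b.example", authorized_by="andy-key")
print(moved.host)
```

If I lose my own key, the host's key can still act on my behalf, which is the bank-like convenience. Getting from here to (2) would mean shrinking the custodian set down to keys I alone control.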

Bluesky’s approach is in principle fine with me, provided the promise of credible exit can be substantiated. My main concern with Bluesky Inc. specifically is that they're a VC-funded ($8m seed) company with >30 employees and no concrete business plan. With that many people on the payroll the money is gonna go quick, so I'll be very surprised if they don't do another funding round in the next 6-12 months, thus sinking them even deeper into VC debt.

I'm not fundamentally against venture capital, but by now we have a lot of historical proof that the more of it you take on, the more compromised your original vision gets.

https://waxy.org/2024/01/the-quiet-death-of-ellos-big-dreams/

Despite their idealist manifesto and their Bill of Rights, I don’t believe they could ever truly be in partnership with their community once they were taking large amounts of venture funding. All of their ideals and big dreams were easily undone, even the legal restrictions they defined in their Public Benefit Corporation charter:

  1. Ello made money from selling ads to third parties;

  2. Ello made money selling their user data to a third party;

  3. Ello was sold, and the new owners didn’t comply with those terms.

I might only be willing to trust an external identity custodian if it was Mozilla or some other similarly established open-web institution.

Or, hear me out, maybe such a custodian wouldn't have to meet the high bar of longstanding open-internet staple as long as it is sufficiently lean, transparent and indie-oriented. Like, say, Weird!

Here's where another magic property of the OIDC standard (as implemented by Rauthy) comes into play: it provides a baseline of ID management for other, still experimental methods to plug into.

For instance, Rauthy already supports Solid OIDC, enabling interoperability with the Solid protocol as yet another alternative for decentralized identity, or just as a means of integrated key storage.

Bluesky is also working on OAuth support (with some talk of OIDC compatibility).


Bluesky wants to be “the last social identity you’ll ever have to create”. It's a nice sentiment, but I think it's a bit like trying to sell “the last jacket you'll ever wear”.

I think the real mark of a truly user-respecting identity provider is one that is equally happy to be your primary or secondary provider, and can operate as one or the other interchangeably.

Furthermore, one's identity can never be tied down to just one thing. In the timeless words of Walt Whitman, “I am large, I contain multitudes”.

Just so, an identity container made to last forever must be built to hold an ever changing number of multiple personas.

Let's try to make that, shall we? Join us in #weird on Matrix.

Money, Money, Money by Uganda Lebre

Threads has entered the fediverse. There is so much to say about this, and I'm simply not ready to take a decisive stance on the matter as a whole yet.

Deciding to federate with Threads is analogous to doing trade with the United States of America. The USA has a contentious history to say the least, but it's a continent-sized nation containing multitudes.

It also commands such an overwhelming influence over the global order that shutting one's door to it can be likened to opting out of globalization altogether. That's not an innately good or bad, wise or unwise thing to do, but it's a choice with far-reaching consequences. It's also a choice that's weighted very differently depending on your standing in the world.

For some nations, there is no choice. Our globally connected and unevenly distributed world is such that not all nations can afford to close off their borders and trade routes to the US without ruinous consequences. Consider this before you chastise those who do not exercise their supposed liberties the same way you do by “doing what is right”.

Unprecedented

I'm generally in favor of at least trying what hasn't been attempted before, and this breaking of bread between David and Goliath seems unprecedented. Some will argue that this is history repeating itself, but what's going on today is a very different story.

Unlike how Facebook and Google voluntarily adopted the XMPP chat standard as self-serving product strategy, Threads is not making today's interoperability play voluntarily. The EU forced their hand and the US is finally beginning to hold its mega-corporations to account as well, so Meta is left with no option but to make the most of the hand they've been dealt.

There are anti-monopoly regulations hammering down on the internet behemoths from all angles now. Threads' adoption of the ActivityPub protocol is Meta's plea for goodwill from the multi-national regulators who are breathing down their necks.

I suspect the fedi-collective has more negotiating power in this moment than it realizes. We may as well make some asks, see how Meta responds, and they in turn will see how the public, the media and the regulators respond to them in this bold new era of pervasive Big Tech skepticism.

Money, please

From Meta’s decentralized social plans confirmed. Is Embrace-Extend-Extinguish of the Fediverse next?:

It does not help that the Fediverse today is chronically underfunded and has corresponding difficulty to compete at the same speed as somebody like Meta can. Actually, “unfunded” is a better term because the amounts are so small. There are many unpaid contributions, the Fediverse largely being open source and all, but I’d be surprised if more than $10m per year are spent in total on the entire Fediverse today, likely it’s far less. If Meta can burn more than $10b – that’s one entire annual fediverse spend every 8 hours! – on a very doubtful Metaverse project, they surely could find the same amount of money to protect their core business.
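The “every 8 hours” figure in that quote holds up as a rough estimate, provided the $10b is read as a yearly burn rate (that reading is my assumption; the quote doesn't state the period). A quick back-of-the-envelope check:

```python
HOURS_PER_YEAR = 365 * 24  # 8760

fediverse_annual_spend = 10_000_000  # upper estimate from the quote, in USD
meta_annual_burn = 10_000_000_000    # assumed to be a per-year figure, in USD

# How many annual fediverse budgets fit into one year of that burn?
budgets_per_year = meta_annual_burn / fediverse_annual_spend

# How often, in hours, one such budget gets spent.
hours_per_budget = HOURS_PER_YEAR / budgets_per_year

print(f"{budgets_per_year:.0f} fediverse budgets per year")
print(f"one every {hours_per_budget:.2f} hours")
```

That comes out at a little under nine hours, close enough to the quoted eight.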

How can Meta extend a tangible gesture of good will towards the fediverse? Pitching in an extra $10M per year would be a good start! A bit of internet reparations.

The initial commitment could be far more modest though. How about a $600,000 trial run for the next six months? To make it more concrete, I propose three initial domains of funding specifically intended to mitigate oft-cited legitimate concerns of fedizens today:

'Threads will coopt the fediverse protocol'

Mitigation strategy: Make a comprehensive test suite to elevate ActivityPub from an implicit to an explicit set of standards.

$200,000 in additional funding for the ongoing ActivityPub Test Suite, reinforcing the efforts already backed by NLnet and Sovereign Tech Fund.

'Threads users will overburden fediverse moderators'

Mitigation strategy: Make moderation tooling that works at scale, in a federated model.

$200,000 in additional funding for the ongoing moderation tooling initiatives, such as IFTAS (sponsored by New Venture Fund) and FSEP (sponsored by Nivenly foundation).

'Threads will lock in users'

Mitigation strategy: Sponsor the development of Decentralized Identity in ActivityPub (Nomadic Identity).

$200,000 in additional funding for the SocialWeb Coöp's ongoing work on Portability Tools (scroll to bottom), as well as for Mike Macgirvin, silverpill and other complementary initiatives in this space.

I will gladly receive corrections/addendums to information about the initiatives and funding-orgs I've listed above; this is not an exhaustive overview.


It will take a lot more than money for Meta to change its dubious image in the eyes of the fedi-nations, but this preliminary act of generosity could still make a real difference.

If anyone at Meta or Threads reads this and wants to help move it along, you can reach out to me for some facilitation, or just directly contact the orgs above along with your existing contacts in the diaspora of fediverse leadership.

Incognito by Matt Dixon

I've noticed a worrying trend among many bloggers who use GenAI for the images of their posts: No credit is given. Not even so much as a shoutout to Stable Diffusion, Midjourney, DALL-E et al., nothing. As if the image appeared out of nowhere.

If you're one of these people, this post is addressed to you.

Using GenAI instead of promoting the work of a living artist is ethically suspect on its own, but I'll give you the benefit of the doubt: Generating “your own” image might be more satisfying than searching among prior art for that just-right visual analogy to your written words.

But if you consider yourself a participant in the knowledge commons – as every self-respecting writer should – you have a responsibility to credit your fellow artists, human or otherwise.

It's bad enough that GenAI mashes together thousands of similar drawings and repaints them at your behest with the signatures of its contributors scrubbed out, their record of work erased. Don't add insult to injury by omitting any credit of the machine assistance whatsoever, as if this work was painted by your hand.

Whenever I see an uncredited image online, I assume foul play. It's the equivalent of copy-pasting another writer's article in full, without crediting them by name and source link. Every uncredited image is non-consensual exploitation of art, regardless of origin.

At the very least let us know which AI application you used to generate your image. If nothing else, that combined with the date of your posting will provide a snapshot of that particular AI model's capabilities at that point in time. Years down the line, that's useful data.

The far more artistically honest thing to do would be to include the full prompt you used to produce your image, thus providing an interesting frame of reference for future generations (double meaning intended) to measure the growth of our synthetic art students.

Best of all – short of simply utilizing the work of real artists – would be to accompany your service-credited and prompt-transparent illustration with a brief list of human-made works that closely resemble it.

From Big AI Commons:

Designed for the betterment of society, an automated synthesizer would happily (there’s that anthropomorphic slip again) tell you about every single piece of information it has ingested. When outputting a synthesized information blob it may not be able to tell you the exact sources from which this output was derived (because that’s not how Synthetic Media Machines work), but it absolutely could do a reverse-search on its own corpus of data and tell you which articles / books / images / films are most similar to this “new” thing you now have in your possession.

I understand no one will ever bother to actually do this due diligence unassisted, but it's an interesting thought exercise nonetheless. And I wish to one day live in a world where I can click on an AI generated image and see “Similar works [by humans]” presented to me the same way I already can on Google Images. We already have the technology, we just lack the will.

The art of knowledge work is inherently relational and referential. The way we make sense of new information and transmute it into lasting wisdom is by following the trail left behind us by the knowledge workers of old. If that historical chain of attribution to prior art is severed and we lose sight of where our current state of knowledge comes from, we may as well start all over again from scratch, and we just don't have that kind of time.

I never cease to be amazed by how accepting we are of the exact same multinational corporations who in no uncertain terms spent the last few decades diminishing our personal agency, unraveling our communities and strangling our nascent democracies in the cradle.

The last trick the software oligarchs pulled on us was the idea of Big Data as something that magically appeared behind the fortified walls of their data centers, as if organically home-grown and lovingly tended to. And only they, with their unparalleled wits and computing power, were fit to manage all this data at scale.

Except the only thing that was special about these data troves was how much of them they’d been able to collect and trade amongst themselves without our explicit consent. That was the era of surveillance capitalism. With the emergence of so-called artificial intelligence, powered by non-consensual data mining, the corporations move on from the surveillance trade to straight up spycraft; the society-controller of choice for authoritarian regimes.

So up next is control capitalism, which is just fascism with the toothbrush mustache grown out for a more fun, twirly aesthetic.

We are regressing back to the ugliest kind of class divide, wherein the owner class commands your will not merely because they own things you do not, but because they own you. They’ve already laid claim to our collective land, labor and attention. With AI, they want to own our thoughts and the last shred of agency that comes with them. If we fail to defend our personal sovereignty at this juncture, a dark age of the corporate singularity awaits us.

This article, which turned out way different than I expected, was first ignited by Mike Masnick's reporting of AI critics employing copyright law as their weapon of choice against extractive data hoarders. As an open source advocate I wholeheartedly share Mike’s fear of IP maximalism. The problem this legal tactic is attempting to solve however is as real as it is harmful, so to refute the tactic raises the question: What, then?

Commons Maximalism

LLMs and their ilk, or what Emily M. Bender calls Synthetic Media Machines, are premised on large libraries of data. Without big data, they can’t function. Arguably their collection and mass-synthesis of this data is fair use, and I won’t dispute that.

The weird thing about these contraptions is that they aren’t libraries you can go to and ask for specific items to be retrieved according to some query, like ‘books on insects’. An SMM will be able to give you a list of books on this subject (with varying degrees of truthfulness), on account of the SMM having actually consumed these books for its own edification.

But what it would much rather have you do is ask it to write something more specific about insects of its own accord, made for you and you alone, thus making you reliant on the synthesizing automaton as your primary source of knowledge. And to be clear, the contraption in question here has no will of its own. Its incentives and motivations are purely an extension of the corporate master that controls it.

Designed for the betterment of society, an automated synthesizer would happily (there’s that anthropomorphic slip again) tell you about every single piece of information it has ingested. When outputting a synthesized information blob it may not be able to tell you the exact sources from which this output was derived (because that’s not how SMMs work), but it absolutely could do a reverse-search on its own corpus of data and tell you which articles / books / images / films are most similar to this “new” thing you now have in your possession.
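For a sense of how little machinery such a reverse-search needs in principle, here's a toy sketch: it ranks the works in a tiny corpus by how similar they are to a piece of generated text. It uses plain word counts and cosine similarity as a stand-in for whatever a real system would use (most likely learned embeddings), and the three “works” are made up for the example.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words vector (lowercased word counts)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(output: str, corpus: dict, top_k: int = 3) -> list:
    """Return up to top_k (score, title) pairs, most similar first."""
    query = vectorize(output)
    scored = [(cosine(query, vectorize(text)), title) for title, text in corpus.items()]
    return sorted(scored, reverse=True)[:top_k]


# Made-up corpus standing in for the works a model was trained on.
corpus = {
    "Field Guide to Beetles": "beetles are insects with hard wing cases",
    "Sourdough Basics": "flour water salt and patience make bread",
    "Pond Life": "dragonflies and other insects live near water",
}

generated = "a short essay about insects such as beetles and their wing cases"
for score, title in most_similar(generated, corpus):
    print(f"{score:.2f}  {title}")
```

At the scale of a real training corpus this takes an index rather than a loop, but the principle of “show me the nearest human-made works” stays the same.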

If this type of backwards looking similarity-search was standard practice, you would always learn of some original, human-made media that is remarkably similar to what has been machine-generated for you as if by magic. The truth of art making is that there is no such thing as a truly original creation. Every new thing is a remix of a prior.

(Steal Like An Artist makes that case beautifully.)

The infinite riches of media that we continue to share freely on the internet aren’t put there for the purpose of capture and capitalization. We share our art so yet more art can be made from it, under a social contract of mutual reciprocity.

Big Tech doesn’t reciprocate. Our public data isn’t for them to do with as they wish, especially not when their wish is to subordinate us into a brave new world of techno-feudalism. But ownership is tricky. I can claim some ownership over this article I’ve written, but I cannot possibly lay claim to the impression it has on its various readers, nor can I claim ownership of new art that only to a vague and partial degree is derived from it.

Our public data doesn’t belong to the corporations, but it doesn’t belong to us either. Not when it has been converted from data-contents to data-impressions. At that point, your ideas ‘live rent-free’ in any willing or even unwilling recipient’s mind. Like the air we breathe and the water we drink, freely available data doesn’t belong to anyone. What belongs to no one belongs to The Commons.

Attack their bigness

From a simplistic point of view, an SMM is just another thinking agent going around consuming content and forming its own impression thereof. If we try to combat the harms of AI companies from this vantage point, we’ll only end up harming individual creators. Attacking how the machines work is an aimless swing at their most ethereal form, destined to find no target to make contact with but our own sorry faces.

To land a real blow, look for where the machines are at their most materialized. Take aim at their massive bodies of data and strike there with conviction. The Large Language/media Models rose to prominence through their unfettered bigness, and that in turn shall be their downfall.

Pacify the profit incentive

Here then is my very simple policy proposal: Big Data AI is by definition a product of our global data commons, and as such any product derived from it should only be allowed for non-commercial purposes.

Commercial applicability should shrink relative to the size of data vaults. Much like a wealth tax on data, this aligns neatly with the EFF’s recommendation of a Privacy First approach to addressing online harms.

Regulators have an innate understanding of bigness and scale. Some AI regulation in the USA already stipulates special restrictions for AI operations that exceed a certain compute threshold. Regulating by data mass is probably an even more tangible metric to enforce by.

Furthermore, the doomers who are concerned with the rampant development of AGI should be very happy with this*, because a lack of commercial incentive would undoubtedly slow the unchecked pace of AI among the most unscrupulous for-profit actors, leaving academic researchers and CERN-like international collaborations to lead the way.

(*Unless, god forbid, they weren’t actually sincere in their ethical trepidation and were actually just angling for a competitive advantage.)

Our public libraries are shining examples of our social ingenuity. “Knowledge wants to be free” we said, and collected it all in these massive repositories made by the people, for the people. For a while, we did the same thing with the internet, at global scale. The AI renaissance could still turn out to be a good thing, but only if we reject its cooption by the already most powerful few.

The art of knowledge work is inherently relational and referential. The way we make sense of new information and transmute it into lasting wisdom is by following the trail left behind us by the knowledge workers of old. If that historical chain of attribution to prior art is severed and we lose sight of where our current state of knowledge comes from, we may as well start all over again from scratch, and we just don't have that kind of time.

Done right, AI assistants of the LLM variety ought to be like a library and a librarian fused together. And doing that right means we would have actual human librarians still in the loop to mediate between mortal knowledge seekers and the god-like but far from infallible super librarian.

Such an interaction would likely feel much less like being on the receiving end of a bullshitter’s behind, and more like making, eating and digesting your very own food for thought in the company of our peers, both past and present.

When you hook up your mind to a cloud-controlled Artificial Synthesizer (ASS), you plainly receive their fully digested discharge.

You don’t get to see what happened further up in the synthetic digestive tract of the all-knowing ASS, where copious amounts of data grub were initially ingested and processed by a divine black-box entity.

You don’t have any insight into where and who those morsels of data came from, and you certainly don’t get any say in which of them the entity should or should not consume for processing and output, delivered to you through the ASS-as-a-Service.

All you’re supposed to do is open your mind’s mouth wide and say “please” and “thank you” for the grossly diluted information bits you’re about to receive.