Instagram announced their new "Threads" app yesterday, with Mark Zuckerberg talking about having a Twitter-like platform with 1B+ users, and hailing the app's rapid adoption of 10M users within hours of launch. Meanwhile, Bluesky has 250k users and is working on...scalability. Every new platform is focused on scale, exactly like the previous platform, because they are all funded by advertising. To make this work, they'll always need:
- Reach will always be about ease of onboarding, low cost, and scale. It's inherently about including everyone, regardless of whether or not those folks want to talk with each other.
- Engagement will always be about finding and surfacing the most outrageous material: violent, scary, controversial, weird, incorrect.
When you combine these two, it's sort of obvious why social media trends towards being a mess: it shoves everyone into a giant pot and then takes the worst of what people are doing and puts it in front of everyone. And every platform, whether it's Facebook, Instagram, Google+, Twitter, or Bluesky, follows this same pattern. As long as you're using the same exact recipe, the result after baking will be pretty much the same.
It also leads to an unsolvable problem: when you stick everyone together and promote only the most outrageous posts, endless debates about what should be allowed are inevitable, since different people have different standards and tolerances. Nothing in history suggests that some platform is going to come along and find exactly the right balance here: different groups just have different standards. And so social media consistently tries to solve this by using the lowest common denominator, putting in safeguards and restrictions for every group that has some concern: Muslims want to ban pictures of Mohammed, Christians want to ban pornography, China wants to ban criticism of the government, Elon Musk wants to ban anyone who tracks his jet. Of course, banning all sorts of content requires judgment, and AI simply isn't there yet (heck, most people can't agree about what should be allowed!), so any successful platform needs to have moderators, either paid (usually the case) or volunteer (in cases like Reddit), who try to enforce bans correctly. This enforcement then leads to issues of control, and who owns what content, and what the content can be used for. This has become particularly relevant as companies are training AI on social media posts, since they are a ready supply of human-generated writing.
And so the endless debates around misinformation, banned content, moderation practices, privacy controls, and monetization churn on, making no progress. If you look at past social networks, there are a host of related issues that have been contentious: real name policies, the meaning of a verified account, fact-checking / labeling policies, the "algorithm" (how the feed is ordered), and rules governing suspension and banning of accounts.
One solution is to abandon the recipe. Consider twtxt, a regressive social media app that is nothing more than a text file sitting on a server. A person's identity is the URL to that file, and the file can be fetched by others' twtxt clients at will, in the same way a browser would fetch a web page or a feed reader would fetch an RSS feed.
There is no promotion of outrageous content, since it's just text files with a post per line. This means no moderation is needed: every user sees only messages from feeds they follow, and unfollowing a feed is easy. It's trivial to create a client that automatically filters out content with particular keywords. Rather than turning to some international corporation to tell you what you're allowed to read, you can just decide for yourself, dialing your own filters as desired.
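To make this concrete, here's a minimal sketch of a twtxt "client" using only the Python standard library. A twtxt feed is just lines of "ISO 8601 timestamp, tab, message"; the feed contents and blocked keywords below are invented for illustration, and a real client would fetch each followed URL over HTTP rather than use a hardcoded string.

```python
from datetime import datetime

# An invented example feed: one post per line, timestamp<TAB>message.
SAMPLE_FEED = """2023-07-05T09:00:00+00:00\tTrying out twtxt!
2023-07-06T12:30:00+00:00\tOutrage-free posting is nice.
2023-07-07T08:15:00+00:00\tHot take: everything is terrible."""

def parse_feed(text):
    """Parse a twtxt feed into (timestamp, message) tuples."""
    posts = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        stamp, _, message = line.partition("\t")
        posts.append((datetime.fromisoformat(stamp), message))
    return posts

def filter_posts(posts, blocked_keywords):
    """Client-side filtering: the reader, not a platform, decides."""
    return [(t, m) for t, m in posts
            if not any(k.lower() in m.lower() for k in blocked_keywords)]

posts = filter_posts(parse_feed(SAMPLE_FEED), ["hot take"])
for stamp, message in posts:
    print(stamp.date(), message)
```

The entire "algorithm" is a keyword list the user owns, and swapping it out requires editing one line.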
The flip side of all these advantages is that it doesn't work like advertising-funded social media. It's smaller, slower, and less volatile. You have to discover new feeds by following links in posts from those you already follow, since the platform won't do that work for you. It's less convenient, so it will never gain mass adoption, but the filter it applies selects for folks that are willing to do a bit more work. This is probably a really good thing, and seems like a fair trade besides: for folks that think Twitter should do a better job of moderation, using twtxt gives you complete control!
Critics might point out that twtxt will never be as big as Twitter or Facebook or Instagram, and I agree. The difference is that critics see the size of those platforms as a feature, and I see it as a bug.
Which brings me to my thesis: I don't think we're pursuing the goal of social media incorrectly, I think we have the wrong goal. We're looking to have an authority feed us information, and I think we need to instead take more individual responsibility for how we gather information. The first step is to take control of what you read by choosing it, rather than having it fed to you, and the most effective way to do this is to stop using products from companies that are primarily funded by advertising.
In short: twtxt seems to be the "just publish a website" version of social media. It's extremely simple and effective at letting folks stay up-to-date on accounts they care about.
Reddit will be shutting down API access on 30 June 2023 by gating it behind unsustainably-high access fees. This has resulted in the shutdown of Apollo, Sync, RIF and others. It's the end of an era for the internet: I just hopped over to Reddit to check my account, and saw I created it 17 years ago! For a site that bills itself as "The Front Page of the Internet", they don't seem to care much about their users.
I've been thinking about how VC money changes the old business models, given that Reddit took over $1B in VC funding and is now working to recoup that with an IPO. There have always been businesses that rely on duping folks: the street-side vendor hawking dubious wares is as old as time. The vendor pushing low-quality goods might draw in those that are desperate or inexperienced, but those that know the goods are shoddy do business elsewhere. So there's a sort of filter, and those that know the situation can choose to shop elsewhere, simply walking by before ever talking to the vendor.
What's different with VC funding is when people get filtered out. Reddit is like the vendor, but rather than spending the last 17 years selling shoddy goods, they offered an amazingly diverse place for discussion online that fostered a huge ecosystem of volunteers and products. This didn't attract just those that are desperate or inexperienced, but rather drew in millions of users from all walks of life, and in many ways did succeed in creating "the front page of the internet". But as Cory Doctorow points out in his essay about TikTok, providing such a great experience while remaining unprofitable for over a decade simply doesn't work: there has to be a next step to the strategy.
The next step, of course, is to alter the deal: by taking away the very things that led to its success, Reddit gives users a lose-lose choice: walk away from the community you've enjoyed for years, or put up with the continually-degrading user experience of the official Reddit clients. By forcing users to filter themselves out after they've invested in the platform, Reddit will no doubt retain users that would have never even joined if this setup were present from the start.
None of this is unique to Reddit, sadly. Doctorow highlights it with Amazon and TikTok, but it's true almost everywhere online, from the "free tier" of various services that eventually disappears to ever-more-invasive data collection by companies that want to better target advertisements. If something seems like too good of a deal...it probably is.
Axios reported on a recent Washington Post report on the content of the datasets used to train LLMs like ChatGPT. The hook of the story is that the training data contains all kinds of content from across the internet, including lots of discussion from groups that engage in behavior most would find reprehensible. The natural response might be to train these models only on virtuous content.
Putting aside the question of who decides what is virtuous, I'm more curious about the best approach to train a model that has good behavior. Is it more effective to train it on all ideas, marking some as desirable or not, or to train only on a carefully curated set of ideas that are considered desirable?
It might be important to consider the intended use case. If the model will be used in an environment where inputs are very carefully controlled, it's likely just fine to give the model no awareness of concepts it will never encounter.
But what if the input to the model is unsanitized? How can an LLM recognize and caution against murder if it doesn't know about it? Do we want models providing the 'most probable' next word in a sentence that the model hasn't been exposed to? It seems that a prerequisite for saying a behavior is undesirable is to first define the behavior. Which is to say: I think we'd be better off focusing on how to inoculate models against poor behavior rather than trying to hermetically seal them off from anything bad during training.
A lot of academics are jumping on the "resist AI" bandwagon. When we finish highlighting that LLMs are driven by statistics ("stochastic parrots"...is there any other kind?) and "resisting", then what?
In the end, it doesn't matter at all whether this is intelligent, or a "stochastic parrot", or just another computer program. It's useful, and that's what people are keying on. No amount of discussion about "automated plagiarism" will make it less useful.
I was rebuilding my Framework laptop yesterday evening, and was showing someone that hadn't spent time on the inside of a computer the laptop's various components: CPU, memory, SSD, wireless card, battery, cooling, display, webcam, etc. I remembered from building desktops back in the 90s and 00s that the CPU was connected to memory via a north bridge and connected to disk via a south bridge. In my Framework laptop, I saw some bridge, and the person I was with asked about it. I vaguely recalled that one of them isn't a discrete component anymore, but wasn't sure. I ended up searching on DDG, reading a couple of different pages, and ultimately concluded we were looking at the southbridge (one word!).
Today, I saw this "resist" piece, and thought "I just researched this chip question with DDG for 3 minutes. What would ChatGPT say?" I don't tend to think that LLMs are great as "search engines", but there could be real utility in having the LLM "read the docs for you" so it can answer questions. So here was my first attempt with ChatGPT (note my incorrect usage with two words rather than one):
What's the function of the north bridge and south bridge on a motherboard?

The northbridge and southbridge are chips on a computer motherboard that control communication between different subsystems. The northbridge generally controls communication between the CPU, memory, and graphics card, while the southbridge controls communication between peripherals such as the audio, storage, and network interfaces. In modern systems, northbridge functionality has been integrated into the CPU itself, known as the "chipset".
ChatGPT corrected my usage, explained their function (correctly), and even provided extra detail that the northbridge has been integrated into the CPU directly. That response is straight-up useful, and the fact that it could be "confidently wrong" makes it...the same as every other source of information I use. I constantly have to deal with people and websites saying things that aren't true, so raising that ChatGPT can be wrong is completely uninteresting to me: everything can be wrong, and ChatGPT seems to get it right more than most humans I know being asked the same exact questions.
So I'm not going to resist. The excitement about ChatGPT is not because people are buying into "hype"; it's just folks using the tool and finding it useful. Folks that learn to use tools well tend to be more productive than those that don't, so trying and evaluating new tools, especially powerful ones, seems to be a great use of time.
I've been a big fan of Julia Evans' writing since I discovered https://jvns.ca/ a few years back, but I didn't dig too much into other areas of her site. I was reading through my feed on Mastodon and saw mention of her zines and assumed they were comics or something similar that she did as a hobby.
I was wrong! Julia's zines are exactly the kind of thing I adore: accessible presentation of foundational technical concepts, beautifully crafted. The topics are excellently chosen...here's a quick sample:
These are paid, but I have no affiliation or interest beyond sharing a cool thing. If you want to get a feel, she also has some free zines, like So You Want to be a Wizard.
If you're interested, you can check out the whole collection at wizard zines!
ChatGPT has been dominating my news feed for the past couple of weeks, and it's truly inspiring. It's able to describe what simple source code does, and to mimic the style of various writers, like Shakespeare, and styles, like a limerick or a haiku.
Surprisingly, folks are keen to ask it factual questions, and find that ChatGPT is perfectly happy to confidently lie in its answers, and many are devoting time to figuring out how to remove various -isms from its responses (racism, sexism, etc.).
When I think about where this technology can really shine, though, I see massive potential in interactive art. That is, video games! It's not practical today, but if we get to the place that consumer video game systems have 24-32GB VRAM (on top of whatever is needed to render the game itself), it would be possible to run models like ChatGPT in a limited video game context. Coupled with technology like Whisper, it would enable an entirely new role-playing experience, with models like Stable Diffusion generating textures for environments, whisper translating the player's speech to text, and feeding that to ChatGPT-like models that formulate responses as various characters in the game, as well as a responses from a game master. Speech synthesis models are good enough to synthesize various in-game voices, completing the "dialogue with a computer" experience.
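As a sketch of that loop, here's the shape of a single dialogue turn with every model replaced by a stub. The function names, character, and dialogue are all invented for illustration; real implementations would call Whisper, a ChatGPT-like model, and a speech-synthesis model in place of the stubs.

```python
# Sketch of the speech -> LLM -> voice loop, with each model stubbed out.

def transcribe_speech(audio):
    """Stub for a Whisper-style speech-to-text model."""
    return "I'd like to buy a healing potion."

def character_reply(character, persona, player_text):
    """Stub for an LLM conditioned on a character persona."""
    # A real implementation would build a prompt from the persona and
    # game state, then sample a response from the model.
    return f"[{character}] Ah, a potion! That'll be ten gold pieces."

def synthesize_voice(character, text):
    """Stub for a text-to-speech model with a per-character voice."""
    return f"<audio: {character} saying {text!r}>"

def dialogue_turn(audio, character="Mira the shopkeeper",
                  persona="a cheerful alchemist"):
    player_text = transcribe_speech(audio)
    reply = character_reply(character, persona, player_text)
    return synthesize_voice(character, reply)

print(dialogue_turn(audio=b"..."))
```

The interesting engineering problem is everything around this loop: keeping each model's latency low enough that the turn feels conversational, and fitting them all in VRAM alongside the game.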
I think entertainment is an ideal stepping stone for AI because it's an application that is relatively low-stakes: not every detail needs to be precisely factual or correct to have value...so long as the world is interesting and immersive for the players, AI will succeed.
With the resurgence of the Fediverse after the drama with Twitter, I'm beginning to reconsider my calculus with respect to microblogging. I've gone full-static with Thoughts, but I also considered twtxt and hosting something like Pleroma (on the heavier end). But I just recently ran across microblog.pub, which while dynamic seems to be pretty lightweight, and has the advantage of being able to interact with the rest of the 'verse. Might set it up in the next few days/weeks to replace my current (barely used) Mastodon profile.
I've been struggling to formulate a way of explaining this, even to tech-savvy folks. Here's the bottom line up front: every browser extension and every app you install presents a risk to your privacy. There are two main sources of this risk: proprietary apps sponsored by commercial entities (companies), and proprietary apps sponsored by governments.
On the commercial side, building apps is expensive, so developers tend to work on apps that have some business plan. But charging for apps not only necessitates developers giving a cut to the app store they are part of (Google, Apple, etc.) but it also causes an order-of-magnitude dropoff in installs. So there's a huge motivation to find a way to keep apps free, while still making money. This leads to ads (at the benign end of the spectrum) and malware (at the malignant end of the spectrum).
Even developers that set out to build an app for the good of the community may decide to sell that app to a company. The sale price will reflect the company's ability to leverage ads and malware to recoup the investment.
And even in cases where the developer has an extremely strong moral compass and decides not to do this, there is a risk when their app becomes popular enough that they will become a hacking target. So not only do you have to trust the developer, you also have to trust that the developer consistently maintains great security practices.
On the government side, there's a persistent and growing threat of malware built into the apps. Politico ran a story today about Egypt's COP27 summit app being malware:
The app is being promoted as a tool to help attendees navigate the event. But it risks giving the Egyptian government permission to read users' emails and messages. Even messages shared via encrypted services like WhatsApp are vulnerable, according to POLITICO's technical review of the application, and two of the outside experts.
This is extremely common: there's an app for everything now. Every sporting event, every conference, and every airline has a dedicated app so you can do what you did before, but now with an app. It's a dangerous trend: the code of these apps is not available, and it's completely unclear what behaviors the app has that the users can't see.
My advice: only install apps that are either open-source and community-built, or apps that are absolutely necessary to accomplish your goal. The smaller you keep your digital footprint, the more you mitigate the risk that your privacy will be compromised.
Lots happening in the past week or so with Stable Diffusion! Let's take a look.
This is a great blog post because it describes textual inversion, which allows Stable Diffusion, which has never been trained on "ugly sonic", to nevertheless use its existing weights to produce very compelling "ugly sonic" images with only a handful of images of source material (plus textual prompts). This seems like a very powerful mechanism for extending an already-trained model to produce images of things it would otherwise be unaware of.
Another interesting application of Stable Diffusion: inpainting. This again applies Stable Diffusion to a use case where it gives artists more control over the final product. In this case, the example notebook shows that Stable Diffusion, given an input image, a masked area to alter, and a text prompt, is capable of altering the clothing worn by a person in a photo.
This project is a Blender plugin that generates textures on-demand using a simple UI, and features much simpler prompts than are required when working with raw Stable Diffusion. I think this is trend-setting: crafting prompts is one of the weirder parts of working with Stable Diffusion, and I fully expect tools that leverage Stable Diffusion to try and make that aspect of the process much more intuitive, as this does.
It also supports an img2img mode, which allows textures to be created with a bias towards the attributes of the supplied image.
Max Woolf wrote up a very good overview of how image generation actually works, breaking it down piece by piece. If this all still feels like "magic", this is a good article to read!
Matthias Bühlmann has done some really interesting work applying Stable Diffusion to the problem of image compression. There are two interesting findings here, I think:
- Stable Diffusion can compress images to a smaller size than JPEG and WebP. This alone is remarkable.
- While other image compression algorithms introduce "artifacts" that generally appear as blockiness or noise, Stable Diffusion's artifacts are more like "hallucinations" where it generates detail that doesn't exist in the source, and it can differ from "ground truth".
Matthias' post highlights this second effect with a picture of the San Francisco skyline. The image compressed using Stable Diffusion inserts an imaginary skyline and set of buildings in the far distance, leading the viewer to believe the image is very high quality. The JPG image is better at "admitting" that it doesn't really know what's there.
This is somewhat disquieting: this approach seems much less like "compression" and rather more like Stable Diffusion is, much like a human, trying to "remember" what it saw originally.
I can't believe I didn't think to visit this earlier, but it's a great place to see what others are doing with the tech and to chat. Some notable stuff I ran across there in the 30 minutes I spent:
- Really good results using img2img to take rough sketchwork and make it much more final
- Discovering new types of images SD can generate: a simple prompt asking for a stereoscopic portrait produces exactly that.
- Genuinely impressive work extending SD through Deforum to animation. This is part of the road to animation with subject matter that stays stable from frame to frame.
This is a massive mistake. Eric Goldman from the very excellent Technology & Marketing Law Blog has consistently produced insightful commentary on this over the past several months. From today:
When a proposed new law is sold as “protecting kids online,” regulators and commenters often accept the sponsors’ claims uncritically (because…kids). This is unfortunate because those bills can harbor ill-advised policy ideas. The California Age-Appropriate Design Code (AADC / AB2273, just signed by Gov. Newsom) is an example of such a bill. Despite its purported goal of helping children, the AADC delivers a “hidden” payload of several radical policy ideas that sailed through the legislature without proper scrutiny. Given the bill’s highly experimental nature, there’s a high chance it won’t work the way its supporters think–with potentially significant detrimental consequences for all of us, including the California children that the bill purports to protect.
I highly recommend reading Eric's full take on this. He breaks down five separate aspects of this that don't really add up at all: not only does it shift the Overton window towards surveillance-by-default (face scans on any website that a minor might try to visit, for example), but it also disempowers parents to make decisions about how to raise their child, which is exactly the wrong approach, all while adding a lot more friction to navigating an already-immensely-confusing internet.
I've been playing around a fair amount over on NightCafe with the Stable Diffusion model to generate artwork. It's been a really fun experience, but learning how to write a prompt to get close to your vision is a real skill. Moritz put together a cheat sheet that has some good keyword collections that can help.
I've been impressed with the speed at which Stable Diffusion's release has spawned a community and economy around AI-generated art.
Browsing for a few minutes leads me to exactly the opinion Simon outlines in Stable Diffusion is a really big deal. The ability to generate art with a text prompt (and in some cases a seed drawing) makes high-quality art much more available than ever before.
You can play with stable diffusion yourself, free, over at Hugging Face. There's nothing about this technology that is limited to images. We have AI's that generate text, generate software, and generate images. This same technique can be extended to video and other, more complex, artistic designs.
Regardless of the quality of the output, all of these AIs pose a very complex question regarding copyright. It will be very interesting to watch the law catch up with technology in this particular instance, I think.
There's something interesting about smaller social communities of the tech-savvy using SSH and *nix to create social networks. They have none of the commercial aspects associated with large social networks (ads, tracking, algorithm-based feeds, etc.), but they also seem to tend to stay off the web, instead using SSH, Gemini, Gopher, and RSS.
I think the web is largely OK as is (though I admire Gemini quite a bit), but needs to be made easier for folks to run, like a vacuum cleaner or a car.
Take a look at the gab script for rawtext.club as an example. It's an entire chat service, with rooms, users, blocking/unblocking, in a single 356-line Python script. There are a lot of attributes of the problem that make such a simple approach possible:
- Storage: available for each user, it's nothing more than creating a directory in the user's home directory to store the necessary files.
- Permissions: the *nix filesystem has user permissions, so setting the right bits is all that's needed.
- Identity: provided automatically via $USER and enforced by SSH keys.
- UI: the command-line provides an existing UI that is already familiar.
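Here's a minimal sketch of those ingredients in Python. The paths, file format, and function names are my own invention (the actual gab script works differently), but the idea is the same: a room is an append-only file, and identity comes for free from the logged-in *nix user.

```python
import getpass
import pathlib
import tempfile
import time

# One file per room under a shared directory; hypothetical layout.
CHAT_DIR = pathlib.Path(tempfile.gettempdir()) / "minichat"

def post(room, message):
    """Append a message to the room's file; the OS login is the identity."""
    CHAT_DIR.mkdir(parents=True, exist_ok=True)
    user = getpass.getuser()  # no account database needed
    with open(CHAT_DIR / room, "a") as f:
        f.write(f"{int(time.time())}\t{user}\t{message}\n")

def read(room, limit=20):
    """Return the last `limit` messages as (timestamp, user, text) tuples."""
    path = CHAT_DIR / room
    if not path.exists():
        return []
    lines = path.read_text().splitlines()[-limit:]
    return [tuple(line.split("\t", 2)) for line in lines]

post("lobby", "hello from a 25-line chat service")
for stamp, user, text in read("lobby"):
    print(user, text)
```

Per-user permissions, blocking, and the rest would layer on top of filesystem permission bits, which is exactly why the real script stays so small.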
This is neat! A chat service to serve up to several hundred users can be tiny and easily written by one person if it reuses existing components. What's interesting to me is that we don't have this for the web. Where is the reusable storage, permissions, identity, and UI to make web apps as trivial as CLI apps? I think web frameworks come close, and Parse/Firebase were exploring related ideas (making the backend highly reusable), but there might be room for a "cartridge" style of webapp that reuses storage (maybe sqlite), identity (maybe providing OAuth and password-based options), and UI (default layouts for standard views).
The value of this is not to power world-class services like YouTube or Facebook, but to power the neighborhood message board. It would be a beautiful thing if every town and neighborhood and family had their own system that did this. Fragmentation can be ok in an ecosystem with aggregators. Then one could use RSS to aggregate posts from all the interesting networks. I suspect this would be a much healthier outcome than the centralized services for a variety of reasons, but perhaps that's best discussed another time.
There's been a lot of chatter about "free speech" and "censorship" on the popular social platforms (Twitter, Facebook, and others), and I'm constantly confused by it. We have federated social media (the Fediverse) that allows anyone to set up a server and publish posts that can be consumed from other servers, via the web, or via RSS. The idea that elected government officials are using Twitter to publish updates is extremely strange, and it's even weirder that they do so while complaining about the rules Twitter imposes.
So, with that context, it's a breath of fresh air to see the EU approaching this problem in the most obvious way imaginable: have the government run their own server! I'd be very supportive if the U.S. government did this so representatives of the people could communicate with their constituents.
To be clear, having the government run their own server isn't magic pixie dust that will solve all the problems. But it will allow the government to directly confront those problems and optimize for healthy discourse rather than going through Twitter, which cares less about healthy discourse and more about advertising revenue.
I largely agree with Tristan on this. Business folks are constantly thinking in terms of dates; I'm sure it seems obvious that this is the way. The system I use to manage teams is gathered from a variety of sources, but contains only two guidelines.
Frontload Risk

One type of slowdown is unanticipated complexity: a snag where one aspect of the work takes longer than expected. These snags are often things we can anticipate in advance. In cases where we think there may be a snag, start working on that part of the project first and create a prototype. I sometimes call this "de-risking" or "retiring risk" in management meetings, but it just means "try to make the tricky parts work first".
Set Target Dates
After frontloading risk, you can allow some time to work through those risks, as well as the "easy" parts of the project. Do some rough sizing ("3 days to identify the correct backend storage system and make it region-aware") and then set a target date for the project. This date is not a commitment, but a public statement of what the team is shooting for. It's not the manager that sets this, but rather the engineers, and it comes with one instruction: the team must raise a flag as soon as they encounter a snag that they think will alter the target date. This frees the team from sync meetings to assess status, and comes with an added benefit: sources of delay are easy to pick out when the team reflects on the project after it is complete. Some findings might be generally applicable and can be integrated into a list of things to think about during the "rough sizing" phase of the next project.
That's it! I've used this approach across multiple companies over the past 10 years or so, and it's fantastic at getting out of everyone's way while keeping the focus on shipping great work.