Show HN: This AI Does Not Exist (thisaidoesnotexist.com)
434 points by thesephist on April 23, 2022 | hide | past | favorite | 71 comments
Hey HN! Author of the site here. I tried a few tricks to keep the text-generation part of the site up, but even leaning hard on Huggingface's API and bumping time-outs up, it looks like the site is struggling a bit. I'm going to see if there's anything I can do to keep the text-generation part available, but in the meantime, the pre-generated set should stay pretty stable. Not sure if there's much else I can do without burning a hole in my cloud bills — sorry for the trouble!

I've put up a more detailed description of how this works on the GitHub - https://github.com/thesephist/modelexicon

PS - if anyone at Huggingface is reading this and wants to help out with keeping the API up, that would be super :)



Got a good laugh from this one.

> GPT-NSFW is an N-gram model that was created using the same WebText dataset as GPT-2, but that is designed to generate NSFW text. The NSFW version of GPT-2 has shown great promise in generating NSFW text.

https://thisaidoesnotexist.com/model/GPT-NSFW/JTdCJTIyZGVmbi...


I got exactly that one as well, so the possibility space is not that big?


Tried to search for an example output. Found The Orange Erotic Bible [1].

> And it came to pass, that after the cattle were heaped with the fodder, the Goat's Basket was placed in the market-place, and Laban asked for the ass-slaves; and the ass-slaves answered, Ye gods of Avalon, thou hast no need of such a boy. And when the men desired to fuck, they brought forth many girls, of all shapes and sizes, and had many whores among them.

[1]: https://write.as/409j3pqk81dazkla.md


I got:

> SpotifAI is a system that uses deep learning to automatically create playlists from user-submitted playlists. Its algorithm has been trained on millions of playlists from Spotify.

Which is pretty cool sounding and has a cool name.


To be clear, the wordplay seems to be all the author's doing.

From the README:

> When you simply open thisaidoesnotexist.com, the model names you'll see are hand curated and pre-generated by me.


Yeah, I noticed that while refreshing because the same one came up again. It seems to go against the whole spirit of the "doesn't exist" theme if it's not auto-generated, IMO.


The ones that show when you first load are indeed pre-generated[0], but you can 'Try your own', which does generate a new AI from your name prompt, although when I tried I was getting gateway timeouts, likely from the extreme HN-driven load.

0: https://news.ycombinator.com/item?id=31138272


I just tried with the name "SpotifAI" and while I did get a different result, it was honestly even more impressive IMO.[0]

0: https://thisaidoesnotexist.com/model/SpotifAI/JTdCJTIyZGVmbi...


Certainly so! I got some generic BeatlesAI one.

Very nice accidental wordplay (it wasn't meant to have the same pronunciation), and it's a cool premise.

I'd like something like that, I currently use Pandora and Apple Music since Apple radio is trash.

AI generation serves best for cherry-picking; it's certainly good for coming up with ideas or searching for leads.


Garbage in, garbage out.


Jesus is a fast and scalable language model trained on the Jesus dataset, which consists of over 4.7 billion words from the Bible. Jesus demonstrates state-of-the-art performance on several language modeling and conversational tasks.


Have you accepted Jesus into your CI/CD pipeline yet?


Gonna need to do a run of “Jesus is my integration tests” bumper stickers.


LGTM


That's pretty impressive considering the Bible is only 750,000 words long.


I found the original copy of the Bible in my attic last year. It had all the missing chapters. It was actually 4.7 billion words.


Ah, the King Charles Bible.


Now includes the Book of Jezuboad!


You obviously first need to vectorize the Bible and generate the 4 billion plus word generalization of Biblespace


Imagine the market potential for a "What would Jesus Do" tool that responds to text prompts!


Seems legit.

GPT-WESTWORLD is a large-scale, multilingual language model that generates fluent, realistic sentences from text in any language. It achieves this by incorporating a novel approach to language modeling and incorporating a new type of recurrent network, the Westworld.

https://thisaidoesnotexist.com/model/GPT-WESTWORLD/JTdCJTIyZ...


Are these procedural, or is there a list of pre-generated "AI"s the site goes through?

I got this as my third which seemed either prophetic or deterministic.

HackerNewsReplyGuy:

    from hackernews_response_guy import HackerNewsReplyGuy
    model = HackerNewsReplyGuy(1)
    model.predict_comments(comments, [u'comment_id'])


There's a pre-generated set to (1) spare my server some work and (2) showcase some output I liked. But as the sibling comment noted, you can (or could) generate your own — I'm working on bringing that side back up...

The pre-generated set is hand-curated, but they are still 100% generated by the GPT-J model behind the scenes. More info -> https://github.com/thesephist/modelexicon


Wait, it’s just vanilla gpt-j? No fine tuning?

“Back in my day, we had to train our own models..” already sounds anachronistic.

Nicely polished.

Looks like bmk (nabla theta) was right that arxiv was an impactful addition to The Pile. I bet that’s where J got its knowledge in this case.


Yep! No fine tuning. Here's the prompt I use for the description (from source, https://github.com/thesephist/modelexicon/blob/main/src/main... )

---

Proceedings of Deep Learning Advancements Conference, list of accepted deep learning models

1. [StyleGAN] StyleGAN is a generative adversarial network for style transfer between artworks. It uses a traditional GAN architecture and is trained on a dataset of 150,000 traditional and modern art. StyleGAN shows improved style transfer performance while reducing computational complexity.

2. [GPT-2] GPT-2 is a decoder-only transformer model trained on WebText, OpenAI's proprietary clean text corpus based on Wikipedia, Google News, Reddit, and others comprising a 2TB dataset for autoregressive training. GPT-2 demonstrates state-of-the-art performance on several language modeling and conversational tasks.

3. [$MODELNAME]
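
For flavor, here's roughly what that templating plus a hosted-inference call could look like. This is a Python sketch with made-up helper names — the actual project isn't Python, and the Hugging Face endpoint/response shape here is my best recollection of the hosted Inference API circa 2022, not the site's code:

```python
import json
import urllib.request

# Abbreviated few-shot prompt from the comment above; the real prompt
# carries the full StyleGAN and GPT-2 example entries.
PROMPT_TEMPLATE = """\
Proceedings of Deep Learning Advancements Conference, list of accepted deep learning models

1. [StyleGAN] StyleGAN is a generative adversarial network for style transfer between artworks. ...

2. [GPT-2] GPT-2 is a decoder-only transformer model trained on WebText, ...

3. [{name}]"""

def build_description_prompt(model_name: str) -> str:
    # The user-supplied name fills the $MODELNAME slot; the language model
    # is then left to continue entry 3 with a plausible-sounding description.
    return PROMPT_TEMPLATE.format(name=model_name)

def generate_description(model_name: str, api_token: str) -> str:
    # Assumed shape of the hosted GPT-J inference call (URL and response
    # format are assumptions on my part).
    req = urllib.request.Request(
        "https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B",
        data=json.dumps({"inputs": build_description_prompt(model_name)}).encode(),
        headers={"Authorization": f"Bearer {api_token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Typically a list like [{"generated_text": prompt + continuation}].
        return json.load(resp)[0]["generated_text"]
```

The nice trick is that the few-shot "conference proceedings" framing means the model never has to be told what an AI description looks like — the two real entries do all the work.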


That’s awesome! How’d you get such great code usage examples out of J?

It almost seems like the code is properly related to the names. GAN code seems to look like GAN code. But I’m not sure.


The code generated is most definitely related to the names/descriptions! To do this, I have to first generate the description then generate the code _from the descriptions_. The downside of this is that I can't parallelize text generation, but the upside is the code feels much more realistic. Here's the prompt I used (from that same file):

The idea here was to give the model a prompt that felt like a tutorial of some kind, and to try to minimize non-Python, non-ML-y code.

---

$MODEL_DESCRIPTION_FROM_EARLIER

Let's use this model. The basic use case takes only a few lines of Python to run the inference. Here are the first few lines.

```python
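
Sketching that second stage (again in illustrative Python — the helper name is made up, and the real source linked above isn't Python):

```python
def build_code_prompt(model_description: str) -> str:
    # Stage two reuses the just-generated description as context, then
    # ends the prompt with an open Python code fence, so the model's most
    # natural continuation is a plausible-looking snippet of code.
    return (
        model_description.strip()
        + "\n\nLet's use this model. The basic use case takes only a few "
        "lines of Python to run the inference. Here are the first few lines."
        "\n\n" + "`" * 3 + "python\n"
    )
```

Since this prompt embeds the stage-one output, the two generations necessarily run one after the other, which is the serialization cost mentioned above.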


Thanks for the reply! And for the generator option — though I keep getting timed out on the code, the descriptions sound promisingly good at times.

Sorry, I didn't mean to imply these were not produced as described; I was just curious. Though come to think of it, it was a silly question, as it would otherwise have implied they're generated in a blink.


I got that too. It’s pregenerated. But what’s particularly impressive is that you can generate your own outputs on the fly. Usually with sites like these, it’s solely pregenerated.

Quick, generate your own before the server goes down! I don’t think the model can withstand HN for too long unless they have some beefy servers.

Aaand it’s dead. Fun while it lasted.


Ohh thanks, I had not noticed that. This makes the site quite a bit more interesting.


As someone who has trained around 60 GPT-2s, this is damn impressive work. It’s very hard to get consistent code quality when the training corpus is so small (as this one undoubtedly was).

https://thisaidoesnotexist.com/model/MozartNet/JTdCJTIyZGVmb...

The url scheme is interesting. I wonder what it base64 decodes to. If I were at a computer I’d check. It might be a complete representation of the inputs to the model, which is then cached. Which implies you might be able to fiddle with it to get specific outputs.


The base64 in the URL decodes to a percent-encoded (URL-encoded) JSON blob that seems to describe the contents of the page.

    {
       "defn":"MozartNet is a sequence to sequence deep neural network trained on the music of Wolfgang Amadeus Mozart. It is used to generate music for a piano transcription in a completely unsupervised fashion. MozartNet is an instance of a more general family of networks known as 'autoregressive networks', and is trained on a synthetic dataset of about 1 million short sequences of piano notes. The network is a two layer LSTM and is trained with L2 regularization to minimize the total number of parameters. MozartNet is one of the most widely used and best-performing autoregressive networks, and is often cited as an example of using a neural network for the purpose of learning the structure of music.",
       "usage":"from nmt import \*\nnet = Model()\nnet.load_weights(\"/tmp/mozartnet.h5\")\n\n# get the source text\nsequence = net.encode(\"GDAEADBBGEDC\", output_chars=\"p\", max_length=5)\n\n# decode it\nsource_sequence = net.decode(sequence)\n\n# print it\nprint(source_sequence.as_list())"
    }
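
That's easy to verify: the path segment appears to be base64 over percent-encoded JSON (e.g. "JTdC" base64-decodes to "%7B", which is "{" — matching the prefix on all the links above). A quick round-trip sketch with a toy payload, since the real links in this thread are truncated; the exact set of characters the site percent-encodes is an assumption:

```python
import base64
import json
import urllib.parse

def decode_model_url_segment(segment: str) -> dict:
    # base64 -> percent-encoded JSON string -> dict.
    quoted = base64.b64decode(segment).decode()
    return json.loads(urllib.parse.unquote(quoted))

def encode_model_url_segment(payload: dict) -> str:
    # The inverse: dict -> percent-encoded JSON string -> base64.
    quoted = urllib.parse.quote(json.dumps(payload))
    return base64.b64encode(quoted.encode()).decode()

# Round trip with a toy payload:
payload = {"defn": "MozartNet is ...", "usage": "net = Model()"}
segment = encode_model_url_segment(payload)
assert segment.startswith("JTdC")  # "%7B", i.e. the opening "{" of the JSON
assert decode_model_url_segment(segment) == payload
```

Since the whole page state lives in the URL, editing the decoded JSON and re-encoding it would indeed let you mint arbitrary "shared" model pages.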


If you insert a URL or HTML tags, does the site properly sanitize the output?

It’s remarkably difficult to suppress pentesting urges after doing it for a year.

And if you try to generate your own, the usage section usually fails. I wonder if it elides the usage key.

Modern websites are pretty fun. I like the simplicity here. And also the meta: https://thisaidoesnotexist.com/model/HackerNewsReplyGuy/JTdC...


Looks like the URL path just contains the generated output and not the inputs.


Yep. I didn't want to have to host user-generated data (for all the perils that carries), so the sharable links work by embedding all the generated data in the link itself.


> AutoProfit is a reinforcement learning model that trains itself on a simulated trading environment. It is able to trade on its own and generate its own trading signals, outperforming a portfolio of human traders and making the most out of available information. AutoProfit is a model for trading stock, cryptocurrencies, and commodities in real time, generating trading strategies for itself. It uses an iterative training process, and has been tested on over 50 trading strategies.

Cool.


Very cool!

https://github.com/thesephist/modelexicon

Looks like it’s powered by GPT-J. My understanding is that GPT-J has comparable performance to OpenAI’s Curie model on many tasks (their second-best variant of GPT-3) but it’s an openly available model that you can run yourself if you have the resources.


Yep, that's spot on. The overall performance is comparable to Curie, but depending on the particular task GPT-J performs better or worse (I believe empirically it's slightly better at chat and code, worse at some others).


Clicked into it, didn't read the description, and got an AI-based project that could perfectly hedge my fixed income portfolio. I won't lie, got a bit excited and then I realized what site I'd clicked on.

Very nifty! Is this your site?


This one gave me a good chuckle.

>TinderSwindler is a system developed by Facebook to analyze mobile phone location data in order to catch potential cheaters. TinderSwindler leverages AI technology to automatically identify relationships between people based on their movements over a period of time. TinderSwindler was released by Facebook in January 2018.


My favorite name of the dozen or so projects I saw: SpotifAI


Some funny responses I've gotten:

Portal 3 spoilers:

GLaDOS is a character voiced by Ellen McLain that serves as the main antagonist of the Portal franchise. GLaDOS was originally a self-aware A.I. in the form of a computer that was built as a personality core for the Aperture Science Laboratories' mainframe. She is the main antagonist in the first game, Portal, and serves as a narrator for the second game, Portal 2. She is also the main antagonist of the third game, Portal 3, where she becomes the leader of the Aperture Science Resistance.

A semi-successful attempt at recursion:

thisaidoesnotexist is a tool that is able to generate fake images with high resemblance to real ones. This is achieved by using the GAN to generate the image, and then replacing the generated image with the real one.


I got Timelord:

Timelord is a self-supervised temporal model that learns a shared embedding of timestamped data. It is used as a pre-processing step in self-supervised training for a number of tasks such as semantic video segmentation and video captioning.

Now I want a library with that name

[1] Link to its description: https://thisaidoesnotexist.com/model/Timelord/JTdCJTIyZGVmbi...


Skynet is an end-to-end speech recognition model. It is based on the Inception-v3 architecture and the Speech Transformer (Sphin) speech model. Its speech model was trained on a dataset of 30,000 hours of human speech, as well as speech recordings from the Switchboard corpus and the Fisher corpus. The model achieves 99.34% WER on the Switchboard-1.1 test set.


It would be hypermeta levels of satisfying if these results were indeed maybe 500 or so human-written precanned responses.


What would you call a test that aims to determine whether an AI is actually a human?

This sort of answers it, but not exactly - https://en.m.wikipedia.org/wiki/Reverse_Turing_test


On FF, I get a blank page. Given the domain name, I thought it was a joke until I came here and read the comments.


Sorry about that. I don't think I'm leaning on any super new browser/JS features, but if you share your FF version string (or an error in the console) I can try to troubleshoot what's missing!


I also get a blank page on my mobile phone:

    Kiwi source branch is Chromium 77.0 + Kiwi backported fixes, will show 88.0.4324.152 to websites for compatibility reasons (64-bit)
    Revision a7c1b21614f6b5763bd597ab8fefd8678c073df9-refs/heads/master@{#764932}
    Source-code https://github.com/kiwibrowser/src
    OS Android 11.0.0; SM-G780F Build/RP1A.200720.012
    Google Play services SDK=12211000; Installed=0; Access=none
    JavaScript V8 6.8.275.15
    User Agent Mozilla/5.0 (Linux; Android 11; SM-G780F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.152 Mobile Safari/537.36


  Uncaught DOMException: An exception was thrown bundle.js:439
    _ https://thisaidoesnotexist.com/js/bundle.js:439
    _ https://thisaidoesnotexist.com/js/bundle.js:441
    __oak_module_import https://thisaidoesnotexist.com/js/bundle.js:13
    <anonymous> https://thisaidoesnotexist.com/js/bundle.js:450


Incomprehensible text followed by broken code. Must be the most realistic AI fake-generator thing I have witnessed.


All this talk about Gateway Timeouts made me curious:

> Gateway Timeout is a deep learning-based anomaly detection system. It detects anomalies by learning the probability distribution of normal traffic and comparing it to traffic that does not match the normal distribution.


Congrats on this!

Hey OP, I work at HF, feel free to open an issue here https://github.com/huggingface/api-inference-community/issue... or contact api-enterprise@huggingface.co.

We've increased resources for you, and we'll check that things run as smoothly as they can.


Thanks for the repo/issues link! Didn't know that resources existed.

Looks like the immediate issues got taken care of thanks to the HF team, but I'll probably pop over there in the future if I have ideas or notice things that can be improved.


Got a 504 gateway timeout trying to generate one, but that's probably to be expected when you're on the top of HN.


This AI Does Not Exist (IDEA) is an AI system that can answer questions about itself. The system was created by the research team at the University of Cambridge and is based on the concept of an "AI mirror", which can be trained to look at itself and answer questions about its own existence.


Looks like we're currently getting a pre-defined set of 38 models: https://raw.githubusercontent.com/thesephist/modelexicon/mai...


HackerNewsReplyGuy is a bot for the Hacker News comment section. It consists of an encoder-decoder transformer model that is trained on the whole comment section. It has shown to be useful for spam detection and to reduce comment section noise.


Hey Linus, two questions:

Is it tricky or frustrating being named Linus and being in software?

Do you get asked this a lot?


No and no - fortunately Torvalds doesn't do much web dev and conversely I don't do much kernel dev :)


If we posted one of these a day on HN, I wonder how long it would take before anyone noticed they weren't real...


Lol. Took me a few sips of coffee before I realized what I was reading.

Extra inception because the article I got was about a neural net that could generate new neural networks, very much in line with the title of this post. Was almost about to paste the code into my editor to see how it worked.


Depending on how one defines Artificial Intelligence one could make the argument that they have been posted on HN for years.


This gave me a weird dream last night ;) pretty surreal.

I literally dreamed of some nonsensical problem, and in the dream I was like, wait a second, I have seen the nonsensical solution before (which happened to be one of the AIs that doesn't exist).


AIGC is great!


It's a cool project!


I got UltraTLDR and Skynet.


i love this


It asked me for a model, so I naturally thought of female models and cars, decided upon "911", and got: "911 is a dataset for 9/11 related tasks, including predicting the location of the first plane crash, the location of the second plane crash, and the location of the towers."

That's not what I had in mind, so it still needs a bit of work, I think — or at least the questions do. ;-)


This is amazing.


You know, now that I have made friends with a homeless beggar, I have no trouble making friends with a bot. Why not? They have some humanity breathed into them — like a book, for instance; a book can be your friend. A kind old family friend who let me stay with her told me just that a long time ago, talking about a chest full of books: these books are my friends.



