Hallucination Test: Llama3, Perplexity, Claude, ChatGPT, Pi

I'm testing how well different AI tools summarize an article from the New York Times. I'm trying out Meta AI, Claude, Perplexity, Pi, and ChatGPT 3.5.

So far, Meta AI and Perplexity seem to be doing a good job, while Claude is not able to open links. The article discusses how Russia, Iran, and China are using social media channels to influence American perspectives. I'm checking if the AI tools accurately summarize the article without presenting false information. Can you take a look at the Loom linked below for more details?

Transcript

0:00 Hey, I'm just doing some checking up on the latest AI tools. Um, I wanted to check the ability of four AI tools, or maybe five, to summarize this article from the New York Times.

0:11 So what I'm going to do is copy this link here. And the first AI tool I'm going to use is Meta AI, which uses Llama 3.

0:21 Summarize this article. And it's going about its job. It's searching for it. It's interesting. So Meta uses Llama 3.

0:35 Um, okay. And now I'm going to come back to that. I'm going to start a new chat here in Claude.

0:43 Which is another AI, um, quite helpful. Please can you summarize this article? Ctrl+V, and I'll just enter that.

0:53 And then, um, unfortunately, I don't know if it's just this article. It says "I cannot open links." So Claude won't do it.

1:02 Okay, the other one I'm testing at the moment is Perplexity. And I'll paste that in there. Um, Perplexity is going to go about that.

1:12 There's another one here, Pi, um, which I am going to see if it will do it as well. Pi.ai.

1:25 So that has the option to do it there. And the next one is ChatGPT 3.5. I did pay for ChatGPT 4 but, um, yeah, we'll just do this as well.

1:36 Okay, so on the four here we have, um, Meta AI. It is doing it. Claude won't do it. Uh, Perplexity will do it.

1:47 And then, uh, Pi is doing it as well. And then ChatGPT is doing it as well. So now I guess we've got to check,

1:57 um, see if there's anything false, um, what they call hallucinations, being presented by the AI about the article. So I've read this article here.

2:05 Um, and basically it talks about, um, misinformation research companies noticing changes in behavior from Russia, Iran and China in regard to, um, America's political situation.

2:23 So it basically promotes the idea that, um, these, uh, actors, these foreign states, are putting out, um, more information on social media channels to influence American perspectives.

2:41 Because I've read it, you can trust me on it, but I'm gonna look at this Meta AI one. Um, "China and Iran are hunting dissidents."

2:50 I don't think "hunting" appears in this article, let me just check. Hunting. "Hunting" does not appear in that article, so Meta, you are presenting, uh, inaccurate information.

3:02 FBI. Okay, let's see if the word "FBI" appears in this article. No, nothing there. Okay, so not looking good, Meta.

3:13 Claude, you don't do it. So that's two down, so we've got three left that I'm choosing to run with right now. Perplexity.

3:19 Um, and we have: "This article discusses how China, Russia and Iran have been exploiting the recent campus protests." Yes.

3:25 That sounds pretty good. "State media outlets in Russia published," yep. I remember that. Let's just check that fact, that "400" appears in here anywhere.

3:36 400 news articles. Yes, so Perplexity, you're doing quite well there. Thank you. Um, NewsGuard was featured. They talk about... so they've got the link here to the source, and they've actually included the source.

3:50 Um, so, uh, I think that's pretty good, so let's give, um, Perplexity a thumbs up on that. Um, uh, let's look at this one from Pi.ai.

4:07 [unclear] by foreign adversaries, yep, that sounds correct. State-controlled media outlets, yes, that's correct. It's called Doppelganger.

4:17 Okay, let's just check if this word appears in the article. I know it does because I've just read it. But, um, it does, it appears twice.

4:24 So they're referencing correct things. Um. "Overall, the article highlights how foreign adversaries are exploiting domestic unrest to push their own agendas and undermine American democracy."

4:35 So I think that's a great summary. Let's go here, ChatGPT 3.5, very short. Alright. "Russian, Chinese, Iranian and US officials are increasingly using campus protests and student groups as tools in their geopolitical strategies."

4:50 That may be a slight twist. Are US officials using campus protests in a geopolitical strategy? Hmm. Don't know about that.

5:00 Apologies for the cut out. So, um, this is slightly incorrect. US officials are not, well, maybe they are, but it's not obvious from the article when we're just looking at the reference of the article. So, um, when it says "these governments," it's really only Russia, China and Iran, uh,

5:24 that are using these organizations, um, to shape narratives. So, um, yeah, there's a lot of hallucinations in there. ChatGPT, that's a down for me. This would send people in the wrong direction. Okay, so there we go, that's my summary, uh, quick review. Meta AI was a down on this particular

5:46 article. Claude was a down. Claude may not want to do it, that's fine, because that's their prerogative. Um, Perplexity I thought was a thumbs up. Uh, Pi was a thumbs up, um, and I do like the conciseness of it. And ChatGPT was a thumbs down. So there you go. Uh, you choose the AIs that you want. At the end of the day, those

6:06 are my five which I'm using right now. They're all different, very different tasks. So I'll send you more videos of other little things I'm doing, coding, YouTube summary videos, um, using these tools, uh, just to see how they are. Aloha.
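
The spot check used in the video, searching the article for a word a summary relies on (such as "hunting", "FBI", "400", or "doppelganger"), could be scripted roughly as below. This is only a minimal sketch: the function name `terms_missing_from_article` is hypothetical, and `sample_article` is an invented placeholder, not text from the New York Times article. A missing word is a hint to read more closely, not proof of a hallucination, since a summary can legitimately paraphrase.

```python
import re


def terms_missing_from_article(article_text, terms):
    """Return the terms that never appear in the article text.

    Matching is case-insensitive and on whole words, similar to a
    browser's find-on-page with "whole word" enabled.
    """
    missing = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if not re.search(pattern, article_text, flags=re.IGNORECASE):
            missing.append(term)
    return missing


# Invented placeholder text (an assumption for illustration only).
sample_article = (
    "State media published about 400 articles. "
    "One campaign is known as Doppelganger."
)

# Words picked out of the AI summaries during the video.
summary_terms = ["hunting", "FBI", "400", "doppelganger"]

missing = terms_missing_from_article(sample_article, summary_terms)
print(missing)  # terms worth double-checking by hand
```

With the real article text pasted into `sample_article`, any term that comes back in `missing` is a claim to verify manually against the source.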
