The LENR-CANR ChatGPT is ON LINE!

  • Quote

    And we’re extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team.


    A statement from Microsoft Chairman and CEO Satya Nadella - The Official Microsoft Blog
    This blog was updated at 11:55 p.m. Nov. 19 to reflect the latest statement from Satya Nadella.…
    blogs.microsoft.com

    "The most misleading assumptions are the ones you don't even know you're making" - Douglas Adams

  • It has a very large number of vector DBs and databases it can pull from.
    It predicts, and confirms when able to, if the person prompting it engineers the GPT to do so.

    What I find more interesting at the moment in this space is this Q* leak.
    There is some talk about Self-Taught Reasoning systems like AlphaGo, or "Self Reinforcement Learning" that creates many GPT instances or "swarms" to solve things.

    In particular, mathematical formulas that can certainly help this field, as I presented at ICCF25.

    There is a lot of speculation about its capabilities and about the name: Q* (StaR) suggests a Self-Taught Reasoner language model.
    Listening to Mr. Altman, however, he alludes many times to his interest in fusion energy, and that is exactly what we do in engineering fusion systems, right?
    Multiplying the Q value is always the goal, isn't it?

  • WTAF?


    Judges Given the OK to Use ChatGPT in Legal Rulings
    The UK now permits judges to use the “jolly useful” AI chatbot in court.
    gizmodo.com


    (Embedded video from www.youtube.com)


  • The article headline: "AI is still working out its kinks. As chatbots embed in enterprise services we're getting a better idea of just how worthless they are at this point."


    That is a typical response to new technology. At first, people think it is a panacea. Later they begin to see the flaws and limitations. They complain about it. People often complain about things like Microsoft Word. By the standards of 1980, Word is miraculously reliable, fast, and it has dozens of features we never dreamed were possible.


    ChatBots are not worthless. They are very useful for many purposes. They will quickly improve.

  • In my project to scan the Storms library, I wrote a program to compare Ed's database to mine. One part of it calls for looking up titles, and comparing them in the two databases. There are a few titles with typos in them, like this:


    production method for violent tcb jet plasma from cavity
    production method for violant tcb jet plasma from cavity


    ("Violent" is spelled wrong in the second example.)


    My program must make the titles exactly the same or it will think there are two different papers, and it will end up with duplicates in the index for the Storms Reading Room. This calls for a fuzzy search, to find titles that are similar but not quite the same. Back in the day, I would have used a "Soundex" algorithm for this, but I figured there must be a better way nowadays. After poking around on the internet I found the "Levenshtein Distance" algorithm, which is simple, short and fast; it is a more capable version of Soundex. I used a canned version of it.
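    For anyone curious, the idea behind Levenshtein Distance fits in a few lines. Here is a minimal sketch in Python (Python just for illustration; my project is in Delphi). It counts the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other:

```python
def levenshtein(s1, s2):
    """Minimum number of single-character edits turning s1 into s2."""
    # Keep only one row of the dynamic-programming matrix at a time.
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        curr = [i]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (c1 != c2)))    # substitution
        prev = curr
    return prev[-1]

# The two title variants above differ by a single substitution:
print(levenshtein(
    "production method for violent tcb jet plasma from cavity",
    "production method for violant tcb jet plasma from cavity"))  # prints 1
```

    A small threshold on this distance (say, 3 to 5 for titles of this length) flags likely duplicates while tolerating a typo or two.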


    I discussed the matter with the Bard Chatbot. Her responses are uncanny. If you did not realize this is a computer, you might think it is a person. It really does meet the original Turing test. They are going to have to come up with a new test, to keep those goalposts moving. You can tell this is a computer and not a human mainly because no human could answer in such detail so rapidly. In 2010 Chatbot responses would have been considered science fiction. Experts were saying they did not expect anything like this for 50 to 100 years.


    People who say that Chatbots are worthless should try using a conventional programming textbook. It is easier and faster to ask a Bot. Here is part of the conversation:



    ME: Are Soundex algorithms still used? They were popular in the 1970s, I think.


    BARD: You're right, Soundex algorithms were indeed popular in the 1970s! But surprisingly, they're still very much in use today, although in some ways it's a hidden presence. Here's why:


    Yes, Soundex is still actively used!

    • Widely implemented in databases: Major relational databases like MySQL, PostgreSQL, and Oracle have built-in Soundex functions. This makes it readily available for tasks like approximate string matching and finding similar records in large datasets.
    • Legacy systems: Many older systems still rely on Soundex for tasks like name matching or historical record lookup. While not ideal for all cases, it can be efficient for these specific situations.
    • Influence on modern algorithms: Modern phonetic algorithms often take inspiration from Soundex principles. Its simplicity and efficiency still provide a valuable foundation for more complex techniques.

    However, it's important to consider its limitations:

    • Accuracy: Soundex can sometimes group together names that sound different or miss very similar names due to its simplified encoding rules. This can lead to false positives or negatives.
    • Language bias: The original Soundex was designed for English and may not handle other languages well.
    • Alternatives exist: More advanced algorithms like Levenshtein Distance offer greater accuracy and flexibility for more demanding tasks.

    Overall, Soundex remains a relevant tool in specific situations, particularly for its ease of use and historical footprint. But when precision and handling diverse data are crucial, more advanced options should be considered.


    So, while the spotlight might have shifted to newer algorithms, Soundex's impact continues to echo in the background, playing a steady role in the world of string matching and information retrieval.


    ME: I surmise that Soundex was popular in the 1970s because the algorithm was simple and could be implemented without much ram or processing overhead. Is that correct?


    BARD: Yes, your surmise is accurate! Soundex's popularity in the 1970s was largely due to its simplicity and efficiency, which aligned perfectly with the technological constraints of the era. Here's a deeper look at those key factors:


    1. Simplicity of the Algorithm:

    • Straightforward rules: Soundex transforms names into 4-character codes based on their phonetic similarity, using a set of simple rules that are easy to implement.
    • No language model or complex analysis: It doesn't require extensive linguistic knowledge or computationally intensive processing, making it accessible even with limited resources.

    2. Minimal RAM Requirements:

    • Compact codes: The 4-character codes are memory-efficient, allowing for storage and comparison of large datasets without straining the limited RAM of 1970s computers.
    • No need for storing full strings: The algorithm only stores the codes, significantly reducing memory usage compared to storing entire names or texts.

    3. Low Processing Overhead:

    • Fast encoding and comparison: The rules for generating and comparing Soundex codes are computationally inexpensive, enabling quick retrieval and matching of records even on less powerful processors.
    • Optimized for speed: The algorithm was specifically designed to prioritize speed and efficiency, making it ideal for the processing capabilities of the time.

    In a nutshell, Soundex's clever balance of simplicity and effectiveness made it a perfect fit for the technological landscape of the 1970s. It enabled approximate string matching and information retrieval tasks that would have been challenging or impractical with more resource-intensive algorithms, cementing its popularity in that era.
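    The algorithm Bard describes really is that small. Here is a sketch of the classic American Soundex rules in Python (again, Python just for illustration; edge cases such as the handling of h and w vary between implementations):

```python
# Digit groups for the classic (American) Soundex code.
CODES = {}
for digit, letters in [("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                       ("4", "l"), ("5", "mn"), ("6", "r")]:
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Encode a name as a 4-character Soundex code: initial + 3 digits."""
    name = name.lower()
    code = ""
    prev = CODES.get(name[0], "")   # the initial's digit also suppresses repeats
    for ch in name[1:]:
        if ch in "hw":
            continue                # h and w are transparent
        d = CODES.get(ch, "")       # vowels and y get no digit...
        if d and d != prev:
            code += d
        prev = d                    # ...and they reset the repeat check
    return (name[0].upper() + code + "000")[:4]

print(soundex("Robert"), soundex("Smith"), soundex("Smyth"))  # R163 S530 S530
```

    Note how "Smith" and "Smyth" collapse to the same code, which is exactly the behavior that made Soundex useful for name matching, and also the source of its false positives.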

  • Here is a similar discussion I just had with ChatGPT. I already knew the answer but I wanted to see how well ChatGPT explains it. The author quoted above who thinks that ChatBots are worthless probably does not use them for any technical purpose such as writing programs or translating chemistry papers. I think anyone can see this exchange is useful, and generally correct. I already have a version of the Levenshtein algorithm. I have not tried compiling or using ChatGPT's version. If it does not work, I expect I can fix it in a few minutes.


    In the example here, ChatGPT "understands" the problem with typos, and she invented a plausible example of a typo that fits the purpose I explained. This is uncanny!


    Yesterday, I spent maybe 15 or 20 minutes looking through online programming gab groups for a modern fuzzy search method similar to Soundex. If I had thought to ask ChatGPT, I would have found the answer in a few minutes. This is not a huge advantage, but there are many complicated problems that take longer than 15 minutes to work out.



    User

    I want to do a fuzzy search through a list of titles using Delphi Pascal. This list has titles of academic papers such as "The Emergence of a Coherent Explanation for Anomalies Observed in D/Pd and H/Pd Systems; Evidence for 4He and 3He Production." The length of the titles varies from 10 to ~200 characters. I want to compare two lists to see if there are duplicates. There are some typos in both lists, such as "violant" instead of "violent." So this has to be a fuzzy search. What algorithm or algorithms would you recommend for this kind of search?


    ChatGPT

    Performing fuzzy searches in Delphi Pascal for comparing academic paper titles with potential typos requires a string matching algorithm that can handle variations in the text. Levenshtein distance, also known as edit distance, is a common algorithm for this purpose. It measures the minimum number of single-character edits (insertions, deletions, or substitutions) needed to change one word into the other.


    Here's a basic implementation of the Levenshtein distance algorithm in Delphi Pascal:


    function LevenshteinDistance(const s1, s2: string): Integer;
    var
      len1, len2: Integer;
      matrix: array of array of Integer;
      i, j: Integer;
    begin
      len1 := Length(s1);
      len2 := Length(s2);

      SetLength(matrix, len1 + 1, len2 + 1);

      for i := 0 to len1 do
        matrix[i, 0] := i;

      for j := 0 to len2 do
        matrix[0, j] := j;

      for i := 1 to len1 do
        for j := 1 to len2 do
        begin
          // Math.Min takes two arguments, so nest the calls
          matrix[i, j] := Min(
            Min(matrix[i - 1, j] + 1, matrix[i, j - 1] + 1),
            matrix[i - 1, j - 1] + Ord(s1[i] <> s2[j])
          );
        end;

      Result := matrix[len1, len2];
    end;



    You can use this function to calculate the Levenshtein distance between two strings and then set a threshold for considering strings as similar or not.


    Here's how you might use it for your case:



    function IsSimilar(const s1, s2: string; threshold: Integer): Boolean;
    begin
      Result := LevenshteinDistance(s1, s2) <= threshold;
    end;


    You can iterate through your lists of titles and use the IsSimilar function to determine if titles are similar enough to be considered duplicates.



    var
      title1, title2: string;
    begin
      title1 := 'The Emergence of a Coherent Explanation for Anomalies Observed in D/Pd and H/Pd Systems; Evidence for 4He and 3He Production.';
      title2 := 'The Emergence of a Coherent Explanation for Anomalies Observed in D/Pd and H/Pd Systems; Evidence for 4He and 3He Productoin.';

    [Remarkable!]

      if IsSimilar(title1, title2, 5) then
        Writeln('Titles are similar')
      else
        Writeln('Titles are not similar');
    end;



    In this example, a threshold of 5 is used, meaning that titles with a Levenshtein distance of 5 or less will be considered similar. You may need to adjust the threshold based on your specific requirements and the nature of the typos in your data.



    User

    That is helpful! Thanks.


    ChatGPT

    You're welcome! I'm glad I could help. If you have any more questions or if there's anything else I can assist you with, feel free to ask. Happy coding!


    [The distance in this example is 2]

  • QUOTE:


    US news organisation the New York Times is suing ChatGPT-owner OpenAI over claims its copyright was infringed to train the system.


    The lawsuit claims "millions" of articles published by the New York Times were used without its permission to make ChatGPT smarter . . .


    That's hilarious. I would point out to the New York Times that their data was also used to educate millions of people, who have natural intelligence. Are they going to sue all of their readers? Or all readers who read copies in the library without paying for the newspaper?


    Why would you make a distinction between artificial and natural intelligence? I do not see why one is a copyright infringement and the other is not.


    Will the Times also sue Google and others who index information from the Times?

  • They make the distinction of personal use or for-profit use, I presume.

  • They make the distinction of personal use or for-profit use, I presume.

    Many individual people and many organizations use information from the New York Times for profit. Reporters at other newspapers read the Times and use the information to write their own articles, in their own words, which is what ChatGPT does. I do not see why doing this with artificial intelligence is any different than doing it with human intelligence. ChatGPT does this thousands of times faster and more thoroughly than a person could, so it is a bigger threat to the Times than a single reporter would be, but there are probably thousands of reporters, writers, social influencers and others who make use of the Times. Who -- in effect -- plagiarize it. The Times is not trying to stop these people. Collectively, I suppose they do as much harm as ChatGPT does, assuming you consider it harm. I would say it is "influence," not "harm." I would welcome it. If I could have thousands of reporters quote me when cold fusion comes up, I would like that. If ChatGPT usually quoted LENR-CANR.org instead of Wikipedia, I would be thrilled.


    Perhaps the Times could negotiate with OpenAI to have ChatGPT say, "according to the New York Times . . ." That would be to their advantage. When you are confronted with disruptive new technology, it is better to find a way to live with it, and take advantage of it, than it is to fight it. This is a fight the Times cannot win.

  • Perhaps the Times could negotiate with OpenAI

    It has gone way beyond that. The NYT have been negotiating for quite some time, and the legal action is the result of the talks breaking down.

    They make the distinction of personal use or for-profit use, I presume.

    Yes - it seems they do. Personal subscriber rates are different to the charges they make for corporate access to their database.


    If you use TwitterX, there is a good thread here:


    (Embedded thread from twitter.com)


    In a nutshell, the NYT claim (with 100 examples) that ChatGPT can be prompted to regurgitate complete, verbatim NYT articles. Hence, not only is this de facto mass republishing (and not "fair use"), but by allowing the public to read NYT articles ChatGPT is depriving the company of income from potential subscriptions, as well as advertising revenue from website visits. This differs from, say, a Google search highlighting a NYT article - which then prompts people to visit the NYT website. It also differs from a journalist on a rival newspaper paraphrasing a NYT article after it has been published.


    They are also complaining (again with examples) of spurious "hallucination" articles, which ChatGPT says are from the NYT, and are even "credited" to genuine NYT authors - potentially damaging their careers.


    I don't think this is a frivolous lawsuit, as whatever the outcome, it could have far-reaching consequences for both journalism and the whole AI industry.



  • Many individual people and many organizations use information from the New York Times for profit. Reporters at other newspapers read the Times and use the information to write their own articles, in their own words, which is what ChatGPT does. I do not see why doing this with artificial intelligence is any different than doing it with human intelligence. ChatGPT does this thousands of times faster and more thoroughly than a person could, so it is a bigger threat to the Times than a single reporter would be, but there are probably thousands of reporters, writers, social influencers and others who make use of the Times. Who -- in effect -- plagiarize it. The Times is not trying to stop these people. Collectively, I suppose they do as much harm as ChatGPT does, assuming you consider it harm. I would say it is "influence," not "harm." I would welcome it. If I could have thousands of reporters quote me when cold fusion comes up, I would like that. If ChatGPT usually quoted LENR-CANR.org instead of Wikipedia, I would be thrilled.


    Perhaps the Times could negotiate with OpenAI to have ChatGPT say, "according to the New York Times . . ." That would be to their advantage. When you are confronted with disruptive new technology, it is better to find a way to live with it, and take advantage of it, than it is to fight it. This is a fight the Times cannot win.

    A human cannot memorize and categorize the entire New York Times historic output. Just downloading it all would have a cost.

  • In a nutshell, NYT claim (with 100 examples) that ChatGPT can be prompted to regurgitate complete, verbatim, NYT articles.

    Ah, I see. That is another story. I did not know that ChatGPT sometimes quotes verbatim. I thought tokenization prevented that.


    I expect the programmers can prevent that.

    A human cannot memorize and categorize the entire New York Times historic output. Just downloading it all would have a cost.

    I assume OpenAI paid whatever it costs to download the data. The data is behind a paywall. Surely they did not hack the New York Times website!


    As a subscriber, I can get any New York Times article ever published. It takes a while to find and download the article. I suppose it would take hundreds of man years to get them all. There must be some batch download feature available at some cost.

  • I did not know that ChatGPT sometimes quotes verbatim. I thought tokenization prevented that.

    I suspect that is what the people at NYT thought too - until they discovered the embedded articles...

    There must be some batch download feature available at some cost.

    I guess there is some extra back story here that we don't yet know. Some of the legal claims seem to relate to changes in OpenAI's status to a "for profit" commercial operation, which is interesting.


    My hunch is that during the early R&D phase OpenAI might have struck some deal with the NYT to get cheap (or even free) direct access to their vast article database - on the basis that it was an academic exercise, for "training only", and that text would be tokenized. It might have even been tax-deductible, on that basis.


    It does seem that there were more recent "negotiations" - and maybe they were prompted by the apparent success of ChatGPT, which might have come as a surprise to the NYT management. I guess it might all come out during the court proceedings.


  • I did not know that ChatGPT sometimes quotes verbatim. I thought tokenization prevented that.

    ChatGPT herself told me that tokenization prevents verbatim quotes. I have been trying to get my copy of ChatGPT to quote papers at LENR-CANR.org verbatim. She won't do it. She says she cannot remember exactly what they say because they are tokenized.
