LENR-CANR.org downloads increased by ~14,000

  • Since Jan. 2021, the average number of downloads at LENR-CANR.org per month has been 9,085. It has been trending up recently. In July 2023 it suddenly increased to 27,848, a level it has not reached since Jan. 2017. See:

    https://lenr-canr.org/wordpress/?page_id=1213

    https://docs.google.com/spread…vm6T2Uk4YYsii0/edit#gid=0

    I do not know what is causing this. I can sometimes identify the source of an increase, for example, in a mass media article. But not this time. There has also been an increase in the number of people viewing the YouTube video.

Most of the increase comes from readers who download all of the papers. They are from many parts of the world. I checked carefully and confirmed that these are not robots from Google or some other web-indexing service. They are humans from various ISPs. I checked many of the ISPs. They include things like GOOGL-2, which is a cloud service for people and corporations, definitely not for a Google robot. Google publishes a list of its robot ISPs.

    Some of the human readers probably used programs such as WebReaper, which download files en masse. You can tell they are not robot readers.

    I exclude robot readers after identifying them by various methods. The easiest method is to look at the records. They often include text identifying the reader as a robot, with names such as:

    http://www.google.com/bot.html
    help.yahoo.com/help/us/ysearch/slurp
    search.msn.com/msnbot.htm
    http://www.alltheweb.com/help/webmaster/crawler

    I have identified 131 of these.
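    The substring check described above can be sketched in a few lines of Python. The signature list mirrors the robot identifiers quoted above; the `is_robot` helper and the sample user-agent strings are illustrative, not taken from the actual LENR-CANR.org logs:

    ```python
    # Known robot signatures, as they appear in log user-agent fields.
    ROBOT_SIGNATURES = [
        "google.com/bot.html",
        "help.yahoo.com/help/us/ysearch/slurp",
        "search.msn.com/msnbot.htm",
        "alltheweb.com/help/webmaster/crawler",
    ]

    def is_robot(user_agent: str) -> bool:
        """Return True if the user agent contains a known robot signature."""
        ua = user_agent.lower()
        return any(sig in ua for sig in ROBOT_SIGNATURES)

    # Hypothetical examples:
    print(is_robot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # True
    print(is_robot("Mozilla/5.0 (Windows NT 10.0) Chrome/115.0"))  # False
    ```

    A real classifier would combine this with the other methods mentioned (IP ranges, request patterns), since a bot can send any user agent it likes.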

  • Hey Jed, I have downloaded all docs using the URLs you sent to David for the archive. It will only run once a month, on the 27th of each month going forward, but while testing and setting up the backup I did download all files about 5-6 times. The source IP would be the forum server, but it's not the same as the IP the LENR-forum.com domain points to, and I do not want to post that IP on here.


    Are you able/willing to send me the raw logs with URL, user agent and IP or give me access to the logs so I can do some analysis?

  • Hey Jed, I have downloaded all docs using the URLs you sent to David for the archive. It will only run once a month, on the 27th of each month going forward, but while testing and setting up the backup I did download all files about 5-6 times.

    Ah, that might explain a lot of it. There are 1,168 files. 1168 * 6 = 7008. Less than 14,000. Do you think you downloaded 12 times?


    Can you tell me one or two of the days, or the URL? Did you download many files on July 28?


    If we can identify you, I might reclassify you as a robot. I should do that if you are going to repeat this every month. By the way, most real robots just check the file size and creation date, and they do not download a file they already have. If you are using a utility program such as WebReaper, it might be set to skip files you already have, so I would only see 1,168 downloads from you. It has been a while since I used WebReaper, so I asked the Bard chatbot about that. See the discussion below.


    Are you able/willing to send me the raw logs with URL, user agent and IP or give me access to the logs so I can do some analysis?

    I do not think that would be a good idea. It would violate the readers' privacy.


    People do not realize how much information you reveal just by going to a website. Any website, even one without cookies, such as LENR-CANR.org. You can avoid revealing anything with a VPN.



    BARD: [WebReaper] It is a free and open-source web crawler or spider program that can be used to download files from a website. It is a command-line tool, so you can run it from the terminal.


    WebReaper works by following links on a website and downloading the files that it finds. It can download images, PDFs, CSS files, JavaScript files, and any other type of file that is found on a website.



    ME: Suppose I download a file from a webpage. I go back a month later. The same file is there, with the same size and creation date. Does the program download it again, or does it check the size and date and skip that file?


    BARD: Yes, WebReaper can check the size and date of a file before downloading it. This is called file fingerprinting. When WebReaper downloads a file, it stores the file's size and creation date in a fingerprint database. When WebReaper encounters a file that it has already downloaded, it will check the file's fingerprint against the fingerprint database. If the file's fingerprint matches a file that is already in the database, WebReaper will skip the file and not download it again.

    This can save a lot of time and bandwidth, especially if you are downloading large files or files that are frequently updated.


    To enable file fingerprinting in WebReaper, you need to set the -f or --fingerprint flag when you run the program. For example, to download all of the files in a directory and check their fingerprints, you would run the following command:

    webreaper -f directory

    Additional discussion: -f is not the default. So it will download everything unless you set -f.
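    The fingerprinting behavior Bard describes can be sketched as a simple cache keyed on file size and modification date. This is illustrative Python, not WebReaper's actual implementation; the URLs and dates are made up:

    ```python
    # Remember each file's (size, last_modified) pair; skip a re-download
    # when both match what we saw last time.
    fingerprints = {}  # url -> (size, last_modified)

    def should_download(url, size, last_modified, fingerprint=True):
        """Return True if the file should be (re)downloaded."""
        if not fingerprint:          # like running without -f: always download
            return True
        if fingerprints.get(url) == (size, last_modified):
            return False             # unchanged since last visit: skip
        fingerprints[url] = (size, last_modified)
        return True

    print(should_download("/a.pdf", 1024, "2023-07-01"))  # True (first visit)
    print(should_download("/a.pdf", 1024, "2023-07-01"))  # False (unchanged)
    print(should_download("/a.pdf", 1024, "2023-07-01", fingerprint=False))  # True
    ```

    This is why a fingerprinting downloader run monthly would only show up as ~1,168 downloads the first time, while one with fingerprinting off would re-download everything on every run.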

  • I am the one downloading for an AI project JedRothwell :)
    I have deployed several autonomous agents with the goal of extracting theoretical frameworks of LENR. You might be able to identify it with my IP address, perhaps? I used a Python library (mostly Beautiful Soup) that searches and scrapes data from the internet.

    However, it also might be due to a renewed interest in LENR lately and the Web-ranking of http://www.LENR-CANR.org, which is outstanding!

    More people are warming up to the idea of LENR thanks to the work of everyone in the field.
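  • The kind of scraper described above (the poster used Beautiful Soup, a third-party library) can be sketched with only the standard library's `html.parser`. The HTML snippet and file name below are invented for illustration; a real agent would fetch pages over HTTP first:

    ```python
    from html.parser import HTMLParser

    class PdfLinkCollector(HTMLParser):
        """Collect href values that point to PDF files."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value and value.lower().endswith(".pdf"):
                        self.links.append(value)

    # Hypothetical page fragment, not real LENR-CANR.org markup:
    html = '<a href="/acrobat/example-paper.pdf">paper</a> <a href="/index.html">home</a>'
    parser = PdfLinkCollector()
    parser.feed(html)
    print(parser.links)  # ['/acrobat/example-paper.pdf']
    ```

    Downloading every link found this way, without a fingerprint check, would produce exactly the kind of repeated full-archive downloads discussed earlier in the thread.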

  • I am the one downloading for an AI project

    Ah, yes.


    You might be able to identify it with my IP address, perhaps?

    Sure. Do you have a specific user agent I can look for? Approximately how many downloads did you make? If it is ~1,000 I am not going to worry about it.


    the Web-ranking of http://www.LENR-CANR.org, which is outstanding!

    Which web-ranking? There are several. They show different numbers.

  • Ah, yes.


    Sure. Do you have a specific user agent I can look for? Approximately how many downloads did you make? If it is ~1,000 I am not going to worry about it.


    Which web-ranking? There are several. They show different numbers.

    Chrome with an IP from Oregon, US.


    Probably no more than 500 DLs.


    All the search rankings look good for the specific Keyword: LENR. Only Wiki and now the ARPA-E have you beat on Google.
    ARPA-E has cheat codes being tied to the government though.. :D
    Are you trying to increase your rankings on Search?

  • Can you extract all the unique User Agents?


    It's easy to set a fake user agent for any bot!


    The user agent used by OpenAI plugins making live queries (ChatGPT-User) is:

    Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot




    This is not the same user agent used by the crawlers that gathered data for training the models; that training data came from various sources, not everything on the internet.

    https://datadome.co/threat-research/how-chatgpt-openai-might-use-your-content-now-in-the-future/


    If you can't identify it by IP or user agent, then you can try to identify patterns based on date/time grouped by IP, or date/time grouped by user agent, then maybe IP by location.
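    Both suggestions, extracting unique user agents and grouping requests by IP, can be sketched against the common Apache/nginx "combined" log format. The regex and the sample lines below are illustrative assumptions, not the actual LENR-CANR.org log layout:

    ```python
    import re
    from collections import Counter

    # Apache/nginx "combined" format: ip ident user [time] "request" status size "referer" "user-agent"
    LOG_RE = re.compile(
        r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
    )

    def analyze(lines):
        """Return the set of unique user agents and a per-IP request count."""
        agents, per_ip = set(), Counter()
        for line in lines:
            m = LOG_RE.match(line)
            if m:
                agents.add(m.group("ua"))
                per_ip[m.group("ip")] += 1
        return agents, per_ip

    # Invented sample lines (203.0.113.x is a documentation-only IP range):
    sample = [
        '203.0.113.7 - - [28/Jul/2023:10:00:00 +0000] "GET /a.pdf HTTP/1.1" 200 1024 "-" "Chrome/115"',
        '203.0.113.7 - - [28/Jul/2023:10:00:01 +0000] "GET /b.pdf HTTP/1.1" 200 2048 "-" "Chrome/115"',
    ]
    agents, per_ip = analyze(sample)
    print(sorted(agents))         # ['Chrome/115']
    print(per_ip["203.0.113.7"])  # 2
    ```

    A single IP with thousands of requests in a short window, all with one user agent, is the date/time pattern being suggested here, even when the user agent itself looks like an ordinary browser.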

  • Can you tell me the exact address?


    Which tool do you recommend? This one produces useful results:


    https://www.thehoth.com/search-engine-rankings/

    I will have to investigate what IP it was at the time and I can send it to you via email.


    Using the native analytics of the search platform is probably the best way to get information, but if you don't want to deploy back-end tracking, you could use http://www.thehoth.com or http://www.SEMrush.com for a more detailed analysis. I think you have to enter an email address for SEMrush, and it was free for tracking one site last time I checked.
