I'm disabling things in Firefox to troubleshoot blocking

tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

I'm disabling things in Firefox to troubleshoot blocking

Post by tepples »

(was: I've disabled JavaScript on this forum in my browser)

Lately I have been seeing 30-minute outages on nesdev.com, forums.nesdev.com, and wiki.nesdev.com. The "Down for Everyone or Just Me" tool confirms that the outages affect only me, but when the outage happens, it affects all three hostnames. I've been told that the outages may be the result of a firewall on the site's end blocking my IP address for opening too many connections in quick succession, and traceroute results bear this out.

So in order to limit the number of connections that Firefox 73 on my computer makes to forums.nesdev.com, I am using the JavaScript Switcher extension to configure Firefox not to run JavaScript on forums.nesdev.com. I don't know if it will help. Someone else suggested decreasing network.http.max-persistent-connections-per-server in about:config, but I rejected that approach because this setting applies to all websites, not just to forums.nesdev.com, and could hurt the performance of Firefox on sites with less strict firewalls.
Ice Man
Posts: 547
Joined: Fri Jul 04, 2014 2:34 pm

Re: I've disabled JavaScript on this forum in my browser

Post by Ice Man »

I had an outage of 2 days until today and it certainly wasn't JS nor Firefox.
I tried 3 different devices, mobile and PC. All with different browsers and I got network timeout on all of them.
However, when using a VPN or Proxy the site would load.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by tepples »

A lot of things I tried appeared to work for a while but eventually got me blocked again because this is so intermittent. Turning off JS didn't work long term. Setting network.http.speculative-parallel-limit to 0 and network.prefetch-next to false in about:config didn't work long term. Refreshing my profile in Firefox about:support didn't work long term. Switching the "Disable Port Scan and DoS Protection" checkbox didn't do anything at all. The only surefire way that I have found to avoid a 30-minute block on my primary laptop is waiting at least 10 seconds between page views, as I have currently-private reasons to suspect that 10 seconds is the duration of the firewall's SYN flood detection window.

People have suggested that I boot from a USB flash drive and try to access NESdev.com that way, but I doubt that will be convenient enough to work long term.

People have suggested that I use wired Ethernet instead of Wi-Fi, but this laptop model lacks an Ethernet jack, and I own no USB Ethernet adapter.

One thing that may be correlated is that I replaced my rented cable modem router with a NETGEAR C6250. People report SYN floods and other misbehaviors with that model (source). Users in #nesdev on EFnet have suggested that I return the device to Best Buy and buy a Motorola or Arris. I could try more troubleshooting, such as reinstalling the operating system on my laptop, but I'm running out of time before the 30-day return window on this cable modem router ends. That and when I visit kb.netgear.com (NETGEAR Knowledge Base) and then view the router's log (Advanced > Administration > Logs), the log shows a "DoS attack: SYN Flood" coming from port 443 on the same IP address as kb.netgear.com. If NETGEAR can't get its own website to not emit SYN floods, what makes me think NETGEAR can do the same with its cable modem routers?
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by tepples »

I've started to use Wireshark with a capture filter host forums.nesdev.com, and I'm noticing what I believe to be some strange stuff.

My computer closes (FIN ACK) a connection on one ephemeral port and sends SYN packets to open seven more connections within one-tenth of a second. Then for each of them, it receives a SYN-ACK (acknowledgment of new connection), and then proceeds to RST (forcibly close) the connection, plus RSTing the retransmissions of the SYN-ACK 1, 3, 7, 15, and 31 seconds later.
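
The 1, 3, 7, 15, and 31 second offsets look like plain exponential backoff. Here is a minimal Python sketch of the arithmetic, assuming the server's first SYN-ACK retransmission timeout is 1 second and doubles after every attempt. That is an assumption about the server's TCP stack, not something the capture proves, and `retransmit_offsets` is just an illustrative name.

```python
from itertools import accumulate

def retransmit_offsets(initial_timeout=1.0, retries=5):
    """Seconds after the original SYN-ACK at which each retransmission
    would fire, assuming the timeout doubles after every attempt."""
    timeouts = [initial_timeout * 2 ** i for i in range(retries)]
    # Running total of 1, 2, 4, 8, 16 gives the offsets seen on the wire.
    return list(accumulate(timeouts))

print(retransmit_offsets())  # prints [1.0, 3.0, 7.0, 15.0, 31.0]
```

If the offsets in a capture deviate from this series, something other than a simple doubling timer is probably at work.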

This pattern happens when I don't get blocked. Not getting blocked may be correlated with having recently restarted Firefox, the computer, or the modem router.

A Google search for "syn, syn-ack, rst" brings up an <a href="https://osqa-ask.wireshark.org/question ... on">answer by Christian_R on Wireshark's OSQA site</a>, which explains that RSTs may be caused by a timestamp echo (TSecr) in an ACK not matching the sent timestamp (TSval). This may happen if a load balancer manipulates timestamps in violation of RFC 1323. The TSecr of the SYN-ACK is supposed to match the TSval of the SYN. But in this case, it does match, so that explanation doesn't apply here.

Thus I have four different variables to manipulate to see if I can reproduce this SYN, SYN-ACK, RST pattern:

A. Use the neighbors' rented Arris modem
B. Use wiki.nesdev.com on my modem (different website on same IP address)
C. Use an unrelated website on my modem
D. Install Wireshark on my backup laptop, the one I don't remember having been blocked recently
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by tepples »

Using Wireshark, I managed to capture a log of being blocked in two clicks. Filtering on tcp.flags.syn==1 && tcp.flags.ack == 0 shows that the thing tried to establish 19 connections to forums.nesdev.com in 1.5 seconds for the first page view and 17 connections for the second. The client immediately closed them with RST. I don't know how the [censored] Firefox is opening so many connections. But now I'm feeling like turning off browser.urlbar.speculativeConnect.enabled over this.
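
To put a number on "too many connections" outside Wireshark, the same filter (SYN set, ACK clear) can be applied to exported packet records and counted over a sliding window. This is a rough Python sketch: the `(time, syn, ack)` tuple layout, the function name, and the sample data are made up for illustration, and the 10-second default is only the detection window suspected earlier in this topic, not a known firewall setting.

```python
from collections import deque

def max_syns_in_window(packets, window=10.0):
    """packets: iterable of (time_seconds, syn_flag, ack_flag), sorted by time.
    Returns the largest number of pure SYNs (SYN set, ACK clear) that fall
    within any span shorter than `window` seconds."""
    recent = deque()  # timestamps of pure SYNs still inside the window
    peak = 0
    for t, syn, ack in packets:
        if not (syn and not ack):
            continue  # skip SYN-ACKs and everything else
        recent.append(t)
        while t - recent[0] >= window:
            recent.popleft()
        peak = max(peak, len(recent))
    return peak

# Hypothetical capture: three SYNs in 0.2 s, one SYN-ACK, one late SYN.
sample = [
    (0.0, True, False),
    (0.05, True, True),
    (0.1, True, False),
    (0.2, True, False),
    (12.0, True, False),
]
print(max_syns_in_window(sample))  # prints 3
```

Comparing the peak for a blocked session against a not-blocked one would show whether the SYN count alone separates the two cases.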

I also captured a few cases when I was not blocked. In each phpBB case, I poked around looking at unread posts, looking into a few forums, opening several messages in tabs, reading and deleting some private messages, and (in sufficiently new phpBB) looking at my notifications.

Use another laptop
Different client, same modem router, same ISP, same server, same website software

I didn't notice a qualitative difference in a not-blocked Wireshark capture from a ThinkPad running Debian 10 compared to a not-blocked capture from my usual Dell running Ubuntu 18.04. Still can't manage to get blocked there though.

Use the neighbors' rented Arris modem router
Same client, different modem router, same ISP, same server, same website software

Firefox is SYNing a whole bunch and then RSTing the connections it doesn't use.
I've found one concrete difference between how a neighbor's rented Arris cable modem router treats my packets and how a purchased NETGEAR C6250 treats them.

Either Firefox or Linux appears to be closing unused connections with RST. On the Arris, that's the end of the conversation. On the NETGEAR, I receive several retransmissions from forums.nesdev.com 1, 3, 7, 15, and 31 seconds after each RST, and my computer sends another RST after each. So the NETGEAR may be blocking these RSTs from getting out, and that might be running up the "connections that appear to be open" count on the other end.

Now I've noticed a difference I can characterize. After I finish the rest of the testing, I'll try a factory reset on the NETGEAR, and if I still see retransmissions, it goes back to Best Buy and I exchange it for an Arris like I should have in the first place.

Use the wiki
Same client, same modem router, same ISP, same server, different website software

I went back home and tried another website behind the same firewall running different software: wiki.nesdev.com.

Saw some SYN-then-RST-then-retransmission, but not quite as many at the same time as on the forums.

Use an unrelated website
Same client, same modem router, same ISP, different server, same website software

I've chosen forum.gbadev.org and www.smspower.org as my test cases. Both are HTTPS web forums centered around homebrew development for an obsolete video game platform. One runs phpBB version 3, the other phpBB version 2.

forum.gbadev.org and www.smspower.org show clean captures, with no repeated attempts to keep sending stuff to the same ephemeral port even after RST.

Light theme
Same client, same modem router, same ISP, same server, same website software, different theme

I have been using the "Prosilver (Dark Edition)" board style because it most closely resembles the board's theme before the upgrade, just with gray instead of blue. But others told me they didn't see any of the symptoms I was seeing. They could hammer forums and PMs as much as they wanted without getting blocked. I conjecture that one difference is that they may be using the default "prosilver" theme.

Still saw loads of retransmissions.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by tepples »

It wasn't the NETGEAR cable modem router at all.

On an Arris Surfboard, I don't see the retransmissions anymore. That's an improvement, but it's not enough. I'm still seeing a ton of SYN-ACK followed by RST when viewing this very topic, and that alone appears to be getting me blocked. The RSTs appear to be getting through, but they aren't causing the firewall to treat the connection as fully closed instead of embryonic.

The SYN, SYN-ACK, RST sequences don't seem to go away if I restart Firefox, restart my Xfce session, restart Linux, restart the router, or even use the neighbor's router.

So I opened the network debugging tool in hamburger menu > Web Developer > Network (Ctrl+Shift+E), and I browsed around before I saw over a dozen SYNs in quick succession. Under "Transferred:" I see a bunch of resources that are cached, along with a bunch of things like 174 B (raced), 336 B (raced), 248 B (raced), etc. for the GIFs. This refers to the "race cache with network" feature:
The Race Cache With Network (RCWN) feature in Necko adds the ability to race the cache with the network when the cache is slow. So if reading from the disk is slow, we will send a network request, and return the channel from the network, even though we have the entry in the cache. This way we provide the content to consumers faster.
An answer by David Balažic to the question "How to turn off network probing in Firefox?" suggests that RCWN can produce behavior very similar to what I'm seeing in Wireshark. This led me to look for a correlation between SYN-storms and the (raced) responses. In particular, the problem doesn't show up on SMS Power because SMS Power is behind a reverse proxy that uses HTTP/2, and HTTP/2 avoids the head-of-line blocking that causes Firefox to try so many parallel connections in the first place.

I opened about:config, turned off network.http.rcwn.enabled, opened Wireshark, watched host forums.nesdev.com, opened the Network panel in Firefox, and browsed around. Result: zero (raced), and zero SYN, SYN-ACK, RST sequences.
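
For anyone who would rather not click through about:config, Firefox also applies `user_pref(...)` lines from a `user.js` file in the profile directory at startup. The following is a small Python sketch that renders such a line for the preference above. The helper name `render_user_js` is made up, and where the file goes depends on your own profile path, which isn't shown here.

```python
import json

# Preference taken from this topic; False turns "race cache with network" off.
PREFS = {
    "network.http.rcwn.enabled": False,
}

def render_user_js(prefs):
    """Render a dict of preferences as user.js lines.
    json.dumps yields the JavaScript literals user.js expects
    (false/true, numbers, double-quoted strings)."""
    lines = [
        f"user_pref({json.dumps(name)}, {json.dumps(value)});"
        for name, value in prefs.items()
    ]
    return "\n".join(lines) + "\n"

print(render_user_js(PREFS), end="")
# prints: user_pref("network.http.rcwn.enabled", false);
```

Note that a preference set in user.js is reapplied on every start, so the line has to be removed from the file before the setting can be changed back in about:config.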

This might be my breakthrough. I'll try to let you know in a few days.
Dwedit
Posts: 4924
Joined: Fri Nov 19, 2004 7:35 pm
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by Dwedit »

Okay, stupid question time... Does logging in as an admin cause more web requests than otherwise?
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by tepples »

I doubt it. Admins just get an extra link at the bottom of certain pages to the administration control panel.

One of the most problematic parts turned out to be the "Smilies". These are images smaller than 500 bytes that appear in posts by using emote codes starting with a colon, such as :) or :beer:. The post composition form draws all available smilies, of which there are currently 23. Firefox was seeing if it could re-retrieve them over the Internet faster than from my laptop's SSD (fat chance).

Others didn't see it because RCWN is heuristic-driven. I've gathered from a slide deck about RCWN by Novotny, Hsu, and Gosu that Firefox uses it only in certain situations with very fast network connections. It also helps explain not being blocked after a period of nonuse, as these tiny images would have expired from cache and thus would fall under the ordinary uncached resource flow, not the RCWN flow. Uncached access also explains the brief reprieve I got from using a different profile.

It might have been an update to Firefox or Linux that changed the behavior of RCWN, or it might be that the new router processes my packets faster than the old router. So for fairness, I'm not going to count this one against NETGEAR, even though I did end up going with a competitor's product.
Drag
Posts: 1615
Joined: Mon Sep 27, 2004 2:57 pm
Contact:

Re: I've disabled JavaScript on this forum in my browser

Post by Drag »

Thank you for sharing your findings, I actually was curious to see what it ended up being. I'm surprised to learn about RCWN; it seems like it has the potential to needlessly drive up your data usage, which seems like somebody overlooking a critical benefit of using a cache (and the HTTP protocol's 304 response) in the first place. Also not to mention unintentionally flooding a server with duplicate requests you don't realize you're making. :P

I dunno, maybe someone from Mozilla can justify that one to me, but I'm unconvinced. :P
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by calima »

Remember to file a Firefox bug, otherwise they may not be aware that RCWN causes users to get blocked.
Ice Man
Posts: 547
Joined: Fri Jul 04, 2014 2:34 pm

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by Ice Man »

It isn't only Firefox's fault. Well, not from my testing.

When the site was down I tested it with Chrome, Edge and Opera and all had the same result on different devices.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by tepples »

An outage for everyone is obviously independent of browser. An outage that isup.me classifies as "it's just you" will affect all web browsers on all devices behind one IPv4 address until the block expires. So if Firefox RCWN on one PC misbehaves, Opera on your other PC and Chrome on your phone will also be blocked unless you switch your phone's network connection from Wi-Fi to cellular. And if you trigger it through the cellular network, you might end up blocking access by everyone else using the same cell tower because of the widespread use of carrier-grade NAT by cellular ISPs.

I have a Bugzilla account. What I don't have is a clear way to phrase how to trigger the misbehavior to people without a NESdev BBS account. In my tests, the pages most likely to cause a problem require the user to be logged in: post composition and private message composition. Is email validation of new accounts working well at the moment?
Bavi_H
Posts: 193
Joined: Sun Mar 03, 2013 1:52 am
Location: Texas, USA
Contact:

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by Bavi_H »

Would viewing a post on the test forum containing all 23 of the smilies in it trigger the issue without needing to log in?
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by calima »

They don't need a Nesdev account. "RCWN opens too many connections, which causes users to get blocked on some sites. See attached wireshark logs. Please disable this bad default".
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: I'm disabling things in Firefox to troubleshoot blocking

Post by tepples »

Bavi_H wrote: Sun Mar 15, 2020 5:17 pm Would viewing a post on the test forum containing all 23 of the smilies in it trigger the issue
calima wrote: Mon Mar 16, 2020 1:28 am See attached wireshark logs.
Thanks for the suggestions. I filed bug 1622859 citing this discussion. I also cited bug 1451951, which showed similar symptoms but had been closed for inadequate steps to reproduce, and bug 1618200, about lack of a way to disable RCWN outside about:config.