All times are UTC - 7 hours





Posted: Sat Apr 21, 2012 8:18 pm

Joined: Tue Jun 24, 2008 8:38 pm
Posts: 2317
Location: Fukuoka, Japan
Hi everyone,

Since we will need to find a new hosting solution for nesdev before August, we should think about what to do with the current contents.

The current version of phpBB is 2.0 with some custom upgrades (I think), and maybe it is time to upgrade to 3.0. I remember that it was said that it would be difficult to move the content to 3.0, so maybe we should find a way to archive the older version.

Do people think we should put in the effort to convert the current site to a static version? Some scripts exist to do so, but since the site was modified in some ways, they may not work "as-is" and may require some custom code.

Another solution would be to keep the phpBB 2.0 board in a locked state while we start a new forum on 3.0. But I think a static HTML version would be the most lightweight solution.

What do people think? What would be the best solution?


Posted: Sat Apr 21, 2012 8:41 pm

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 4222
The best way to preserve the old board would be to summarize it and wikify it, but that is a monumental undertaking.

In the meantime, HTML dumps suck. I've seen ugly HTML dumps of wikis, and you end up with a ton of HTML files, many of which duplicate other files, and the whole thing ends up using tons of disk space due to cluster size padding.
I'd rather see plain text post contents, so you can reconstruct the board from those.

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Posted: Sat Apr 21, 2012 9:34 pm

Joined: Mon Sep 27, 2004 2:57 pm
Posts: 1272
Welp, a good place to start would be here, taking this information and adding it to the wiki. This is the sprite oam bug and the Young Indiana Jones quirk, both of which I still don't 100% understand (and apparently neither does anyone else).

Next, if we could finalize what we've discovered about the MMC5 scanline counter thus far and wikificate it, that'd be another good move.


Posted: Sat Apr 21, 2012 9:36 pm
Formerly 65024U

Joined: Sat Mar 27, 2010 12:57 pm
Posts: 2269
I think the threads that need to be saved are the ones describing mapper behavior, the CIC threads, the music composition tools thread, and just misc. threads. All the newbie threads would be nice too. All the "well how do I make a cart and make it do things" threads don't have to be saved; they just take up more room than they're worth.


Posted: Sat Apr 21, 2012 10:35 pm

Joined: Wed Dec 06, 2006 8:18 pm
Posts: 2832
There is definitely a lot of content that should be saved however possible.


Posted: Sat Apr 21, 2012 10:46 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21635
Location: NE Indiana, USA (NTSC)
I can download the whole BBS myself using a Python script that's specially programmed to download only valid topics (unlike wget, which gets confused). I just need koitsu to sign off on the job and give me an acceptable crawl delay so that my IP doesn't get blocked.


Posted: Sat Apr 21, 2012 10:51 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 4210
Location: A world gone mad
tepples wrote:
I can download the whole BBS myself using a Python script that's specially programmed to download only valid topics (unlike wget, which gets confused). I just need koitsu to sign off on the job and give me an acceptable crawl delay so that my IP doesn't get blocked.


I'm fine with it -- all blocking is done manually as you know. As for the crawl delay, hm, well, I'm more concerned with the rate of network traffic than I am with the fetch intervals or how many concurrent fetches are occurring.

Would it be easier (and more efficient?) to make a private (moderator-only) dump of the MySQL DB for the forum? This is something I could make + put up elsewhere (not on Parodius) for download for folks like Tepples to make use of it. I dunno how much of an undertaking that would require...

Alternately, Tepples, since the server does have Python on it (2.6.7), you could log in to the ndwiki account and run your Python script from there, storing the results in some directory, then tar -pcf dir.tar dir && gzip dir.tar and send that off somewhere. That would keep the HTTP traffic limited to (effectively, not literally) localhost, and the bandwidth/usage would only be associated with the download of dir.tar.gz.
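
The archiving step could also be done from inside the script itself. The following is a minimal sketch, assuming a modern Python 3 rather than the 2.6.7 on the server; the function name archive_directory is made up for illustration. It uses the standard tarfile module to produce the same kind of dir.tar.gz in one step.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir: str, out_path: str) -> str:
    """Pack src_dir into a gzip-compressed tar archive at out_path."""
    src = Path(src_dir)
    # "w:gz" writes the tar stream through gzip in one pass, giving the
    # same kind of file as running tar and then gzip.
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname stores only the directory's own name in the archive,
        # not the full path leading to it.
        tar.add(str(src), arcname=src.name)
    return out_path
```

For example, archive_directory("dir", "dir.tar.gz") would leave a single compressed file to download.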


Posted: Sat Apr 21, 2012 11:03 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21635
Location: NE Indiana, USA (NTSC)
koitsu wrote:
I'm fine with it -- all blocking is done manually as you know.

Then I'll use a distinct user agent so that it'll show up in the server log as friendly. Look for something like "Pino's random browser".
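
A rough sketch of what setting a distinct user agent can look like, using only Python's standard urllib.request. This is not the actual crawler; the helper names build_request and fetch are hypothetical, and the user agent string is just the example mentioned above.

```python
import urllib.request

# Example string from the post; the real crawler may use something else.
USER_AGENT = "Pino's random browser"


def build_request(url: str) -> urllib.request.Request:
    """Return a Request that identifies the crawler in the server log."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download one page, sending the custom User-Agent header."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```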

Quote:
As for the crawl delay, hm, well, I'm more concerned with the rate of network traffic

Then the ideal crawl delay is roughly equal to the average size of a phpBB viewtopic page divided by the rate of network traffic that you want my crawler to use. I'll probably play with it over the next few days, starting at a 6 second delay.
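
As a units check, a delay in seconds per page comes out of dividing bytes per page by bytes per second. A small sketch of that arithmetic follows; the function name and the numbers in the usage note are hypothetical, not measurements of this board.

```python
def crawl_delay_seconds(avg_page_bytes: float, target_bytes_per_second: float) -> float:
    """Seconds to wait between fetches so average traffic stays near the target rate."""
    if target_bytes_per_second <= 0:
        raise ValueError("target rate must be positive")
    # bytes per page / (bytes per second) = seconds per page
    return avg_page_bytes / target_bytes_per_second
```

With made-up figures of 60,000 bytes per page and a budget of 10,000 bytes per second, crawl_delay_seconds(60_000, 10_000) gives 6.0 seconds.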

Quote:
Would it be easier (and more efficient?) to make a private (moderator-only) dump of the MySQL DB for the forum?

The markup processing library that I use is for HTML, not for phpBB's flavor of BBCode. But in case someone else wants to import it into phpBB and play with it, you might as well make this dump available to the mod team.

Quote:
Alternately, Tepples, since the server does have Python on it (2.6.7), you could log in to the ndwiki account and run your Python script from there, storing the results in some directory, then tar -pcf dir.tar dir && gzip dir.tar and send that off somewhere. That would keep the HTTP traffic limited to (effectively, not literally) localhost

In other words, it'd keep the HTTP traffic on the LAN. I'll keep the ndwiki shell account in mind once I get the forum and wiki crawlers stable.


Posted: Sun Apr 22, 2012 12:02 am

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 4210
Location: A world gone mad
Sounds good Tepples. Thanks for customising the fetches/etc. so if you run into any problems (or I find any) it'll be easier to pinpoint. :-)


Posted: Sun Apr 22, 2012 12:03 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21635
Location: NE Indiana, USA (NTSC)
Which leaves one problem: we need a new BBS running ASAP so that we can lock this one and I can scrape it without running the risk of the scrape being incomplete. Otherwise, I think I have a working scraper for the phpBB; I'll get to the wwwThreads later.


Posted: Sun Apr 22, 2012 2:42 am

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 4210
Location: A world gone mad
Well, I could try migrating the existing board to phpBB 3.0 (without messing with the existing board + existing MySQL DB), but I'm not sure about the nesdev theme/skin that's used here.

Or if you guys want to "start fresh" (everyone having to sign up again, etc.) that's probably also fine but might annoy folks a bit.

Alternate method:

1. Find new hosting first (at a new URL/site name)

2. Lock down nesdev.com/bbs/ to be read-only. If phpBB can't do this, the best course of action I can propose is to put up some temporary access limits that return Forbidden to everyone except Tepples' IP. I can do this without much effort.

3. Let Tepples run his backup script -- which in this case should probably be run full throttle (i.e. no sleep/delays/etc.). That way the board would be down for as short a time as possible.

4. Set up board software + etc. on the new provider -- preferably with migration of usernames/passwords. For example, at the new host, use phpBB 3.0 but tell it (during the migration process) to try and import all the old users/posts. (I believe the last time I tried this it worked okay)

5. Redirect the nesdev.com/bbs/ URL (and other URLs if need be) to the new provider/URL.
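
The access limit in step 2 would normally live in the web server's configuration, which isn't shown here. Purely to illustrate the rule itself, here is a sketch in Python; the allowlisted address is a documentation-only placeholder and status_for is a made-up name.

```python
import ipaddress

# Placeholder allowlist: 203.0.113.7 is a reserved documentation address,
# standing in for the one IP that is allowed through during the backup.
ALLOWED = {ipaddress.ip_address("203.0.113.7")}


def status_for(remote_addr: str, allowed=ALLOWED) -> int:
    """Return 200 for allowlisted clients and 403 (Forbidden) for everyone else."""
    return 200 if ipaddress.ip_address(remote_addr) in allowed else 403
```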

If anything needs to be done in real-time, I can show up on EFnet #nesdev or somewhere similar for the duration of the move if that'd make communication easier.


Posted: Sun Apr 22, 2012 4:16 am

Joined: Fri Feb 29, 2008 10:35 am
Posts: 85
Let me be the first to outright state that any solution that does not keep the current forum as live as it is now is no solution at all. Having to restart a site (even with archives...) is a massive pain: any time someone wants to reference an old thread, they have to search two, three, or more pages for it, and it makes a lot of the old tools worthless.


Is there a reason to upgrade to phpbb3? If this one is secure enough to have not gotten defaced over the years and works well enough, I don't really see a reason for it to be upgraded...


Posted: Sun Apr 22, 2012 7:17 am

Joined: Tue Jul 03, 2007 1:49 pm
Posts: 982
Tepples: how does your script differentiate "useful" posts from non-useful ones? Or is it just gonna dump everything?

edit: upon rereading you said "valid" ...you mean non locked threads etc?


Posted: Sun Apr 22, 2012 9:48 am

Joined: Sat Nov 17, 2007 8:44 pm
Posts: 385
Just wanted to say I'm glad that tepples is on the job on this. An archive would be invaluable. When I was just starting out learning the NES I considered making topics for a number of questions, but I always searched the forum first, and 99% of the time found that the question had been asked before and answered in detail.

Heck, whenever I do anything with the NES (infrequently these days...) I still search the forum for quick answers to various questions.

Even though good resources are available on the wiki and other places, this forum is still an amazing resource. I'd love to see it maintained without any culling.

How much space could that possibly take? :P


Posted: Sun Apr 22, 2012 11:23 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21635
Location: NE Indiana, USA (NTSC)
Jeroen wrote:
Tepples: how does your script differentiate "useful" posts from non-useful ones? Or is it just gonna dump everything?

I'll save all topics.

Quote:
edit: upon rereading you said "valid" ...you mean non locked threads etc?

Valid = not deleted.

Anyway, I think I have the phpBB scraper working. I've tested it on t=100 through 199, which I chose as a stress test because of a huge music rip request thread in that range. I'll start it over and run it on all topic IDs once we make all forums read-only (which is possible in the admin panel).
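
For anyone curious what a loop over topic IDs might look like, here is a simplified sketch rather than the real scraper. The viewtopic.php URL pattern and the "does not exist" marker text are assumptions about a stock phpBB 2 install, the function name is made up, and multi-page topics are ignored. The fetch function is passed in so the loop can be tried without touching the network.

```python
import time
from typing import Callable, Dict, Iterable

# Assumed URL pattern for a topic page; adjust to the real board layout.
BASE_URL = "http://nesdev.com/bbs/viewtopic.php?t={topic_id}"
# Assumed text shown for a deleted topic; check against a real response.
MISSING_MARKER = "The topic or post you requested does not exist"


def scrape_topics(
    topic_ids: Iterable[int],
    fetch: Callable[[str], str],
    delay_seconds: float = 6.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Fetch each topic's first page and keep only topics that still exist."""
    saved: Dict[int, str] = {}
    for topic_id in topic_ids:
        html = fetch(BASE_URL.format(topic_id=topic_id))
        if MISSING_MARKER not in html:
            saved[topic_id] = html  # valid = not deleted
        sleep(delay_seconds)  # crawl delay between requests
    return saved
```

The stress-test range above would correspond to scrape_topics(range(100, 200), fetch) with a real fetch function.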

As for the wiki, we can keep on editing that up until my last scrape because I can get the last modified date for all pages, ten at a time, and then go re-scrape any page that's newer than the version I have. I'm saving both the mark-up and the rendered HTML using the MediaWiki API.
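
The "ten at a time" check could be sketched roughly like this. It builds MediaWiki API query URLs that ask for the latest revision timestamp of each title, then compares them against what was saved. API_URL is a placeholder, the helper names are hypothetical, and the comparison assumes both sides use the API's ISO 8601 UTC timestamps (which sort correctly as plain strings).

```python
from typing import Dict, Iterable, Iterator, List
from urllib.parse import urlencode

API_URL = "http://wiki.example/w/api.php"  # placeholder, not the real wiki URL


def batches(titles: Iterable[str], size: int = 10) -> Iterator[List[str]]:
    """Yield the titles in groups of at most `size`."""
    titles = list(titles)
    for i in range(0, len(titles), size):
        yield titles[i:i + size]


def timestamp_query_url(titles: List[str]) -> str:
    """Build a query URL asking for the last-revision timestamp of each title."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "timestamp",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def stale_titles(saved: Dict[str, str], remote: Dict[str, str]) -> List[str]:
    """Titles whose remote timestamp is newer than the saved copy, or never saved."""
    return sorted(t for t, ts in remote.items() if saved.get(t, "") < ts)
```

Only the titles returned by stale_titles would need to be scraped again before the final snapshot.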

Scraper status as of right now:
  • wwwThreads: Not yet started
  • phpBB 2: Works, needs to be run while frozen
  • Wiki: Works for pages, not yet started for images, not yet supporting timestamps


In any case, there are so many inbound links to nesdev.com, both the front page and the BBS, that I think it'd be wise to permanently arrange for redirection to wherever nesdev.com ends up so that cool URIs don't change. Or is a video game publisher whose name starts with K behind this?

