Syntax highlighting added to Wiki

Discussion about the site's wikis, including bugs/issues encountered.

Moderator: Moderators

thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Syntax highlighting added to Wiki

Post by thefox »

koitsu wrote:Anyway -- possibly you could provide the "brush" (what he calls it) for your 6502 syntax highlighting and I can work it into the MediaWiki extension for the nesdev wiki? It'd be a "one-off" but in the case of one extension I can manage that. :-)
Brush is in the CRX file (which is in fact just a renamed ZIP file) linked in the thread you pasted, the filename is shBrush6502.js.
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

Ah, didn't know CRX files were just renamed ZIPs. Thanks! I'll poke about with this.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

Got it. :-)

http://wiki.nesdev.com/w/index.php/User ... source_asm

One modification I had to make was renaming shBrush6502.js to shBrushAssembly6502.js so that the filename matched the brushes object declaration (SyntaxHighlighter.brushes.Assembly6502 = Brush;) in the .js file itself. I wasn't sure whether JavaScript would permit a property name that starts with a digit (i.e. SyntaxHighlighter.brushes.6502).
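
On the question of names that start with a digit, here is a minimal sketch. The brushes object below is a hypothetical stand-in, not the real SyntaxHighlighter object, and whether the library would actually look a brush up under a "6502" key is not something this thread confirms.

```javascript
// Hypothetical stand-in for SyntaxHighlighter.brushes; not the real library object.
const brushes = {};

function Brush() {}

// Dot notation needs a valid identifier after the dot, so this form works:
brushes.Assembly6502 = Brush;

// brushes.6502 = Brush;   // would be a SyntaxError if uncommented

// Bracket notation accepts any string key, including one that starts with a digit:
brushes["6502"] = Brush;

console.log(brushes.Assembly6502 === brushes["6502"]); // true
```

So the language allows such a key, just not with dot syntax, which is presumably why the brush file sticks to Assembly6502.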

You might be saying "Ah yeah, but then I'd have to refer to it as language='Assembly6502' or the like" -- the difference in the MediaWiki extension is that there's an array of "aliases" so you can say things like "asm" => "Assembly6502" and so on.

So as of right now to use thefox's syntax stuff, use <source lang="6502"> (or lang="asm"). I can add other aliases if people want.
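
For anyone wondering how the alias idea works, a rough sketch follows. It is written in JavaScript for illustration only (the actual extension is PHP), and brushAliases and resolveBrush are made-up names, not part of the extension.

```javascript
// Hypothetical alias table, modelled on the "asm" => "Assembly6502" idea.
const brushAliases = new Map([
  ["6502", "Assembly6502"],
  ["asm", "Assembly6502"],
]);

// Resolve a lang attribute to a brush name; unknown values pass through unchanged.
function resolveBrush(lang) {
  const key = String(lang).toLowerCase();
  return brushAliases.has(key) ? brushAliases.get(key) : lang;
}

console.log(resolveBrush("asm"));  // Assembly6502
console.log(resolveBrush("6502")); // Assembly6502
```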

The one downside to the MW SyntaxHighlighter is that the theme it uses is non-existent (it doesn't define one). All you get is the stock defaults in shCoreMinit.css. I have to make a one-line change (in the PHP code itself) to make use of a theme; there's no, say, theme attribute or anything within <source> I'm sorry to say.

So we're back to the community deciding upon a good-looking colour scheme that works with the wiki theme. Here are the stock themes that are available -- we can change any of the colours/details in any of these (probably make our own):

http://alexgorbatchev.com/SyntaxHighlig ... al/themes/
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

One bug I've found so far has to do with use of spaces vs. literal hard tabs within a <source> block. I remember seeing this elsewhere too (meaning on things like Wordpress.com), so it's some kind of quirk/bug with SyntaxHighlighter.

A good example of what I'm talking about is here (see "Russian Peasant Algorithm"):

http://wiki.nesdev.com/w/index.php/8-bit_Multiply
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Syntax highlighting added to Wiki

Post by thefox »

Yup, it also has a problem with < and > characters (and presumably other reserved characters which get converted to &lt; &gt; &amp; etc.). Unfortunately I forgot that this problem existed.

As far as I understand it, the problem is caused by the fact that ";" is used as a comment character and SyntaxHighlighter seems to be (?) applying the syntax highlighting to a string that contains the HTML entities (&lt; ...). If that's the case it's obviously a bug (it should convert entities to characters, then apply the syntax highlighting, THEN apply the necessary entities back where needed).
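
A small sketch of the suspected failure mode. The comment rule here is a simplified assumption (";" to end of line); the brush's actual regexes aren't shown in this thread.

```javascript
// Simplified comment rule: ";" to end of line. Assumed for illustration only.
const commentRule = /;.*$/gm;

function findComments(text) {
  return text.match(commentRule) || [];
}

const raw = "lda #<bar";        // what the author typed
const encoded = "lda #&lt;bar"; // what the page holds after HTML escaping

console.log(findComments(raw));     // []
console.log(findComments(encoded)); // [ ';bar' ]
```

The ";" that terminates the entity is what gets picked up as the start of a comment.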

You can see the problem here: http://wiki.nesdev.com/w/index.php/User:Thefox
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Syntax highlighting added to Wiki

Post by tepples »

And here, where I came up with a plausible reason why one might accidentally use & in 6502 code.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

Yup, I've seen these problems on Wordpress (where my blog is) as well, and they've existed for years. Sadly what needs to happen is that the HTML entity version (i.e. a literal string of lda #&lt;bar) needs to have its HTML entities decoded, then the syntax highlighting applied, then have its HTML entities re-encoded -- all at the JavaScript level.
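
The decode, highlight, re-encode order could be sketched roughly like this. It is not SyntaxHighlighter's code: the function names and the "comment" class are made up, only a handful of entities are handled, and a ";" inside a string literal is not treated specially.

```javascript
// Minimal entity handling for a few characters; not a full HTML decoder.
function decodeEntities(s) {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

function encodeEntities(s) {
  return s
    .replace(/&/g, "&amp;") // first, so entities added below are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Find the comment on the decoded text, then re-encode each piece separately.
function highlightLine(encodedLine) {
  const line = decodeEntities(encodedLine);
  const at = line.indexOf(";");
  if (at === -1) return encodeEntities(line);
  const code = line.slice(0, at);
  const comment = line.slice(at);
  return encodeEntities(code) + '<span class="comment">' + encodeEntities(comment) + "</span>";
}

console.log(highlightLine("lda #&lt;bar ; low byte"));
// lda #&lt;bar <span class="comment">; low byte</span>
```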

But in general I think what we have now is better than what we had before, we just have to be aware of some of the brokenness.

Could we go with a different syntax highlighter? Sure; GeSHi as I said is a piece of shit given its moronic CSS decisions (I'm still amazed at all of that), which pretty much leaves GoogleCodePrettify -- all the other MW syntax highlighters I've seen are out of date or marked insecure by MW.

I've installed (and deinstalled) the Google one on our wiki a few times already. The biggest problem I've seen with it is that the MW version is outdated compared to the official version, and that someone would have to write the Javascript lexer for the 6502 part. Sure, it's just a bunch of regex, but it's something I personally don't want to put the time into.

Sounding like a broken record, but I'm happy to go with whatever people want/etc..
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Syntax highlighting added to Wiki

Post by tepples »

I thought you didn't have to decode and re-encode if you read the source through textContent and built each colored token with document.createElement("span") rather than writing innerHTML. So the first part of making a lexer is coming up with a syntax spec for ca65, asm6, and NESASM.
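
A sketch of that DOM-only approach, assuming a trivial "comment starts at the first ;" rule in place of a real lexer. renderLine, the doc parameter, and the "comment" class are hypothetical names; doc is expected to behave like the browser's document.

```javascript
// Build highlighted output from plain text using DOM nodes only, so no entity
// decoding or re-encoding is needed.
function renderLine(doc, container, text) {
  const at = text.indexOf(";");
  const code = at === -1 ? text : text.slice(0, at);
  container.appendChild(doc.createTextNode(code));
  if (at !== -1) {
    const span = doc.createElement("span");
    span.className = "comment";        // placeholder class name
    span.textContent = text.slice(at); // the browser escapes this when serializing
    container.appendChild(span);
  }
}

// Browser usage (assumes an element with id "src" exists on the page):
//   const pre = document.getElementById("src");
//   const text = pre.textContent;  // already decoded, e.g. "lda #<bar ; low byte"
//   pre.textContent = "";          // clear the old children
//   renderLine(document, pre, text);
```

Passing doc in as a parameter is only there so the function can be exercised outside a browser.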
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

@tepples -- I don't know, because as I said, I don't fully speak Javascript. Furthermore, I'm having a serious problem understanding this (honest: I have read it 4 times over now, and it becomes more confusing every time):

http://benv.ca/2012/10/4/you-are-probab ... t-methods/
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

If someone could come up with EBNF

Post by tepples »

Here's the impression I get from the linked article:
  • If you read from textContent and write to innerHTML without manual escaping, you are vulnerable. To remove tags from a String, you need to do four steps: write innerHTML, read textContent, write textContent, read innerHTML. Or write innerHTML, read textContent, createTextNode, append to element.
  • If you read innerHTML and then use this as the value of an element's attribute by concatenating HTML, you are vulnerable. Instead, read innerHTML and then replace /"/ with "&quot;" and /'/ with "&#39;", or read textContent and escape /&/, /</, /"/, and /'/, or read textContent and call setAttribute.
If you're familiar with preventing Bobby Tables attacks, walk away with this: Using createTextNode or setAttribute is like using parameterized queries in SQL.
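
To make the manual-escaping option concrete, here is a minimal sketch. escapeHtml is a hypothetical helper, not something from the linked article, and it covers only the five characters below.

```javascript
// Escape the characters that matter when text is concatenated into HTML,
// including inside a quoted attribute value.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const title = 'lda #<bar ; "low" byte';
const html = '<span title="' + escapeHtml(title) + '">' + escapeHtml(title) + "</span>";
console.log(html);

// In a browser, the DOM calls make the manual escaping unnecessary:
//   span.setAttribute("title", title);
//   span.appendChild(document.createTextNode(title));
```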

I know some JavaScript and could probably write some sort of lexer myself given the EBNF grammar for each assembler. I ran into the same lack of grammar when I tried to write a call graph analysis program in Python.
freem
Posts: 176
Joined: Mon Oct 01, 2012 3:47 pm
Location: freemland (NTSC-U)

Re: Syntax highlighting added to Wiki

Post by freem »

I ran into two small bugs with the syntax highlighting when converting Disch's "The Frame and NMIs" to the wiki:
  1. Any label named ":" or any instructions referencing such a label will be commented out. (e.g. beq :+ becomes beq ;:+)
  2. Any use of high/low byte commands (<, >) also causes the rest of the line to be commented out. (e.g. lda #>oam becomes lda #>;oam)
blargg
Posts: 3715
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA

Re: Syntax highlighting added to Wiki

Post by blargg »

The syntax highlighter shouldn't be modifying the text, only changing the style of the text. I've found it best to just disable JavaScript for the Wiki. Text comes out black and no ;; comments etc.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Syntax highlighting added to Wiki

Post by koitsu »

I can't fix the problem in question. If someone who is good at JavaScript, or who is familiar with the MediaWiki PHP framework (the issue may lie there; I do not know), wants to take a stab at it, be my guest.

Switching to an alternative syntax highlighter, as I said earlier in this thread, is perfectly fine -- but there are some that are just utter garbage (ex. GeSHi) that should be avoided.

None of these syntax highlighters appears to be actively maintained.

2012-09-01: https://www.mediawiki.org/wiki/Extensio ... ighlighter (what we currently use)
2010-07-02: https://www.mediawiki.org/wiki/Extensio ... eColorizer
2008-04-22: https://www.mediawiki.org/wiki/Extension:ASHighlight

I did notice that the author of the one we use updated something a month ago in his git repo version, but it doesn't appear to be related to the issue described previously:

https://github.com/seongjaelee/SyntaxHighlighter
https://github.com/seongjaelee/SyntaxHi ... ighter.php

However, he still appears to be using SyntaxHighlighter (the JavaScript stuff) version 3.0.83, which is the last official release, even though the upstream git repo has lots of activity: https://github.com/alexgorbatchev/SyntaxHighlighter

TL;DR -- Our choices are limited:

1. Find something that works. This is a lot more tedious than you think, and as I said GeSHi has to be avoided due to how awful it is,

2. Fix what's broken in SyntaxHighlighter, which as stated requires familiarity with either JavaScript or the MediaWiki PHP framework for extensions,

3. Get rid of the syntax highlighter entirely. If people want me to revert all the syntax highlighting enhancements I did across a series of wiki pages, then I will do that without complaint.
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Syntax highlighting added to Wiki

Post by thefox »

The same issue (I believe) has been raised at GitHub a couple of times already (e.g. https://github.com/alexgorbatchev/synta ... issues/144 and https://github.com/alexgorbatchev/synta ... issues/252) but nothing has been done about it.

As far as I've understood the issue, SyntaxHighlighter seems to be applying the syntax highlighting to text that still has HTML entities in it (e.g. "&lt;" instead of "<"), which messes things up because ";" is used as the comment character. So I believe a fix would involve decoding the HTML entities before syntax highlighting, then applying the highlighting, and then re-encoding. So the modification could be pretty simple if we find the place where the script actually grabs the text from the page.

But for now, I think it'd be best to disable the syntax highlighter entirely until it's fixed. It's just not cool to have it breaking people's code.
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Syntax highlighting added to Wiki

Post by thefox »

thefox wrote:But for now, I think it'd be best to disable the syntax highlighter entirely until it's fixed. It's just not cool to have it breaking people's code.
Bump.
Locked