markdown

Commit Graph

Author	SHA1	Message	Date
Vytautas Šaltenis	b44be78459	Allow rel attribute in sanitizer Fixes issue #68.	2014-05-01 20:49:49 +03:00
Martin Probst	41251715ad	Use go.net/html's parser to sanitize HTML. Use an HTML5 compliant parser that interprets HTML as a browser would to parse the Markdown result and then sanitize based on the result. Escape unrecognized and disallowed HTML in the result. Currently works with a hard coded whitelist of safe HTML tags and attributes.	2014-04-27 23:40:44 +02:00
Vytautas Šaltenis	55bb56bf9b	Merge pull request #55 from rtfb/master Autolink fixes	2014-03-30 19:58:39 +03:00
Vytautas Šaltenis	d643453f1e	Merge pull request #50 from rtfb/master Better protection against JavaScript injection	2014-03-30 19:52:13 +03:00
Graham Miller	d71c759108	add HTML_NOFOLLOW_LINKS	2014-02-25 09:21:57 -05:00
Vytautas Šaltenis	e5937643a9	Fix bug in autolink with trailing semicolon In case the link ends with escaped html entity, the semicolon is a part of the link and should not be interpreted as punctuation.	2014-02-17 21:09:04 +02:00
Vytautas Šaltenis	b0bdfbec4c	Fix bug in autolink overescaping html entities If autolink encounters a link which already has an escaped html entity, it would escape the ampersand again, producing things like these: &amp; --> &amp;amp; &quot; --> &amp;quot; This commit solves that by first looking for all entity-looking things in the link and copying those ranges verbatim, only considering the rest of the string for escaping. Doesn't seem to have considerable performance impact. The mailto: links are processed the old way.	2014-02-17 21:09:04 +02:00
Vytautas Šaltenis	f2d43f69a4	Fix bug in autolink termination Detect the end of link when it is immediately followed by an element.	2014-02-17 21:09:03 +02:00
Vytautas Šaltenis	9fc8c9d866	Fix bug with overzealous autolink processing When the source Markdown contains an anchor tag with URL as link text (i.e. <a href=...>http://foo.bar</a>), autolink converts that link text into another anchor tag, which is nonsense. Detect this situation with regexp and early exit autolink processing.	2014-02-17 21:09:03 +02:00
Vytautas Šaltenis	2f50a53f8e	Rename HTML_SKIP_SCRIPT to HTML_SANITIZE_OUTPUT	2014-01-22 01:23:43 +02:00
Vytautas Šaltenis	55cd82008e	Rewrite protection against JavaScript injection This drops the naive approach at <script> tag stripping and resorts to full sanitization of html. The general idea (and the regexps) is grabbed from Stack Exchange's PageDown JavaScript Markdown processor[1]. Like in PageDown, it's implemented as a separate pass over resulting html. Includes a metric ton (but not all) of test cases from here[2]. Several are commented out since they don't pass yet. Stronger (but still incomplete) fix for #11. [1] http://code.google.com/p/pagedown/wiki/PageDown [2] https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet	2014-01-22 01:14:35 +02:00
Darren Coxall	607ec21435	Tests for links when using HTML_SAFELINK	2013-12-19 10:00:47 +00:00
Russ Ross	ca82b8db3a	panic fix (issue #33 ) with test case	2013-09-11 12:47:43 -06:00
Alex Xandra Albert Sim	da8f2753e2	Added test for link inside image	2013-09-09 12:51:20 +07:00
athom	31798e0eab	add testcase for GFM autolink	2013-08-09 17:24:26 +08:00
moshee	3ea84a5811	parser no longer returns prematurely from empty footnote ref	2013-07-08 22:34:12 +00:00
moshee	1a73bae554	added slice bounds check	2013-07-08 06:54:25 +00:00
moshee	c23099e5ee	Implementation and some tests for inline footnotes. Also I noticed the list items had the wrong ids, that was silly of me.	2013-07-01 01:37:52 +00:00
moshee	7bdb82c53a	new tests pass but old tests now fail...	2013-06-26 15:57:51 +00:00
moshee	be082a1ef2	First attempt at supporting Pandoc-style footnotes. The existing tests have not broken but the new functionality does not work yet.	2013-06-25 01:18:47 +00:00
Vytautas Šaltenis	8226238289	Improve html element stripping code	2013-04-18 03:15:47 +03:00
Vytautas Šaltenis	85e2207cd0	Couple more tests	2013-04-14 01:42:47 +03:00
Vytautas Šaltenis	dcaaa9b5dc	More <script> stripping Partially addresses issue #11.	2013-04-13 23:24:30 +03:00
Vytautas Šaltenis	fb923cdb78	Add an option to strip <script> elements Partially addresses issue #11.	2013-04-13 22:57:16 +03:00
Vytautas Šaltenis	b79e720a36	Make isHtmlTag() case insensitive	2013-04-13 22:34:37 +03:00
Vytautas Šaltenis	d5a8df164b	Fix bug in isHtmlTag() Fix what seems to be a typo. j should iterate through all tagname, so it should be initialized to zero. The test exposes this bug.	2013-04-13 22:21:47 +03:00
Vytautas Šaltenis	90509d39d4	Make a way to parameterize inline tests Expose extensions and html flags parameters so that tests could specify what code paths they want to exercise.	2013-04-13 22:18:14 +03:00
Russ Ross	e35b4b66cc	bounds checking stress tests	2011-07-03 10:51:07 -06:00
Russ Ross	ae9562f685	move whitespace stripping to parser, not renderers	2011-06-29 15:38:35 -06:00
Russ Ross	2aca667078	simplify inline callback interface	2011-06-29 13:00:54 -06:00
Russ Ross	873a60ad49	complete page rendering is now an option in the library	2011-06-29 10:08:56 -06:00
Russ Ross	c969dff782	added simplified interface for common usage	2011-06-28 15:55:27 -06:00
Russ Ross	fde2c60665	version number, few more options for command-line tool	2011-06-28 11:30:10 -06:00
Russ Ross	f8f70572a4	simplified BSD license	2011-06-27 20:11:32 -06:00
Russ Ross	c8f7e789d4	more robust whitespace stripping and matching corrections to tests	2011-06-27 16:06:16 -06:00
Russ Ross	9a0217f7aa	fixed minor bugs uncovered by more testing	2011-06-27 14:35:11 -06:00
Russ Ross	3af64a90ad	fixed headers nested in lists, added prefix header unit tests	2011-06-27 10:13:13 -06:00
Russ Ross	be0fb4602b	more inline unit tests	2011-06-24 16:39:50 -06:00
Russ Ross	1e40ebaf47	unit test for linebreaks	2011-06-01 18:52:55 -06:00
Russ Ross	2abc3af015	starting inline unit tests, fix a few minor bugs they exposed	2011-06-01 12:17:17 -06:00

40 Commits